AI Foundations
Module 05
Foundation
Safe Customer Interactions: ShopMate will reply to real customers. A hallucinated delivery date or a promise of a refund the business cannot honour would cause serious problems. The team adds guardrails: ShopMate can only state facts from the order database, never invent information, and always escalates refund requests to a human.
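The three guardrails in this scenario can be sketched in a few lines of Python. This is a minimal illustration, not ShopMate's actual implementation: the `ORDERS` dictionary, the `REFUND_KEYWORDS` list, the `Reply` type, and the `answer` function are all hypothetical names, and the sample order data is made up. A real system would query the order database and would likely use something more robust than keyword matching to detect refund requests.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the order database (sample data is invented).
ORDERS = {
    "A1001": {"status": "shipped", "delivery_date": "2024-06-03"},
    "A1002": {"status": "processing", "delivery_date": None},
}

# Assumed trigger phrases; keyword matching is a crude placeholder.
REFUND_KEYWORDS = ("refund", "money back", "chargeback")


@dataclass
class Reply:
    text: str
    escalated: bool = False


def answer(message: str, order_id: Optional[str]) -> Reply:
    """Reply using only order-database facts; escalate refunds to a human."""
    lowered = message.lower()

    # Guardrail 1: refund requests always go to a human.
    if any(keyword in lowered for keyword in REFUND_KEYWORDS):
        return Reply("I'm passing your refund request to a member of our team.",
                     escalated=True)

    # Guardrail 2: with no matching record, make no claims about the order.
    order = ORDERS.get(order_id) if order_id else None
    if order is None:
        return Reply("I can't find that order, so I can't confirm any details.")

    # Guardrail 3: state a delivery date only if the database holds one.
    if order["delivery_date"] is None:
        return Reply(f"Order {order_id} is {order['status']}. "
                     "No delivery date is recorded yet.")
    return Reply(f"Order {order_id} is {order['status']}, "
                 f"with delivery recorded for {order['delivery_date']}.")
```

Every customer-facing sentence is built from a database field or a fixed template, so the reply cannot contain a date or promise that the business has not recorded.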

AI Safety & Ethics

AI safety is not a compliance checkbox -- it is a core engineering and organisational discipline. These principles apply regardless of which model or tool you use.

Helpful, Harmless, Honest

Helpfulness, harmlessness, and honesty are three alignment objectives widely cited across AI labs. A responsible AI system balances genuine usefulness against avoiding harm while remaining honest. When these goals conflict, harm avoidance and honesty take precedence.

Bias and Fairness

LLMs inherit biases from their training data. They may produce outputs that reflect historical inequities, stereotype groups, or perform inconsistently across languages and cultures. Always test AI systems on diverse inputs before deployment.
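One way to make "test on diverse inputs" concrete is to run comparable inputs from different groups through the system and compare outcome rates. The sketch below assumes a hypothetical `toy_classifier` that stands in for the AI system, with invented test cases. The gap between group rates is only one crude signal, not a full fairness audit.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple


def approval_rates(classify: Callable[[str], bool],
                   cases: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the share of positive outcomes per group label."""
    totals: Dict[str, int] = defaultdict(int)
    approved: Dict[str, int] = defaultdict(int)
    for group, text in cases:
        totals[group] += 1
        if classify(text):
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def max_gap(rates: Dict[str, float]) -> float:
    """Largest difference in outcome rate between any two groups."""
    return max(rates.values()) - min(rates.values())


def toy_classifier(text: str) -> bool:
    # Hypothetical stand-in for a model: it only recognises an English keyword,
    # which mimics inconsistent behaviour across languages.
    return "experienced" in text.lower()


# Invented test cases: (group label, input text).
CASES = [
    ("en", "Experienced engineer"),
    ("en", "Experienced nurse"),
    ("de", "Erfahrene Ingenieurin"),
    ("de", "Experienced Pflegekraft"),
]

rates = approval_rates(toy_classifier, CASES)
gap = max_gap(rates)
```

A large gap between groups on otherwise comparable inputs is a prompt to investigate before deployment, not proof of bias on its own.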

Privacy and Data

Do not send personal data, credentials, or confidential documents to AI models without a clear legal basis and appropriate data processing agreements. Treat AI prompts as potential data flows subject to GDPR, CCPA, and sector-specific regulations.
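A redaction step in front of the model is one way to treat prompts as data flows. The following is a minimal sketch using Python's standard `re` module. The `PATTERNS` table and the `redact` function are illustrative names, and the two regular expressions are deliberately simple assumptions. Pattern matching of this kind misses many forms of personal data and does not replace a legal basis or a data processing agreement.

```python
import re
from typing import List, Tuple

# Assumed, deliberately simple patterns; real PII detection needs far more coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text: str) -> Tuple[str, List[str]]:
    """Replace matches with placeholders and report which kinds were found."""
    found: List[str] = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


prompt = "Contact jane.doe@example.com about card 4111 1111 1111 1111 please"
safe_prompt, kinds = redact(prompt)
```

Returning the list of detected kinds alongside the redacted text lets the caller log or block the request as policy requires, rather than silently forwarding it.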

Human Oversight

AI systems should support human oversight, not undermine it. For high-stakes decisions -- legal, medical, financial -- AI outputs must be reviewed by qualified humans before action. The model's confidence is not a substitute for expert judgment.
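A review gate can be expressed as a routing rule that ignores the model's confidence for high-stakes domains. The sketch below is illustrative only: `Draft`, `ReviewQueue`, `route`, and the `HIGH_STAKES` set are hypothetical names, and the domain labels are assumed to be assigned upstream.

```python
from dataclasses import dataclass, field
from typing import List

# Domains named in the text as requiring qualified human review.
HIGH_STAKES = {"legal", "medical", "financial"}


@dataclass
class Draft:
    domain: str
    content: str
    model_confidence: float  # recorded, but never used to bypass review


@dataclass
class ReviewQueue:
    pending: List[Draft] = field(default_factory=list)


def route(draft: Draft, queue: ReviewQueue) -> str:
    """Hold high-stakes drafts for a human; release everything else."""
    if draft.domain in HIGH_STAKES:
        # Confidence is deliberately not consulted here.
        queue.pending.append(draft)
        return "held_for_review"
    return "released"


queue = ReviewQueue()
status_high = route(Draft("medical", "Suggested dosage change", 0.99), queue)
status_low = route(Draft("general", "Store opening hours", 0.20), queue)
```

The design choice is that no confidence threshold exists at all for the high-stakes path, so a very confident output still waits for a qualified reviewer.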

Risk Category    | Example                                     | Mitigation
Hallucination    | Fabricated legal citations in a brief       | RAG grounding + human review gate
PII leakage      | User pastes customer data into prompt       | PII detection layer + policy training
Bias in output   | Resume screening that disadvantages groups  | Diverse test sets + output audits
Prompt injection | Malicious data in tool result hijacks agent | Output sanitisation + sandboxed execution
Over-reliance    | Decisions made without human review         | Mandatory review gates for high-stakes actions