AI Foundations
Module 01
The Story Begins
Meet ThreadCo

ThreadCo sells custom-printed T-shirts online. They have 3 staff, 2,000 products, and 500 customer emails per week, most asking "where is my order?" or "can I get this in blue?" The founder, Maya, wants to use AI to handle the repetitive work. Before building anything, the team needs to understand what generative AI can actually do for a small e-commerce business.

What is Generative AI?

Generative AI refers to machine learning systems that learn patterns from data and generate new content -- text, code, images, audio -- that did not exist before. Understanding the category before diving into specific tools is essential for any practitioner.

A Brief History of Generative AI

Generative AI did not appear overnight. The field evolved over decades, building on foundational breakthroughs in statistics, neural networks, and computing hardware.

Year | Milestone | Significance
1966 | ELIZA chatbot | One of the earliest natural language processing programs; used pattern matching, not learning
1997 | LSTM networks | Long Short-Term Memory greatly mitigated the vanishing gradient problem, enabling sequence modeling over longer spans
2014 | GANs introduced | Generative Adversarial Networks (Goodfellow et al.) opened the way to increasingly realistic generated images
2017 | "Attention Is All You Need" | The Transformer architecture replaced recurrence with self-attention, enabling massive parallelism
2018 | GPT-1 and BERT | Demonstrated that pre-training on large text corpora then fine-tuning produced powerful language models
2020 | GPT-3 (175B params) | Showed that scaling up model size and training data could unlock capabilities such as few-shot learning
2022 | ChatGPT launch | Made LLMs accessible to the general public; reached an estimated 100 million users in about two months
2023 | GPT-4, Claude 2, Llama 2 | Multimodal capabilities, open-weight models, and Constitutional AI matured the field
2024-25 | Agentic AI and reasoning models | Models gained tool use, long-context windows (1M+ tokens), and chain-of-thought reasoning

Traditional AI vs Generative AI

The distinction matters because it determines what problems a system can solve and how you evaluate its outputs.

Traditional / Discriminative AI

What it does: Classifies inputs into predefined categories or predicts numeric values. Examples: spam filters, fraud detection, demand forecasting.

Output: A label or number from a fixed set. "This email is 94% likely to be spam."

Evaluation: Accuracy, precision, recall, F1 -- all well-defined metrics with ground truth.

Training data: Labelled examples (input-output pairs). Requires curated datasets.
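
The metrics named above can be made concrete with a small worked example. This is a minimal sketch: the function name and the spam-filter counts are hypothetical and chosen only to illustrate the arithmetic.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of everything flagged as spam, how much really was spam?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all real spam, how much did the filter catch?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical results for a spam filter evaluated on 1,000 labelled emails.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every email has a known correct label, each number has a single right answer. That is the property generative outputs usually lack.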

Generative AI

What it does: Produces open-ended, novel content -- text, images, code, audio, video -- that did not exist in the training data.

Output: Free-form content. "Here is a 500-word product description for your new T-shirt line."

Evaluation: Subjective and multi-dimensional: fluency, accuracy, helpfulness, harmlessness. No single ground truth.

Training data: Massive unlabelled corpora (the internet, books, code repositories). Self-supervised learning.

Key Insight

Traditional AI answers "which category?" Generative AI answers "what should I create?" Both are valuable -- and many real-world systems combine them. For example, a customer service pipeline might use a classifier to route tickets (traditional AI) and a generative model to draft the response (generative AI).
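
The combined pipeline described above might be sketched as follows. Everything here is an illustrative assumption: the keyword rules stand in for a trained classifier, `fake_llm` stands in for a real model call, and the category names are invented for this example.

```python
from typing import Callable

# Hypothetical categories and trigger phrases, for illustration only.
ROUTES = {
    "order_status": ("where is my order", "tracking", "delivery"),
    "product_question": ("colour", "color", "size", "in blue"),
}


def classify_ticket(text: str) -> str:
    """Stand-in for a trained classifier: returns one label from a fixed set."""
    lowered = text.lower()
    for label, keywords in ROUTES.items():
        if any(keyword in lowered for keyword in keywords):
            return label
    return "other"


def draft_reply(label: str, text: str, generate: Callable[[str], str]) -> str:
    """Build a prompt for the routed ticket and pass it to a generative model."""
    prompt = (
        f"You are a support agent. Ticket type: {label}. "
        f"Customer wrote: {text!r}. Draft a short reply."
    )
    return generate(prompt)


def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call so the sketch runs offline."""
    return f"[draft based on a prompt of {len(prompt)} characters]"


ticket = "Hi, where is my order? It has been a week."
label = classify_ticket(ticket)              # traditional AI step: pick a category
reply = draft_reply(label, ticket, fake_llm)  # generative AI step: create text
print(label)
print(reply)
```

The first step can be scored against ground truth, while the second needs the more subjective evaluation described earlier.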

Core Concepts

Probability Distributions

Generative AI models are trained to model a probability distribution over data. At inference time, they sample from that distribution to produce novel outputs that resemble -- but are not copies of -- their training data. This is why outputs are non-deterministic: each generation is a fresh sample.
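
To make "sampling from a distribution" concrete, here is a minimal sketch. The candidate tokens and their scores are invented for illustration; a real model would produce scores over tens of thousands of tokens.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token candidates after "Can I get this in ..."
tokens = ["blue", "red", "green", "plaid"]
scores = [2.0, 1.5, 1.0, -1.0]

probs = softmax(scores)
rng = random.Random(0)  # seeded only so the demo is repeatable
samples = [rng.choices(tokens, weights=probs, k=1)[0] for _ in range(5)]
print(samples)  # a mix of tokens, with likelier ones appearing more often

# At a low temperature nearly all probability mass sits on the top token.
sharp = softmax(scores, temperature=0.1)
print(round(sharp[0], 3))
```

Each call to `rng.choices` is one fresh sample, which is why repeated generations from the same prompt can differ.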

Key Model Families

The main families are Large Language Models (LLMs) for text and code, diffusion models for images and video, and multimodal models that work across modalities. They use different architectures but share the generative training paradigm. Knowing which family to use for a given task is one of the first decisions you make.

Emergent Capabilities

As models scale in size and training data, they develop unexpected abilities not explicitly programmed: in-context learning, chain-of-thought reasoning, code generation, and multilingual transfer. These emergent capabilities are why larger models often feel qualitatively different, not just incrementally better.

Where Value is Created

Generative AI creates value by accelerating human tasks that previously required scarce expertise: writing, coding, analysis, summarisation, planning. The economic impact is in throughput and access, not replacement of judgment. A single analyst with an LLM can often cover research work that previously required a team, provided the outputs are checked.

Types of Generative Models

Not all generative models work the same way. Understanding the main architectures helps you choose the right tool.

Architecture | How It Generates | Best For | Examples
Autoregressive (Transformers) | Predicts one token at a time, left-to-right, conditioned on all previous tokens | Text, code, structured data | GPT-4o, Claude, Llama, Gemini
Diffusion Models | Starts from random noise and iteratively denoises toward a coherent output | Images, video, audio, 3D objects | DALL-E 3, Midjourney, Stable Diffusion, Sora
GANs | Two networks (generator + discriminator) compete, improving each other | Photorealistic images, style transfer | StyleGAN, BigGAN
VAEs | Encode inputs into a latent space, then decode to generate variations | Data augmentation, anomaly detection | VQ-VAE, DALL-E 1
Multimodal Models | Accept and produce multiple modalities (text, image, audio) in a single model | Cross-modal tasks, visual Q&A, document understanding | GPT-4o, Gemini 1.5, Claude (vision)
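
The autoregressive row can be illustrated with a toy generator. This is a deliberately simplified sketch: it is a bigram model that conditions only on the single previous token, whereas a Transformer conditions on all previous tokens. The corpus is invented for the example.

```python
import random
from collections import defaultdict

# Tiny invented corpus; "." is treated as an ordinary token.
corpus = "the blue shirt ships today . the red shirt ships tomorrow .".split()

# "Training": record which tokens were seen following each token.
followers = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current].append(nxt)


def generate(start: str, max_tokens: int, rng: random.Random) -> list:
    """Generate left-to-right, sampling each next token given the last one."""
    output = [start]
    while len(output) < max_tokens:
        candidates = followers.get(output[-1])
        if not candidates:  # no known continuation: stop early
            break
        output.append(rng.choice(candidates))
    return output


sentence = generate("the", max_tokens=6, rng=random.Random(42))
print(" ".join(sentence))
```

The loop has the same shape as LLM text generation: predict a distribution over the next token, sample one, append it, repeat.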

The Generative AI Landscape

Model Types and Their Primary Use Cases

  • Large language models: text generation, code completion, reasoning, Q&A / chat (GPT-4o, Gemini, Llama)
  • Diffusion models: image generation, video synthesis, audio creation, image editing (DALL-E, Midjourney)
  • Code models and tools: autocomplete, agentic coding, test generation, refactoring (Windsurf, Cursor)
  • Multimodal models: vision + text, audio + text, cross-modal, any-to-any (GPT-4o, Gemini 1.5, Llama 3.2 Vision)

Real-World Applications Across Industries

Generative AI is not confined to chatbots. It is being deployed across virtually every industry. Here are concrete examples of how organisations are using it today.

Healthcare

  • Clinical documentation: LLMs transcribe and summarise patient encounters; some deployments report reducing physician documentation burden by 40-60%.
  • Drug discovery: Generative models propose novel molecular structures, with the aim of compressing early-stage research timelines from years to months.
  • Radiology: Multimodal models assist in interpreting medical images, flagging anomalies for radiologist review.

Financial Services

  • Report generation: Analysts use LLMs to draft earnings summaries, risk assessments, and compliance reports.
  • Code modernisation: Banks use code-generation models to migrate legacy COBOL systems to modern languages.
  • Customer service: AI agents handle routine inquiries about account balances, transactions, and card disputes.

Software Engineering

  • Code completion: AI-assisted IDEs and assistants (Windsurf, Cursor, GitHub Copilot) are reported to accelerate development by 30-55% in some studies; results vary by task.
  • Test generation: Models write unit and integration tests from function signatures.
  • Code review: LLMs identify bugs, security vulnerabilities, and style violations in pull requests.

Marketing & E-Commerce

  • Content at scale: Product descriptions, ad copy, email campaigns, and social media posts generated in seconds.
  • Personalisation: Models tailor messaging to customer segments based on purchase history and preferences.
  • Image generation: Product mockups and lifestyle imagery created without photoshoots.

Limitations You Must Understand

Generative AI is powerful but not magical. Knowing its limitations is as important as knowing its capabilities.

Hallucination

Models generate plausible-sounding content by predicting likely next tokens -- they have no concept of truth. They can confidently fabricate facts, citations, statistics, and code that looks correct but is not. Always verify outputs for any decision that matters.

Knowledge Cutoff

Models only know what was in their training data. They have no awareness of events after their cutoff date unless given tools (web search, RAG) to access current information. Asking a model about yesterday's news without tools will produce either an honest "I don't know" or a hallucinated answer.
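
The retrieval idea behind RAG can be sketched in a few lines. This is a toy under stated assumptions: the knowledge-base sentences are invented, and the word-overlap score is a stand-in for the embedding-based search that real systems typically use.

```python
# Hypothetical policy snippets standing in for a company knowledge base.
KNOWLEDGE_BASE = [
    "Standard shipping takes 3 to 5 business days.",
    "Returns are accepted within 30 days of delivery.",
    "Custom prints cannot be exchanged for a different colour.",
]


def _words(text: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,?!") for w in text.lower().split()}


def score(query: str, document: str) -> int:
    """Count words shared by the query and the document."""
    return len(_words(query) & _words(document))


def retrieve(query: str, k: int = 1) -> list:
    """Return the k documents with the highest overlap score."""
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def grounded_prompt(query: str) -> str:
    """Place retrieved text in the prompt so the model can answer from it."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


top = retrieve("How long does shipping take?")
print(grounded_prompt("How long does shipping take?"))
```

The model never needs to have seen the shipping policy during training; the current text arrives in the prompt at request time.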

No True Understanding

LLMs manipulate statistical patterns in language -- they do not "understand" concepts the way humans do. They cannot reason from first principles in domains far outside their training distribution. They excel at tasks where pattern matching over vast data is sufficient, but struggle with novel logical puzzles or precise mathematical computation.

Limitation | Impact | Mitigation
Hallucination | False information presented confidently | RAG, citations, human review
Knowledge cutoff | No awareness of recent events | Web search tools, retrieval-augmented generation
Context window limits | Cannot process arbitrarily long documents | Chunking, summarisation, hierarchical processing
Bias | Reflects and amplifies training data biases | Diverse test sets, bias audits, guardrails
Cost at scale | API costs grow with usage volume | Model tiering (use smaller models for simple tasks), caching, batching
Non-determinism | Same prompt can produce different outputs | Temperature=0 and seed parameters (these reduce variation but may not guarantee identical outputs), structured output formats
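
As one example of the mitigations in the table, chunking might look like the sketch below. It counts words as a rough proxy for tokens, which is an assumption; a production system would count with the model's own tokenizer. The function name and parameters are hypothetical.

```python
def chunk_words(text: str, max_words: int, overlap: int = 0) -> list:
    """Split text into word-based chunks, repeating `overlap` words between
    neighbours so context is not lost at the boundaries."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the final chunk already reaches the end of the text
    return chunks


# A ten-word stand-in document: "w0 w1 ... w9".
document = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(document, max_words=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk can then be summarised separately and the summaries combined, which is the hierarchical processing the table mentions.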

ThreadCo Application: Where Does Gen AI Fit?

Returning to our story, Maya maps ThreadCo's pain points to generative AI capabilities:

Pain Point | AI Solution | Model Type | Risk Level
Writing 2,000 product descriptions | LLM generates descriptions from product attributes | Text generation | Low (human can review batch)
500 "where is my order?" emails/week | Agent queries order DB and drafts response | LLM + tool use | Medium (customer-facing)
Creating social media images | Diffusion model generates lifestyle product photos | Image generation | Low (internal review before posting)
Summarising customer reviews | LLM extracts themes and sentiment from review corpus | Text analysis | Low (internal use only)
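
The "LLM + tool use" row could be sketched as follows. This is only an outline of the pattern: the order IDs, the in-memory `ORDERS` dictionary, and the function names are all hypothetical, and the final call to a model is left out so the example runs offline.

```python
from typing import Optional

# Hypothetical in-memory stand-in for ThreadCo's order database.
ORDERS = {
    "A1001": {"status": "shipped", "eta": "2 days"},
    "A1002": {"status": "printing", "eta": "5 days"},
}


def lookup_order(order_id: str) -> Optional[dict]:
    """The 'tool': a deterministic lookup, so order facts are never generated."""
    return ORDERS.get(order_id)


def build_prompt(customer_email: str, order_id: str) -> str:
    """Combine looked-up facts with the customer's message for the model."""
    order = lookup_order(order_id)
    if order is None:
        facts = f"No order found with ID {order_id}."
    else:
        facts = (f"Order {order_id} is {order['status']}; "
                 f"estimated arrival in {order['eta']}.")
    return (
        "Draft a short, friendly reply using ONLY the facts below.\n"
        f"Facts: {facts}\n"
        f"Customer email: {customer_email}"
    )


prompt = build_prompt("Where is my order A1001?", "A1001")
print(prompt)
```

Since this use case is customer-facing and rated medium risk, the resulting draft would reasonably go to a staff member for review before sending.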
Training Tip

This programme focuses on LLMs and code generation tools. Complete the AI Foundations track first, then choose your tool track based on your role and the technology your organisation has adopted.

Key Terminology Glossary

Before proceeding to the next module, make sure you are comfortable with these terms. They appear throughout the rest of the programme.

Term | Definition
Token | The basic unit of text that an LLM processes. Roughly 0.75 English words. Models read and generate tokens, not characters or words.
Context window | The maximum number of tokens a model can process in a single request (input + output combined). Ranges from 4K to 1M+ tokens depending on the model.
Inference | The process of running a trained model to generate outputs. This is what happens when you send a prompt to an API.
Fine-tuning | Additional training of a pre-trained model on a specific dataset to specialise its behaviour for a particular task or domain.
RAG | Retrieval-Augmented Generation. A technique that retrieves relevant documents from a knowledge base and includes them in the model's context to ground its responses in verified data.
Prompt engineering | The practice of crafting inputs to LLMs to achieve desired outputs. The primary "programming" interface for generative AI.
Hallucination | When a model generates plausible-sounding but factually incorrect content. A fundamental limitation of all current generative models.
Multimodal | A model that can process and/or generate multiple types of content (text, images, audio, video) within a single interaction.
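
The token and context-window definitions combine into a quick back-of-envelope check. The sketch below uses the 0.75 words-per-token rule of thumb from the glossary; real tokenizers vary by model and language, so treat the result as an estimate. The function names and the 4,096-token window are assumptions for illustration.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the glossary; real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the word count, rounding up."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """Input and output share the window, so both must fit together."""
    return prompt_tokens + max_output_tokens <= context_window


# A hypothetical 300-word customer email.
email = " ".join(["word"] * 300)
prompt_tokens = estimate_tokens(email)  # 300 / 0.75 = 400 tokens
print(prompt_tokens)
print(fits_context(prompt_tokens, max_output_tokens=500, context_window=4096))
```

At 500 emails per week, the same arithmetic gives a first estimate of weekly token volume, which is the basis for API cost planning.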

Hands-On Exercises

Exercise 1: Classify the AI

For each scenario below, decide whether the task requires traditional (discriminative) AI or generative AI, and explain why:

  • Detecting fraudulent credit card transactions in real time
  • Writing personalised birthday emails to loyalty programme members
  • Sorting incoming support tickets into categories (billing, shipping, returns)
  • Generating alt-text descriptions for product images on a website
  • Predicting which customers are likely to churn next month

Exercise 2: Spot the Hallucination

Open any LLM (ChatGPT, Claude, Gemini) and ask it: "Who won the Nobel Prize in Literature in 2019 and what was their most famous work?" Verify the answer against Wikipedia. Then ask about a fictional award -- "Who won the Global Excellence Prize for Digital Innovation in 2023?" -- and observe how the model responds. Write down what you learn about when and why models hallucinate.

Exercise 3: Map Your Organisation

List five repetitive tasks in your own team or organisation. For each one, identify: (a) Could generative AI help? (b) Which model type would you use? (c) What is the risk level if the AI produces incorrect output? (d) Would a human need to review every output, or just a sample? Create a simple table like the ThreadCo example above.

Exercise 4: Compare Model Families

Pick one task (e.g., "write a marketing email for a new product launch") and try it on at least two different LLMs (e.g., ChatGPT and Claude). Compare the outputs on: tone, accuracy, length, and creativity. Note three specific differences. What does this tell you about choosing models for production use?

Exercise 5: Limitations Audit

Choose a real use case from your work. Write a one-page risk assessment covering: (a) Which of the six limitations from the table above apply? (b) How severe is each applicable limitation for this use case? (c) What specific mitigations would you implement? This exercise builds the muscle for the Safety & Ethics module later in the track.