What is Generative AI?
Generative AI refers to machine learning systems that learn patterns from data and generate new content -- text, code, images, audio -- that did not exist before. Understanding the category before diving into specific tools is essential for any practitioner.
A Brief History of Generative AI
Generative AI did not appear overnight. The field evolved over decades, building on foundational breakthroughs in statistics, neural networks, and computing hardware.
| Year | Milestone | Significance |
|---|---|---|
| 1966 | ELIZA chatbot | One of the earliest natural language processing programs; used pattern matching, not learning |
| 1997 | LSTM networks | Long Short-Term Memory mitigated the vanishing gradient problem in recurrent networks, making it practical to model long sequences |
| 2014 | GANs introduced | Generative Adversarial Networks (Goodfellow et al.) introduced adversarial training, which later work used to generate strikingly realistic images |
| 2017 | "Attention Is All You Need" | The Transformer architecture replaced recurrence with self-attention, enabling massive parallelism |
| 2018 | GPT-1 and BERT | Demonstrated that pre-training on large text corpora then fine-tuning produced powerful language models |
| 2020 | GPT-3 (175B params) | Showed that scaling up model size and training data could yield capabilities such as few-shot learning |
| 2022 | ChatGPT launch | Made LLMs accessible to the general public; reached an estimated 100 million users in about two months |
| 2023 | GPT-4, Claude 2, Llama 2 | Multimodal capabilities, open-weight models, and Constitutional AI matured the field |
| 2024-25 | Agentic AI and reasoning models | Models gained tool use, long-context windows (1M+ tokens), and chain-of-thought reasoning |
Traditional AI vs Generative AI
The distinction matters because it determines what problems a system can solve and how you evaluate its outputs.
Traditional AI
What it does: Classifies inputs into predefined categories or predicts numeric values. Examples: spam filters, fraud detection, demand forecasting.
Output: A label or number from a fixed set. "This email is 94% likely to be spam."
Evaluation: Accuracy, precision, recall, F1 -- all well-defined metrics with ground truth.
Training data: Labelled examples (input-output pairs). Requires curated datasets.
Generative AI
What it does: Produces open-ended, novel content -- text, images, code, audio, video -- that did not exist in the training data.
Output: Free-form content. "Here is a 500-word product description for your new T-shirt line."
Evaluation: Subjective and multi-dimensional: fluency, accuracy, helpfulness, harmlessness. No single ground truth.
Training data: Massive unlabelled corpora (the internet, books, code repositories). Self-supervised learning.
Traditional AI answers "which category?" Generative AI answers "what should I create?" Both are valuable -- and many real-world systems combine them. For example, a customer service pipeline might use a classifier to route tickets (traditional AI) and a generative model to draft the response (generative AI).
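The combined pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: `route_ticket` is a keyword-based stand-in for a trained classifier, `draft_reply` is a placeholder that only builds the prompt a generative model would receive, and all names and keywords are hypothetical.

```python
# Hypothetical routing keywords; a real system would use a trained classifier.
ROUTES = {
    "billing": ("invoice", "refund", "charge"),
    "shipping": ("delivery", "tracking", "order"),
}


def route_ticket(text: str) -> str:
    """Traditional AI step: map the ticket to one label from a fixed set."""
    lowered = text.lower()
    for label, keywords in ROUTES.items():
        if any(word in lowered for word in keywords):
            return label
    return "general"


def draft_reply(text: str, label: str) -> str:
    """Generative AI step (stubbed): build the prompt an LLM would complete."""
    return (
        f"You are a {label} support agent. "
        f"Draft a polite reply to this ticket:\n{text}"
    )


ticket = "I was charged twice for my last invoice."
label = route_ticket(ticket)
prompt = draft_reply(ticket, label)
print(label)
print(prompt)
```

The split mirrors the evaluation difference: the routing step can be scored against labelled tickets, while the drafted reply needs human or rubric-based review.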
Core Concepts
Probability Distributions
Generative AI models are trained to model a probability distribution over data. At inference time, they sample from that distribution to produce novel outputs that resemble -- but are not copies of -- their training data. This is why outputs are non-deterministic: each generation is a fresh sample.
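To make the sampling idea concrete, here is a small sketch that turns raw scores into a probability distribution and draws samples from it. The candidate tokens and their scores are invented for illustration; a real model computes scores over a vocabulary of many thousands of tokens.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token candidates and scores, invented for illustration.
tokens = ["shirt", "jacket", "scarf"]
scores = [2.0, 1.0, 0.5]

probs = softmax(scores, temperature=1.0)
rng = random.Random()  # unseeded, so repeated runs can give different samples
samples = [rng.choices(tokens, weights=probs, k=1)[0] for _ in range(5)]
print(probs)
print(samples)
```

Lowering `temperature` concentrates probability on the highest-scoring token, which is why low-temperature settings make outputs more repeatable.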
Key Model Families
The main families are large language models (LLMs) for text and code, diffusion models for images and video, and multimodal models that work across modalities. Each family uses a different architecture, but all share the generative training paradigm. Knowing which family to use for a given task is one of the first decisions you make.
Emergent Capabilities
As models scale in size and training data, they can exhibit abilities that were not explicitly trained for: in-context learning, chain-of-thought reasoning, code generation, and multilingual transfer. These emergent capabilities are why larger models often feel qualitatively different, not just incrementally better.
Where Value is Created
Generative AI creates value by accelerating human tasks that previously required scarce expertise: writing, coding, analysis, summarisation, planning. The economic impact is in throughput and access, not replacement of judgment. A single analyst with an LLM can often cover research work that previously required a team, with the analyst's judgment applied to the results.
Types of Generative Models
Not all generative models work the same way. Understanding the main architectures helps you choose the right tool.
| Architecture | How It Generates | Best For | Examples |
|---|---|---|---|
| Autoregressive (Transformers) | Predicts one token at a time, left-to-right, conditioned on all previous tokens | Text, code, structured data | GPT-4o, Claude, Llama, Gemini |
| Diffusion Models | Starts from random noise and iteratively denoises toward a coherent output | Images, video, audio, 3D objects | DALL-E 3, Midjourney, Stable Diffusion, Sora |
| GANs | Two networks (generator + discriminator) compete, improving each other | Photorealistic images, style transfer | StyleGAN, BigGAN |
| VAEs | Encode inputs into a latent space, then decode to generate variations | Data augmentation, anomaly detection | VQ-VAE, DALL-E 1 |
| Multimodal Models | Accept and produce multiple modalities (text, image, audio) in a single model | Cross-modal tasks, visual Q&A, document understanding | GPT-4o, Gemini 1.5, Claude (vision) |
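The autoregressive row in the table can be illustrated with a toy generation loop: pick a next token, append it, and feed it back in. This sketch is a deliberate simplification. The `NEXT_TOKENS` table is hypothetical, and it conditions only on the last token, whereas a real Transformer conditions on the whole preceding sequence.

```python
import random

# Hypothetical bigram table: for each token, the tokens that may follow it.
# A real LLM conditions on the whole prefix; this toy uses only the last token.
NEXT_TOKENS = {
    "<start>": ["soft", "organic"],
    "soft": ["cotton"],
    "organic": ["cotton"],
    "cotton": ["shirt", "hoodie"],
    "shirt": ["<end>"],
    "hoodie": ["<end>"],
}


def generate(rng: random.Random, max_tokens: int = 10) -> list[str]:
    """Generate tokens one at a time until <end> or the length limit."""
    output = []
    current = "<start>"
    for _ in range(max_tokens):
        current = rng.choice(NEXT_TOKENS[current])
        if current == "<end>":
            break
        output.append(current)
    return output


print(" ".join(generate(random.Random(0))))
```

Because each step samples from the allowed continuations, different random seeds can produce different but equally valid phrases.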
The Generative AI Landscape
Real-World Applications Across Industries
Generative AI is not confined to chatbots. It is being deployed across virtually every industry. Here are concrete examples of how organisations are using it today.
Healthcare
Clinical documentation: LLMs transcribe and summarise patient encounters, with reported reductions in physician documentation burden of 40-60%, though results vary by setting.
Drug discovery: Generative models propose novel molecular structures, in some cases compressing early-stage research timelines from years to months.
Radiology: Multimodal models assist in interpreting medical images, flagging anomalies for radiologist review.
Financial Services
Report generation: Analysts use LLMs to draft earnings summaries, risk assessments, and compliance reports.
Code modernisation: Banks use code-generation models to migrate legacy COBOL systems to modern languages.
Customer service: AI agents handle routine inquiries about account balances, transactions, and card disputes.
Software Engineering
Code completion: AI-assisted IDEs and coding assistants (Windsurf, Cursor, GitHub Copilot) have been reported in studies to speed up specific development tasks by 30-55%.
Test generation: Models write unit and integration tests from function signatures.
Code review: LLMs identify bugs, security vulnerabilities, and style violations in pull requests.
Marketing & E-Commerce
Content at scale: Product descriptions, ad copy, email campaigns, and social media posts generated in seconds.
Personalisation: Models tailor messaging to customer segments based on purchase history and preferences.
Image generation: Product mockups and lifestyle imagery created without photoshoots.
Limitations You Must Understand
Generative AI is powerful but not magical. Knowing its limitations is as important as knowing its capabilities.
Hallucination
Models generate plausible-sounding content by predicting likely next tokens -- they have no built-in concept of truth. They can confidently fabricate facts, citations, statistics, and code that looks correct but is not. Always verify outputs for any decision that matters.
Knowledge Cutoff
Models only know what was in their training data. They have no awareness of events after their cutoff date unless given tools (web search, RAG) to access current information. Asking a model about yesterday's news without tools will produce either an honest "I don't know" or a hallucinated answer.
Pattern Matching, Not Understanding
LLMs manipulate statistical patterns in language -- they do not "understand" concepts the way humans do. They cannot reliably reason from first principles in domains far outside their training distribution. They excel at tasks where pattern matching over vast data is sufficient, but struggle with novel logical puzzles or precise mathematical computation.
| Limitation | Impact | Mitigation |
|---|---|---|
| Hallucination | False information presented confidently | RAG, citations, human review |
| Knowledge cutoff | No awareness of recent events | Web search tools, retrieval-augmented generation |
| Context window limits | Cannot process arbitrarily long documents | Chunking, summarisation, hierarchical processing |
| Bias | Reflects and amplifies training data biases | Diverse test sets, bias audits, guardrails |
| Cost at scale | API costs grow with usage volume | Model tiering (use smaller models for simple tasks), caching, batching |
| Non-determinism | Same prompt can produce different outputs | Temperature=0 and seed parameters (these reduce variation but may not eliminate it), structured output formats |
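As one example of the mitigations in the table, chunking for context window limits can be sketched as below. This is a minimal word-based version under stated assumptions: the `chunk_size` and `overlap` values are arbitrary, and real pipelines usually count tokens with the model's own tokenizer instead of words.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Synthetic 500-word document, generated only to exercise the function.
document = " ".join(f"word{i}" for i in range(500))
chunks = chunk_words(document, chunk_size=200, overlap=20)
print(len(chunks))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of processing some words twice.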
ThreadCo Application: Where Does Gen AI Fit?
Returning to our story, Maya maps ThreadCo's pain points to generative AI capabilities:
| Pain Point | AI Solution | Model Type | Risk Level |
|---|---|---|---|
| Writing 2,000 product descriptions | LLM generates descriptions from product attributes | Text generation | Low (human can review batch) |
| 500 "where is my order?" emails/week | Agent queries order DB and drafts response | LLM + tool use | Medium (customer-facing) |
| Creating social media images | Diffusion model generates lifestyle product photos | Image generation | Low (internal review before posting) |
| Summarising customer reviews | LLM extracts themes and sentiment from review corpus | Text analysis | Low (internal use only) |
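The "LLM + tool use" row can be sketched as a lookup step followed by prompt construction. Everything here is hypothetical: the `ORDERS` records stand in for ThreadCo's order database, and the order ID is passed in explicitly, whereas in a real agent the model itself would decide when to call the tool and with which arguments.

```python
# Hypothetical order records standing in for ThreadCo's order database.
ORDERS = {
    "A1001": {"status": "shipped", "eta": "2 days"},
    "A1002": {"status": "processing", "eta": "5 days"},
}


def lookup_order(order_id: str) -> dict:
    """The 'tool': fetch an order record, or report that it was not found."""
    return ORDERS.get(order_id, {"status": "not found", "eta": None})


def build_reply_prompt(email: str, order_id: str) -> str:
    """Combine the customer's email with the tool result for the LLM to use."""
    order = lookup_order(order_id)
    return (
        "Draft a reply to the customer email below using only these facts.\n"
        f"Order {order_id}: status={order['status']}, eta={order['eta']}\n"
        f"Email: {email}"
    )


print(build_reply_prompt("Where is my order A1001?", "A1001"))
```

Supplying the order facts in the prompt keeps the drafted reply tied to the database record, which matters for a customer-facing, medium-risk use case.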
This programme focuses on LLMs and code generation tools. Complete the AI Foundations track first, then choose your tool track based on your role and the technology your organisation has adopted.
Key Terminology Glossary
Before proceeding to the next module, make sure you are comfortable with these terms. They appear throughout the rest of the programme.
| Term | Definition |
|---|---|
| Token | The basic unit of text that an LLM processes. On average, one token is roughly 0.75 English words. Models read and generate tokens, not characters or words. |
| Context window | The maximum number of tokens a model can process in a single request (input + output combined). Ranges from 4K to 1M+ tokens depending on the model. |
| Inference | The process of running a trained model to generate outputs. This is what happens when you send a prompt to an API. |
| Fine-tuning | Additional training of a pre-trained model on a specific dataset to specialise its behaviour for a particular task or domain. |
| RAG | Retrieval-Augmented Generation. A technique that retrieves relevant documents from a knowledge base and includes them in the model's context to ground its responses in verified data. |
| Prompt engineering | The practice of crafting inputs to LLMs to achieve desired outputs. The primary "programming" interface for generative AI. |
| Hallucination | When a model generates plausible-sounding but factually incorrect content. A fundamental limitation of all current generative models. |
| Multimodal | A model that can process and/or generate multiple types of content (text, images, audio, video) within a single interaction. |
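The RAG entry in the glossary can be illustrated with a toy retrieve-then-prompt sketch. The knowledge base entries are invented for illustration, and ranking by shared words is a simple stand-in chosen for readability; production systems more commonly rank by embedding similarity.

```python
import re

# Hypothetical knowledge base; a real system would use a document store.
KNOWLEDGE_BASE = [
    "Returns are accepted within 30 days of delivery.",
    "Standard shipping takes 3 to 5 business days.",
    "Gift cards cannot be exchanged for cash.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    query_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Place the retrieved text in the prompt so the answer can draw on it."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("How long does shipping take?", KNOWLEDGE_BASE))
```

The model never sees the whole knowledge base, only the retrieved passage, which is also how RAG helps a large corpus fit within a limited context window.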
Hands-On Exercises
For each scenario below, decide whether the task requires traditional (discriminative) AI or generative AI, and explain why:
- Detecting fraudulent credit card transactions in real time
- Writing personalised birthday emails to loyalty programme members
- Sorting incoming support tickets into categories (billing, shipping, returns)
- Generating alt-text descriptions for product images on a website
- Predicting which customers are likely to churn next month
Open any LLM (ChatGPT, Claude, Gemini) and ask it: "Who won the Nobel Prize in Literature in 2019 and what was their most famous work?" Verify the answer against Wikipedia. Then ask about a fictional award -- "Who won the Global Excellence Prize for Digital Innovation in 2023?" -- and observe how the model responds. Write down what you learn about when and why models hallucinate.
List five repetitive tasks in your own team or organisation. For each one, identify: (a) Could generative AI help? (b) Which model type would you use? (c) What is the risk level if the AI produces incorrect output? (d) Would a human need to review every output, or just a sample? Create a simple table like the ThreadCo example above.
Pick one task (e.g., "write a marketing email for a new product launch") and try it on at least two different LLMs (e.g., ChatGPT and Claude). Compare the outputs on: tone, accuracy, length, and creativity. Note three specific differences. What does this tell you about choosing models for production use?
Choose a real use case from your work. Write a one-page risk assessment covering: (a) Which of the six limitations from the table above apply? (b) How severe is each applicable limitation for this use case? (c) What specific mitigations would you implement? This exercise builds the muscle for the Safety & Ethics module later in the track.