Tool Design & MCP | AI Training Hub

Section 2 — Tool Design & MCP Integration

Tools are the bridge between Claude's reasoning and the outside world. A well-designed tool turns the model into a capable agent; a poorly designed one turns it into a confused slot machine. This section covers every layer of that bridge: from the description string that guides selection, through JSON Schema validation and structured errors, to the Model Context Protocol that standardises it all.

1. How Claude Selects Tools

When you attach tools to an API call, Claude receives each tool's name, description, and input_schema as part of the system prompt. The model then treats tool selection like any other decision: it reads the available options, reasons about the user's intent, and emits a tool_use content block with the chosen tool name and arguments.
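To make the round trip concrete, here is a minimal sketch of the data involved: a tool definition, the tool_use block the model emits, and the tool_result block your application sends back. The handler registry and the in-memory lookup are hypothetical stand-ins, not part of any SDK.

```typescript
// Field names follow the tool-use message format; everything else is illustrative.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

const searchProducts: ToolDefinition = {
  name: "search_products",
  description:
    "Searches the product catalogue by keyword or SKU. Returns matching items with price and stock status.",
  input_schema: {
    type: "object",
    properties: { query: { type: "string", description: "Keyword or SKU" } },
    required: ["query"],
  },
};

// Hypothetical handler standing in for a real catalogue lookup.
const handlers: Record<string, (input: Record<string, unknown>) => unknown> = {
  search_products: (input) => [{ sku: "ABC-12345", matched: input["query"] }],
};

// Turns one tool_use block into the tool_result block returned on the next turn.
function dispatch(block: ToolUseBlock): ToolResultBlock {
  if (!Object.prototype.hasOwnProperty.call(handlers, block.name)) {
    return {
      type: "tool_result",
      tool_use_id: block.id,
      content: `Unknown tool: ${block.name}`,
      is_error: true,
    };
  }
  const output = handlers[block.name](block.input);
  return { type: "tool_result", tool_use_id: block.id, content: JSON.stringify(output) };
}
```

The tool_use_id on the result must echo the id of the tool_use block, which is how the model matches results to calls.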

The Description Is the Decision

The tool name gives a hint, but the description does the heavy lifting. Claude weighs the description far more than the name when deciding which tool to call. A tool named search with the description "Queries the product catalogue by SKU or keyword and returns matching items with price and stock status" will be selected precisely. The same tool with the description "searches stuff" will be selected unpredictably.

Key concept: Claude does not have implicit knowledge of your system. It treats every tool description as a self-contained API contract. If your description is ambiguous, the model will guess — and guessing at scale is a production incident.

Ambiguity Causes Misselection

Consider two tools in the same request: get_user ("gets a user") and find_user ("finds a user"). From Claude's perspective these are indistinguishable. It has no way to know that one looks up by ID and the other searches by email. The result is non-deterministic tool selection — sometimes it picks one, sometimes the other — and your application breaks in ways that are difficult to reproduce.

Token Overhead of Many Tools

Every tool definition consumes input tokens. A single tool with a moderately detailed schema runs roughly 200–400 tokens. At 20 tools you are spending 4,000–8,000 tokens before the conversation even starts, and at 50 tools the definitions alone will usually cost far more than the user's message. Beyond cost, quality degrades: the more tools present, the harder it is for the model to select the right one. As a rule of thumb, 5–7 tools is the sweet spot for a single agent turn.

1–7 tools: High selection accuracy, low overhead. Ideal for focused agents.
8–15 tools: Workable with very distinct descriptions. Test selection accuracy carefully.
16–30 tools: Expect misselection. Consider splitting across multiple agents.
31+ tools: Almost certainly needs a routing layer or MCP server decomposition.
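The arithmetic above can be turned into a quick budget check. This is a rough sketch: the four-characters-per-token ratio is an assumption (real counts depend on the tokenizer), and the tier names are made up for illustration.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Assumption: roughly 4 characters per token. Treat the result as an estimate only.
const CHARS_PER_TOKEN = 4;

// Estimates the input-token overhead of a set of tool definitions.
function estimateToolTokens(tools: ToolDefinition[]): number {
  const chars = tools.reduce((sum, tool) => sum + JSON.stringify(tool).length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

type Tier = "focused" | "workable" | "risky" | "needs_routing";

// Tier boundaries follow the list above.
function toolCountTier(count: number): Tier {
  if (count <= 7) return "focused";
  if (count <= 15) return "workable";
  if (count <= 30) return "risky";
  return "needs_routing";
}
```

Running the estimate in CI against your real tool list gives an early warning when a new tool pushes an agent into the next tier.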

2. Writing Effective Tool Descriptions

A good tool description follows a formula: action verb + what it operates on + input expectations + output format + side effects (if any). Let's compare real-world examples.

Three Bad Descriptions

JSON

// BAD 1: Vague — tells the model nothing actionable
{
  "name": "search",
  "description": "Searches for things"
}

// BAD 2: Implementation-focused instead of behaviour-focused
{
  "name": "db_query",
  "description": "Runs a SQL SELECT on the PostgreSQL users table using pg_pool"
}

// BAD 3: Missing critical information
{
  "name": "send_email",
  "description": "Sends an email"
  // Does it take an address? A template ID? Does it actually send
  // or just queue? What does it return? The model has to guess everything.
}

Three Good Descriptions

JSON

// GOOD 1: Specific action, clear inputs, defined output
{
  "name": "search_products",
  "description": "Searches the product catalogue by keyword or SKU. Accepts a query string and optional category filter. Returns an array of matching products, each with id, name, price (USD cents), and stock_count. Returns an empty array if no matches are found."
}

// GOOD 2: States side effects explicitly
{
  "name": "create_order",
  "description": "Creates a new order for the given customer. Requires customer_id and an array of line items (product_id + quantity). Charges the customer's default payment method immediately. Returns the order object with status 'confirmed' or an error if payment fails. THIS TOOL HAS SIDE EFFECTS: it charges real money."
}

// GOOD 3: Disambiguates from similar tools
{
  "name": "lookup_user_by_id",
  "description": "Retrieves a single user record by their unique numeric ID. Use this when you already have the user's ID. Returns the full user profile including name, email, and role. Returns a not_found error if the ID does not exist. NOTE: To search users by name or email, use search_users instead."
}

Key concept: The "NOTE: To search users by name or email, use search_users instead" pattern is extremely effective. It creates a decision tree in the description itself, steering the model toward the correct tool before it even considers the wrong one.

Description Checklist

Starts with an action verb (searches, creates, retrieves, deletes, calculates)
Names the domain object it operates on (product, order, user, ticket)
States required inputs and their types in plain English
Describes the return value shape and edge cases (empty results, not found)
Flags side effects if the tool mutates state, sends messages, or costs money
Includes disambiguation notes if similar tools exist in the same set
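Parts of the checklist can be checked mechanically. The sketch below is a hypothetical lint pass: the verb list, the keyword patterns, and the 60-character threshold are illustrative assumptions, and a passing result does not prove a description is good.

```typescript
interface LintResult {
  passed: string[];
  failed: string[];
}

// Illustrative verb list; extend it to match your own domain.
const ACTION_VERBS = ["searches", "creates", "retrieves", "deletes", "calculates", "updates", "sends", "checks"];

const checks: { name: string; test: (description: string) => boolean }[] = [
  {
    name: "starts_with_action_verb",
    test: (d) => ACTION_VERBS.indexOf((d.trim().split(/\s+/)[0] ?? "").toLowerCase()) !== -1,
  },
  { name: "states_inputs", test: (d) => /\b(accepts|requires|takes)\b/i.test(d) },
  { name: "describes_return_value", test: (d) => /\breturns\b/i.test(d) },
  { name: "long_enough", test: (d) => d.trim().length >= 60 },
];

// Runs every heuristic and reports which ones the description passes or fails.
function lintDescription(description: string): LintResult {
  const result: LintResult = { passed: [], failed: [] };
  for (const check of checks) {
    (check.test(description) ? result.passed : result.failed).push(check.name);
  }
  return result;
}
```

Side effects and disambiguation notes still need a human reviewer, because only someone who knows the system can tell whether they are missing.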

3. JSON Schema Deep Dive

Claude's tool input_schema follows JSON Schema (draft 2020-12 compatible). A precise schema does two things: it guides the model to produce valid arguments, and it gives your application a contract for rejecting malformed calls before they reach business logic. The schema steers the model but does not guarantee valid arguments, so validate them in your own code as well. Here is a reference of the most useful keywords.

Primitive Types and Constraints

JSON Schema

{
  // String with length and pattern constraints
  "sku": {
    "type": "string",
    "description": "Product SKU in format ABC-12345",
    "pattern": "^[A-Z]{3}-\\d{5}$",
    "minLength": 9,
    "maxLength": 9
  },

  // String with format validation
  "email": {
    "type": "string",
    "description": "Customer email address",
    "format": "email"
  },

  // Date string
  "ship_date": {
    "type": "string",
    "description": "Requested shipping date",
    "format": "date"
  },

  // Integer with range
  "quantity": {
    "type": "integer",
    "description": "Number of units to order (1–500)",
    "minimum": 1,
    "maximum": 500
  },

  // Float with an inclusive minimum and an exclusive maximum
  "discount_rate": {
    "type": "number",
    "description": "Discount as a decimal (e.g. 0.15 for 15%)",
    "minimum": 0,
    "exclusiveMaximum": 1
  },

  // Boolean
  "express_shipping": {
    "type": "boolean",
    "description": "Whether to use express shipping (2-day)"
  },

  // Enum — restricts to exact values
  "priority": {
    "type": "string",
    "description": "Ticket priority level",
    "enum": ["low", "medium", "high", "critical"]
  },

  // Nullable string — allows null explicitly
  "notes": {
    "type": ["string", "null"],
    "description": "Optional order notes. Pass null if none."
  }
}

Nested Objects and Arrays

JSON Schema

{
  // Nested object with its own required fields
  "shipping_address": {
    "type": "object",
    "description": "Delivery address",
    "properties": {
      "street": { "type": "string" },
      "city":   { "type": "string" },
      "state":  { "type": "string", "pattern": "^[A-Z]{2}$" },
      "zip":    { "type": "string", "pattern": "^\\d{5}(-\\d{4})?$" }
    },
    "required": ["street", "city", "state", "zip"]
  },

  // Array of objects with min/max items
  "line_items": {
    "type": "array",
    "description": "Products to include in the order",
    "minItems": 1,
    "maxItems": 50,
    "items": {
      "type": "object",
      "properties": {
        "product_id": { "type": "integer" },
        "quantity":   { "type": "integer", "minimum": 1 }
      },
      "required": ["product_id", "quantity"]
    }
  }
}

Full Working Example

Here is a complete tool definition combining most of the patterns above into a realistic order-creation tool:

JSON

{
  "name": "create_order",
  "description": "Creates and submits a new order for a customer. Charges the customer's payment method on file immediately. Returns the order object with confirmation number and estimated delivery date, or a structured error if payment fails or inventory is insufficient.",
  "input_schema": {
    "type": "object",
    "properties": {
      "customer_id": {
        "type": "integer",
        "description": "Unique customer identifier"
      },
      "line_items": {
        "type": "array",
        "description": "One or more products to order",
        "minItems": 1,
        "maxItems": 50,
        "items": {
          "type": "object",
          "properties": {
            "product_id": { "type": "integer" },
            "quantity":   { "type": "integer", "minimum": 1, "maximum": 500 },
            "gift_note":  { "type": ["string", "null"], "maxLength": 200 }
          },
          "required": ["product_id", "quantity"]
        }
      },
      "shipping_address": {
        "type": "object",
        "properties": {
          "street":  { "type": "string", "maxLength": 200 },
          "city":    { "type": "string", "maxLength": 100 },
          "state":   { "type": "string", "pattern": "^[A-Z]{2}$" },
          "zip":     { "type": "string", "pattern": "^\\d{5}(-\\d{4})?$" },
          "country": { "type": "string", "enum": ["US", "CA", "MX"] }
        },
        "required": ["street", "city", "state", "zip", "country"]
      },
      "priority": {
        "type": "string",
        "enum": ["standard", "express", "overnight"],
        "description": "Shipping speed. Defaults to standard if omitted."
      },
      "coupon_code": {
        "type": ["string", "null"],
        "description": "Optional promotional code. Pass null if none.",
        "pattern": "^[A-Z0-9]{6,12}$"
      }
    },
    "required": ["customer_id", "line_items", "shipping_address"]
  }
}

Key concept: Descriptions on individual properties matter just as much as the top-level tool description. Claude reads them to decide what value to pass. A property named q with no description will get unpredictable values; a property named search_query with a description like "Full-text search string, supports AND/OR operators" will get exactly what you need.
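To show what "rejecting malformed calls" can look like, here is a deliberately small validator covering only the keywords used in this section. It is a teaching sketch with hypothetical names; a production system would more likely use an established JSON Schema library such as Ajv.

```typescript
type Schema = {
  type?: string | string[];
  enum?: unknown[];
  minimum?: number; maximum?: number; exclusiveMaximum?: number;
  minLength?: number; maxLength?: number; pattern?: string;
  properties?: Record<string, Schema>; required?: string[];
  items?: Schema; minItems?: number; maxItems?: number;
};

function typeOf(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "number") return Number.isInteger(value) ? "integer" : "number";
  return typeof value;
}

// Returns a map of field path to message; an empty map means the input is valid.
function validate(value: unknown, schema: Schema, path = "", errors: Record<string, string> = {}): Record<string, string> {
  const at = path || "(root)";
  const actual = typeOf(value);
  if (schema.type !== undefined) {
    const allowed = Array.isArray(schema.type) ? schema.type : [schema.type];
    // An integer also satisfies "number".
    const ok = allowed.indexOf(actual) !== -1 || (actual === "integer" && allowed.indexOf("number") !== -1);
    if (!ok) { errors[at] = `expected ${allowed.join(" or ")}, got ${actual}`; return errors; }
  }
  if (schema.enum && schema.enum.indexOf(value) === -1) errors[at] = `must be one of ${JSON.stringify(schema.enum)}`;
  if (typeof value === "number") {
    if (schema.minimum !== undefined && value < schema.minimum) errors[at] = `must be >= ${schema.minimum}`;
    if (schema.maximum !== undefined && value > schema.maximum) errors[at] = `must be <= ${schema.maximum}`;
    if (schema.exclusiveMaximum !== undefined && value >= schema.exclusiveMaximum) errors[at] = `must be < ${schema.exclusiveMaximum}`;
  }
  if (typeof value === "string") {
    if (schema.minLength !== undefined && value.length < schema.minLength) errors[at] = `shorter than ${schema.minLength} characters`;
    if (schema.maxLength !== undefined && value.length > schema.maxLength) errors[at] = `longer than ${schema.maxLength} characters`;
    if (schema.pattern !== undefined && !new RegExp(schema.pattern).test(value)) errors[at] = `does not match ${schema.pattern}`;
  }
  if (Array.isArray(value)) {
    if (schema.minItems !== undefined && value.length < schema.minItems) errors[at] = `needs at least ${schema.minItems} items`;
    if (schema.maxItems !== undefined && value.length > schema.maxItems) errors[at] = `allows at most ${schema.maxItems} items`;
    if (schema.items) value.forEach((item, i) => validate(item, schema.items as Schema, `${path}[${i}]`, errors));
  }
  if (actual === "object") {
    const obj = value as Record<string, unknown>;
    for (const key of schema.required ?? []) {
      if (!(key in obj)) errors[path ? `${path}.${key}` : key] = "is required";
    }
    for (const key of Object.keys(schema.properties ?? {})) {
      if (key in obj) validate(obj[key], (schema.properties as Record<string, Schema>)[key], path ? `${path}.${key}` : key, errors);
    }
  }
  return errors;
}
```

The path-keyed result is chosen on purpose: it can be passed straight through as field-level feedback when a call is rejected, so the model sees exactly which argument to fix.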

4. Structured Error Responses

When a tool call fails, the information you return determines whether Claude can recover gracefully or spirals into repeated failures. Returning a bare string like "Error: something went wrong" gives the model almost nothing to work with. It cannot distinguish a typo in the input (fixable) from a server outage (not fixable). It cannot decide whether to retry or apologise. Structured errors solve this.

The Structured Error Pattern

TypeScript

interface ToolError {
  error: true;                          // Always true — lets Claude detect errors reliably
  errorCategory:
    | "validation"                      // Bad input from the model
    | "not_found"                       // Resource does not exist
    | "permission"                      // Auth/authz failure
    | "rate_limit"                      // Throttled — try again later
    | "conflict"                        // State conflict (e.g. duplicate)
    | "internal";                       // Unexpected server failure
  isRetryable: boolean;                 // Is another attempt worthwhile (after fixing inputs or waiting)?
  message: string;                      // Human-readable explanation
  retryAfter?: number;                  // Seconds to wait before retrying (rate_limit)
  invalidFields?: Record<string, string>; // Field-level validation errors
}

Bad vs Good Error Examples

JSON

// BAD: Claude has no idea what to do with this
{
  "result": "Error: invalid request"
}

// BAD: Stack trace is noise for the model
{
  "error": "TypeError: Cannot read property 'id' of undefined\n    at Object.handler (/app/src/orders.js:42:15)\n    at ..."
}

// GOOD: Validation error — Claude can fix and retry
{
  "error": true,
  "errorCategory": "validation",
  "isRetryable": true,
  "message": "The shipping address is missing a required field.",
  "invalidFields": {
    "shipping_address.zip": "ZIP code is required and must be 5 or 9 digits"
  }
}

// GOOD: Rate limit — Claude knows to wait
{
  "error": true,
  "errorCategory": "rate_limit",
  "isRetryable": true,
  "message": "API rate limit exceeded. Try again after the specified delay.",
  "retryAfter": 30
}

// GOOD: Not found — Claude can inform the user
{
  "error": true,
  "errorCategory": "not_found",
  "isRetryable": false,
  "message": "No customer found with ID 99421. Ask the user to confirm the customer ID."
}

The isRetryable flag is especially powerful. When Claude sees isRetryable: true with a validation error, it knows to adjust its arguments and call the tool again. When it sees isRetryable: false, it knows to report the failure to the user instead of wasting tokens on futile retries.
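If your own orchestration code (rather than the model) decides what happens after a failure, the same fields drive that decision. The policy below is a hypothetical sketch: the action names and the three-attempt cap are assumptions, chosen so that retryable errors cannot loop forever.

```typescript
type ErrorCategory = "validation" | "not_found" | "permission" | "rate_limit" | "conflict" | "internal";

interface ToolError {
  error: true;
  errorCategory: ErrorCategory;
  isRetryable: boolean;
  message: string;
  retryAfter?: number;
  invalidFields?: Record<string, string>;
}

type NextStep =
  | { action: "fix_and_retry"; fields: string[] }
  | { action: "wait_and_retry"; delaySeconds: number }
  | { action: "retry" }
  | { action: "report_to_user"; message: string };

// Maps a structured error to the next move in an agent loop.
function nextStep(err: ToolError, attempt: number, maxAttempts = 3): NextStep {
  // Non-retryable errors and exhausted budgets both end with a report to the user.
  if (!err.isRetryable || attempt >= maxAttempts) {
    return { action: "report_to_user", message: err.message };
  }
  if (err.errorCategory === "validation") {
    return { action: "fix_and_retry", fields: Object.keys(err.invalidFields ?? {}) };
  }
  if (err.errorCategory === "rate_limit") {
    return { action: "wait_and_retry", delaySeconds: err.retryAfter ?? 1 };
  }
  return { action: "retry" };
}
```

The attempt cap matters as much as the flag: a tool that keeps returning isRetryable: true would otherwise recreate the retry spiral that structured errors are meant to prevent.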

Key concept: Structured errors are part of your tool's contract. Design them with the same care you give to success responses. In agentic loops, the difference between a structured error and a string error is the difference between self-healing and infinite retry spirals.

5. MCP Architecture

The Model Context Protocol (MCP) is an open standard created by Anthropic that defines how AI applications discover and interact with external tools, data sources, and prompt templates. Think of it as the USB-C of AI integrations: a single, standardised plug that replaces dozens of bespoke connectors.

Why MCP Exists

Before MCP, every AI application implemented tool calling in its own way. If you built a Slack integration for Claude, that code could not be reused with a different AI host. If a vendor created a database connector, each AI framework needed its own adapter. MCP eliminates this N-times-M problem by defining a single protocol: any MCP-compliant server works with any MCP-compliant client, regardless of who built either side.

Servers and Clients

MCP Server — A lightweight process that exposes tools, resources, and prompts. It could wrap a database, an API, a filesystem, or any external system. It speaks the MCP protocol.
MCP Client — A connector inside the host that maintains a one-to-one connection with a single MCP server. It handles the protocol exchange and passes the server's capabilities on to the host.
MCP Host — The AI-powered, user-facing application (like Claude Desktop, Claude Code, or your custom agent) that makes server capabilities available to the model. A single host may run multiple clients, each connected to a different server.

Transport Protocols

MCP defines two transport mechanisms:

stdio (Standard I/O) — The client spawns the server as a child process and communicates via stdin/stdout. Best for local tools, file-system access, and development. Zero network configuration required.
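Streamable HTTP — The server runs as an HTTP endpoint and the client connects over the network. Responses can optionally be streamed with Server-Sent Events (SSE); this transport supersedes the standalone HTTP+SSE transport of earlier protocol versions. Best for remote services, shared servers, and cloud-deployed tools. Supports authentication headers.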
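For the stdio transport, the specification describes messages as newline-delimited JSON-RPC: one JSON object per line, with no embedded newlines. The sketch below shows how a client might frame outgoing messages and reassemble incoming chunks; the class and function names are hypothetical, and the official SDK transports already do this for you.

```typescript
// Accumulates raw stdout chunks and returns every complete JSON-RPC message.
class LineBuffer {
  private pending = "";

  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line, or "" if the chunk ended on a newline.
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

// Serialises one message for the wire: a single line terminated by a newline.
// JSON.stringify without an indent argument never emits a raw newline.
function frame(message: object): string {
  return JSON.stringify(message) + "\n";
}
```

This framing is also why the server in this section logs with console.error: anything written to stdout that is not a protocol message would corrupt the stream.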

Server Lifecycle

An MCP session follows a strict lifecycle:

1. Initialize — The client sends an initialize request with its supported protocol version and capabilities. The server responds with its own version and capabilities, and the client confirms with an initialized notification.
2. Capability Discovery — The client calls tools/list, resources/list, and prompts/list to discover everything the server offers. Each tool comes with its name, description, and JSON Schema.
3. Operation — The client invokes tools (tools/call), reads resources (resources/read), or fetches prompts (prompts/get) as needed during the conversation.
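4. Shutdown — The session ends at the transport level rather than through a dedicated protocol message. For stdio transport, the client closes the server's input stream and terminates the child process if it does not exit on its own.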
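On the wire, these steps are JSON-RPC 2.0 messages. The sketch below builds the client side of a short session. The protocol version string, client name, and order ID are placeholder assumptions; the method names are the ones listed above.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Builds a request with a fresh id; notifications carry no id and expect no reply.
function request(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const message: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params) message.params = params;
  return message;
}

const session: (JsonRpcRequest | JsonRpcNotification)[] = [
  // 1. Initialize (version string and client info are placeholders)
  request("initialize", {
    protocolVersion: "2025-06-18",
    capabilities: {},
    clientInfo: { name: "example-client", version: "0.1.0" },
  }),
  { jsonrpc: "2.0", method: "notifications/initialized" },
  // 2. Capability discovery
  request("tools/list"),
  request("resources/list"),
  request("prompts/list"),
  // 3. Operation
  request("tools/call", { name: "get_order", arguments: { order_id: 1042 } }),
];
```

Ending the session is not shown because it involves closing the transport rather than sending another message. A real client would also wait for each response before moving on, and would only list the capabilities the server declared during initialization.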

Key concept: Discovery is what makes MCP powerful. The client does not need to know in advance what tools a server offers. It asks, the server answers, and the tools are immediately available to the model. This means you can add new tools to a server and connected clients pick them up the next time they list tools (a server can also announce changes with a tools list-changed notification), with no code changes needed on the client side.

6. MCP Primitives: Tools, Resources, and Prompts

MCP defines three distinct primitive types. Understanding the differences is critical because each one has a different control flow — who triggers it and when.

Tools — Model-Controlled

Tools are functions the model decides to invoke. They are the MCP equivalent of function calling. The model sees the tool's schema, decides it needs to call it, generates the arguments, and the client executes the call against the server. Examples: running a database query, sending an email, creating a file.

JSON

// Tool definition as returned by tools/list
{
  "name": "query_customers",
  "description": "Queries the customer database. Accepts a SQL WHERE clause (read-only). Returns up to 100 matching rows as JSON objects.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "where_clause": {
        "type": "string",
        "description": "SQL WHERE condition, e.g. \"status = 'active' AND country = 'US'\""
      },
      "limit": {
        "type": "integer",
        "minimum": 1,
        "maximum": 100
      }
    },
    "required": ["where_clause"]
  }
}

Resources — Application-Controlled

Resources are data the application (not the model) decides to fetch. They use URI-based addressing and behave like read-only data sources. The application might attach a resource to the conversation as context, or the user might select one from a list. Examples: the contents of a file (file:///path/to/doc.md), a database record (db://customers/42), a configuration snapshot.

JSON

// Resource template as returned by resources/templates/list
// (resources/list returns concrete URIs in a "uri" field instead)
{
  "uriTemplate": "db://customers/{id}",
  "name": "Customer Record",
  "description": "Full customer profile including contact info, order history summary, and account status.",
  "mimeType": "application/json"
}

The key difference from tools: the model does not invoke resources. The host application fetches them and injects the data into the conversation context. This is appropriate for large, read-only data that the model needs to reason about but should not be fetching on its own.
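A minimal sketch of that flow, assuming the db://customers/{id} scheme from the example: the server side answers a read with a contents array, and the host decides when to fetch it and how to place it in the context. The in-memory store and the bracketed wrapper format are hypothetical.

```typescript
interface ResourceContents {
  uri: string;
  mimeType: string;
  text: string;
}

interface ReadResourceResult {
  contents: ResourceContents[];
}

// Hypothetical in-memory data standing in for a real customer store.
const customers: Record<string, { name: string; status: string }> = {
  "42": { name: "Ada Example", status: "active" },
};

// Extracts the id from URIs shaped like db://customers/{id}; returns null otherwise.
function parseCustomerUri(uri: string): string | null {
  const match = /^db:\/\/customers\/(\d+)$/.exec(uri);
  return match ? (match[1] ?? null) : null;
}

// Server side: answers a resources/read request for one URI.
function readResource(uri: string): ReadResourceResult {
  const id = parseCustomerUri(uri);
  const record = id === null ? undefined : customers[id];
  if (!record) throw new Error(`Resource not found: ${uri}`);
  return { contents: [{ uri, mimeType: "application/json", text: JSON.stringify(record) }] };
}

// Host side: the application, not the model, chooses to fetch and inject the data.
function buildContext(uri: string): string {
  const { contents } = readResource(uri);
  return contents.map((c) => `[resource ${c.uri}]\n${c.text}`).join("\n");
}
```

Nothing in this path requires a model turn, which is the practical meaning of "application-controlled".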

Prompts — User-Controlled

Prompts are parameterised templates triggered by the user. Think of them as slash commands. A prompt defines a reusable interaction pattern — for example, a code review template that takes a file path as an argument and produces a structured review.

JSON

// Prompt definition as returned by prompts/list
{
  "name": "code_review",
  "description": "Generates a structured code review for a given file",
  "arguments": [
    {
      "name": "file_path",
      "description": "Path to the source file to review",
      "required": true
    },
    {
      "name": "focus_area",
      "description": "Specific aspect to focus on: security, performance, readability",
      "required": false
    }
  ]
}
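When the user triggers this prompt, the client sends prompts/get with the chosen arguments and the server returns ready-made messages. The sketch below shows one way a server might render them; the template wording is invented for illustration, and only the argument names come from the definition above.

```typescript
interface PromptArgument {
  name: string;
  description: string;
  required: boolean;
}

interface PromptMessage {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
}

interface GetPromptResult {
  description: string;
  messages: PromptMessage[];
}

const codeReviewArgs: PromptArgument[] = [
  { name: "file_path", description: "Path to the source file to review", required: true },
  { name: "focus_area", description: "Specific aspect to focus on: security, performance, readability", required: false },
];

// Validates required arguments, then fills the (hypothetical) template.
function getCodeReviewPrompt(args: Record<string, string>): GetPromptResult {
  for (const arg of codeReviewArgs) {
    if (arg.required && !args[arg.name]) {
      throw new Error(`Missing required argument: ${arg.name}`);
    }
  }
  const focus = args["focus_area"] ? ` Focus on ${args["focus_area"]}.` : "";
  return {
    description: "Generates a structured code review for a given file",
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text: `Review the code in ${args["file_path"]}.${focus} Structure the review as findings, risks, and suggestions.`,
        },
      },
    ],
  };
}
```

The result is ordinary conversation messages, so the host can show them to the user for confirmation before anything is sent to the model.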

Why the Distinction Matters

Safety: Resources and prompts have human-in-the-loop control. Tools give control to the model. Separating these primitives makes it clear who authorises each action.
Token efficiency: Resources can be large documents injected once as context. If these were tools, the model would waste turns calling them and you would pay for the overhead of tool call round-trips.
UX design: Prompts map naturally to user-facing commands (slash commands, buttons). Tools map to model-driven automation. Conflating them leads to confused interfaces.

7. Tool Distribution Strategy

Why 5–7 Tools per Agent Is Optimal

There are three forces at play when you add tools:

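Selection accuracy: With fewer options, the model picks the right tool more consistently. As an illustrative rule of thumb rather than a published benchmark, selection accuracy at 5 tools is typically above 95%, while at 20 tools it can drop below 80% even with good descriptions. Measure it on your own tool set.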
Token cost: Each tool schema costs 200–400 input tokens. At 30 tools, you spend 6,000–12,000 tokens per request just on tool definitions.
Reasoning overhead: The model must consider all available tools before deciding. More tools means more internal reasoning, which translates to higher latency and more output tokens spent on deliberation.

Splitting a 30-Tool System

Suppose you have an e-commerce platform with 30 operations: product CRUD, order management, inventory, customer support, analytics, and shipping. Instead of handing all 30 tools to one agent, split them by domain. The sketch below shows three of those domains; the remaining ones would be split out the same way:

Text

Routing Agent (3 tools)
  ├── route_to_product_agent    → Product Agent (6 tools)
  │                                ├── search_products
  │                                ├── get_product
  │                                ├── create_product
  │                                ├── update_product
  │                                ├── delete_product
  │                                └── get_product_reviews
  │
  ├── route_to_order_agent      → Order Agent (5 tools)
  │                                ├── create_order
  │                                ├── get_order
  │                                ├── cancel_order
  │                                ├── refund_order
  │                                └── list_customer_orders
  │
  └── route_to_support_agent    → Support Agent (6 tools)
                                   ├── search_tickets
                                   ├── create_ticket
                                   ├── update_ticket
                                   ├── assign_ticket
                                   ├── resolve_ticket
                                   └── get_ticket_history

The Routing Agent Pattern

The routing agent sits at the top of the hierarchy. It has only 3–5 tools, each of which delegates to a specialised sub-agent. The routing agent's job is classification, not execution. Its tool descriptions are broad: "Handles all product catalogue operations including search, CRUD, and reviews." The sub-agent's descriptions are precise.

This pattern has compounding benefits: each sub-agent gets a focused system prompt tuned to its domain, carries only the tools it needs, and can maintain domain-specific conversation context. The routing agent stays lightweight and fast.
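In code, the hand-off can be as small as a lookup table. This sketch assumes the three routes from the diagram; the system prompt wording and the request shape are placeholders, not a prescribed format.

```typescript
type Route = "route_to_product_agent" | "route_to_order_agent" | "route_to_support_agent";

interface SubAgent {
  systemPrompt: string; // placeholder wording
  tools: string[];      // tool names taken from the diagram above
}

const subAgents: Record<Route, SubAgent> = {
  route_to_product_agent: {
    systemPrompt: "You handle product catalogue operations.",
    tools: ["search_products", "get_product", "create_product", "update_product", "delete_product", "get_product_reviews"],
  },
  route_to_order_agent: {
    systemPrompt: "You handle order management.",
    tools: ["create_order", "get_order", "cancel_order", "refund_order", "list_customer_orders"],
  },
  route_to_support_agent: {
    systemPrompt: "You handle customer support tickets.",
    tools: ["search_tickets", "create_ticket", "update_ticket", "assign_ticket", "resolve_ticket", "get_ticket_history"],
  },
};

interface SubAgentRequest {
  system: string;
  toolNames: string[];
  messages: { role: "user"; content: string }[];
}

// The routing agent only classifies. The chosen sub-agent receives the original
// request together with its own prompt and its own small tool set.
function handOff(route: Route, userMessage: string): SubAgentRequest {
  const agent = subAgents[route];
  return {
    system: agent.systemPrompt,
    toolNames: agent.tools,
    messages: [{ role: "user", content: userMessage }],
  };
}
```

Because each sub-agent request carries at most six tool definitions, every request stays inside the focused range no matter how many domains the platform grows to.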

Key concept: Tool distribution is not just about reducing token count. It is about giving each agent a clear, bounded scope so it can reason effectively. A focused agent with 6 tools will typically outperform a general agent with 30 tools on selection accuracy, per-request cost, and reliability, at the price of one extra routing hop.

8. Building an MCP Server

Let's build a complete MCP server that exposes a SQLite database as a set of tools. This server will allow Claude to query customers, look up orders, and check inventory — all through the standard MCP protocol.

Step 1: Project Setup

Bash

mkdir mcp-database-server && cd mcp-database-server
npm init -y
npm install @modelcontextprotocol/sdk better-sqlite3 zod
npm install -D typescript tsx @types/node @types/better-sqlite3
npx tsc --init

Step 2: Server Implementation

TypeScript

// src/index.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import Database from "better-sqlite3";
import { z } from "zod";

// Open (or create) the database; DATABASE_PATH is set by the client config in Step 3
const db = new Database(process.env.DATABASE_PATH ?? "./store.db");

// Create the MCP server
const server = new McpServer({
  name: "database-server",
  version: "1.0.0",
});

// ── Tool 1: Query Customers ────────────────────────────
server.tool(
  "query_customers",
  "Searches the customers table. Accepts a search term that matches against " +
  "name or email (case-insensitive). Returns up to 20 matching customer " +
  "records with id, name, email, and status. Returns an empty array if " +
  "no customers match.",
  {
    search_term: z.string().describe("Name or email substring to search for"),
    limit: z.number().int().min(1).max(20).default(10).describe("Max results to return"),
  },
  async ({ search_term, limit }) => {
    const rows = db
      .prepare(
        `SELECT id, name, email, status FROM customers
         WHERE name LIKE ? OR email LIKE ?
         LIMIT ?`
      )
      .all(`%${search_term}%`, `%${search_term}%`, limit);

    return {
      content: [{ type: "text", text: JSON.stringify(rows, null, 2) }],
    };
  }
);

// ── Tool 2: Get Order Details ──────────────────────────
server.tool(
  "get_order",
  "Retrieves a single order by its numeric ID. Returns the order with " +
  "customer name, line items, total amount (USD cents), and current status. " +
  "Returns a not_found error if the order ID does not exist.",
  {
    order_id: z.number().int().positive().describe("The unique order ID"),
  },
  async ({ order_id }) => {
    const order = db
      .prepare(
        `SELECT o.id, o.status, o.total_cents, o.created_at, c.name as customer_name
         FROM orders o JOIN customers c ON o.customer_id = c.id
         WHERE o.id = ?`
      )
      .get(order_id);

    if (!order) {
      return {
        content: [{
          type: "text",
          text: JSON.stringify({
            error: true,
            errorCategory: "not_found",
            isRetryable: false,
            message: `No order found with ID ${order_id}`,
          }),
        }],
        isError: true,
      };
    }

    const items = db
      .prepare(
        `SELECT p.name, li.quantity, li.unit_price_cents
         FROM line_items li JOIN products p ON li.product_id = p.id
         WHERE li.order_id = ?`
      )
      .all(order_id);

    return {
      content: [{
        type: "text",
        text: JSON.stringify({ ...order, line_items: items }, null, 2),
      }],
    };
  }
);

// ── Tool 3: Check Inventory ────────────────────────────
server.tool(
  "check_inventory",
  "Checks current stock levels for a product by its ID or SKU. " +
  "Returns the product name, SKU, current quantity in stock, and " +
  "reorder threshold. Use this before creating orders to verify availability.",
  {
    product_id: z.number().int().positive().optional()
      .describe("Product ID (provide this OR sku, not both)"),
    sku: z.string().optional()
      .describe("Product SKU (provide this OR product_id, not both)"),
  },
  async ({ product_id, sku }) => {
    if (!product_id && !sku) {
      return {
        content: [{
          type: "text",
          text: JSON.stringify({
            error: true,
            errorCategory: "validation",
            isRetryable: true,
            message: "Provide either product_id or sku",
          }),
        }],
        isError: true,
      };
    }

    const row = product_id
      ? db.prepare("SELECT * FROM products WHERE id = ?").get(product_id)
      : db.prepare("SELECT * FROM products WHERE sku = ?").get(sku);

    if (!row) {
      return {
        content: [{
          type: "text",
          text: JSON.stringify({
            error: true,
            errorCategory: "not_found",
            isRetryable: false,
            message: `Product not found`,
          }),
        }],
        isError: true,
      };
    }

    return {
      content: [{ type: "text", text: JSON.stringify(row, null, 2) }],
    };
  }
);

// ── Start the server ───────────────────────────────────
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("MCP Database Server running on stdio");
}

main().catch(console.error);
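The three handlers above build the same error envelope by hand. One possible refactor is a pair of small helpers, sketched below. The helper names and the default retryability mapping are assumptions; adjust the mapping to match how your backend actually behaves.

```typescript
type ErrorCategory = "validation" | "not_found" | "permission" | "rate_limit" | "conflict" | "internal";

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Assumption: only these categories are worth another attempt by default.
const RETRYABLE: ErrorCategory[] = ["validation", "rate_limit"];

// Wraps the structured error payload in the tool-result envelope used above.
function toolError(
  category: ErrorCategory,
  message: string,
  extra: { retryAfter?: number; invalidFields?: Record<string, string> } = {}
): ToolResult {
  const payload = {
    error: true,
    errorCategory: category,
    isRetryable: RETRYABLE.indexOf(category) !== -1,
    message,
    ...extra,
  };
  return { content: [{ type: "text", text: JSON.stringify(payload) }], isError: true };
}

// Wraps any JSON-serialisable value as a successful result.
function toolSuccess(data: unknown): ToolResult {
  return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
}
```

With these in place, the not-found branch of get_order could shrink to a single line such as return toolError("not_found", `No order found with ID ${order_id}`), which keeps every tool's error shape identical by construction.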

Step 3: Configure the Client

To connect Claude Code to your MCP server, create a .mcp.json file in your project root:

JSON

{
  "mcpServers": {
    "database": {
      "command": "npx",
      "args": ["tsx", "src/index.ts"],
      "cwd": "/path/to/mcp-database-server",
      "env": {
        "DATABASE_PATH": "./store.db"
      }
    }
  }
}

For Claude Desktop, the equivalent configuration goes in claude_desktop_config.json:

JSON

{
  "mcpServers": {
    "database": {
      "command": "npx",
      "args": ["tsx", "/absolute/path/to/mcp-database-server/src/index.ts"],
      "env": {
        "DATABASE_PATH": "/absolute/path/to/store.db"
      }
    }
  }
}

Step 4: Test the Server

The MCP project provides an Inspector (a separate package, run with npx) that you can use to verify your server works before connecting it to Claude:

Bash

npx @modelcontextprotocol/inspector npx tsx src/index.ts

This opens a web UI where you can list tools, call them with test inputs, and inspect the responses — invaluable for debugging schemas and error handling before going live.

Key concept: Notice how each tool in the server follows every principle from this section: precise descriptions with action verbs and output formats, constrained schemas with Zod validation, and structured error responses with categories and retryability. MCP is just the transport — good tool design is what makes it work.

Section Summary

Tool design is where production quality is won or lost. The principles in this section apply whether you are using raw API tool calling or MCP:

Tool descriptions are the primary input for tool selection — treat them as API documentation for the model
JSON Schema constraints, backed by validation in your tool handler, stop malformed inputs before they reach your business logic
Structured errors enable self-healing agent loops instead of infinite retry spirals
MCP standardises discovery, transport, and invocation across any AI application
The three MCP primitives (tools, resources, prompts) map to three control surfaces: model, application, and user
Keep agents focused at 5–7 tools; use routing patterns for larger systems
Test MCP servers with the inspector before connecting them to production agents