Windsurf Track
Module 30
Refactoring ShopMate: Six months of rapid growth left ShopMate with a 500-line service file and duplicated prompt templates across three features. The developer uses Windsurf's advanced patterns -- TDD, incremental refactoring, architecture chat -- to clean it up without breaking the live customer-facing features.

Windsurf Advanced Techniques

These patterns separate productive Windsurf users from exceptional ones. Each addresses a professional development workflow that benefits significantly from Cascade's agentic capabilities. This module covers power-user strategies for test-driven development, multi-file editing, large codebase navigation, performance optimisation, and the iterative refinement patterns that produce production-quality code from AI-assisted workflows.

Core Advanced Patterns

Test-Driven Flows

Describe the desired behaviour; ask Cascade to write failing tests first, then implement to pass them. "Write tests for a rate limiter that allows 100 requests per minute per IP, then implement the middleware." This produces better-structured code and a built-in verification loop.

Iterative Refinement

Treat Flows as conversations. Start broad: "Implement the payment webhook handler." Review the plan, then refine: "Good, but use the strategy pattern for different payment providers, not a big if-else." Cascade updates its approach and re-plans.

Architecture Exploration

Use Chat mode for architectural discussions before implementation: "What are the trade-offs between a message queue vs direct API calls for our notification system, given our current infrastructure?" Cascade reasons over your specific codebase context.

Incremental Refactoring

For large-scale refactoring, use small incremental Flows over one giant change. "Migrate UserService.ts to the repository pattern" rather than "Migrate all services." Smaller diffs are easier to review and safer to apply -- and easier to roll back if something breaks.

Test-Driven Development with Cascade

Test-driven development (TDD) becomes significantly more powerful with Cascade because the AI can autonomously run the red-green-refactor cycle. Here is the process:

  1. Red: Ask Cascade to write failing tests that describe the desired behaviour. Be specific about edge cases. "Write tests for a currency converter that handles USD, EUR, GBP. Test: converting 0 returns 0. Test: unknown currency raises ValueError. Test: negative amounts raise ValueError."
  2. Green: In the same Flow (or a follow-up), ask Cascade to implement the code that makes all tests pass. Cascade runs the tests in the terminal, reads failures, and iterates until green.
  3. Refactor: Once tests pass, start a new Flow: "Refactor the currency converter implementation for clarity and performance. Do not break any existing tests." Cascade refactors with confidence because the test suite acts as a safety net.

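The key insight: when you give Cascade tests to pass, you constrain its solution space. Instead of generating code that "looks right," it generates code that is checked against your specifications every time the suite runs. Passing tests do not prove correctness beyond the cases you wrote, so the quality of the result depends on how well the tests cover edge cases. Used this way, TDD is one of the most effective techniques for getting high-quality output from AI-assisted development.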
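As a minimal sketch of the cycle above, the three tests named in step 1 could look like the following, shown together with one implementation that turns them green. The `convert` signature and the `RATES_TO_USD` table are assumptions made for illustration, and the rates are made up.

```python
import unittest

# Hypothetical, made-up rates: value of one unit of each currency in USD.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}


def convert(amount: float, source: str, target: str) -> float:
    """Convert amount from source to target currency, going through USD."""
    if amount < 0:
        raise ValueError("amount must not be negative")
    for code in (source, target):
        if code not in RATES_TO_USD:
            raise ValueError(f"unknown currency: {code}")
    return round(amount * RATES_TO_USD[source] / RATES_TO_USD[target], 2)


class TestConvert(unittest.TestCase):
    """Written first (red); convert() above is what makes them pass (green)."""

    def test_zero_returns_zero(self):
        self.assertEqual(convert(0, "USD", "EUR"), 0)

    def test_unknown_currency_raises(self):
        with self.assertRaises(ValueError):
            convert(10, "USD", "XYZ")

    def test_negative_amount_raises(self):
        with self.assertRaises(ValueError):
            convert(-5, "USD", "EUR")
```

Saved as a test module, this can be run with `python -m unittest`. In a real Flow the test class would exist and fail before `convert` is written.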

Text -- TDD Flow: Complete Red-Green-Refactor Cycle
# Step 1: RED -- write failing tests first

Write failing tests for a ShopMate discount calculator.

File: tests/test_discounts.py

Test cases:
1. 10% off a $50 item = $45.00
2. Buy-one-get-one on 3 items at $20 = $40.00 (cheapest free)
3. Stacking: 10% off + free shipping should apply 10% only to item price
4. Discount cannot make price negative (floor at $0.00)
5. Expired discount code raises DiscountExpiredError
6. Invalid discount code raises DiscountNotFoundError
7. Discount percentage > 100 raises ValueError at creation time

Do NOT implement yet. Just the tests. Run them to confirm they fail.

# Step 2: GREEN -- implement to pass

Now implement shopmate/pricing/discounts.py to make all tests pass.
Run: pytest tests/test_discounts.py -v after each change.
Iterate until all 7 tests are green.

# Step 3: REFACTOR -- clean up with safety net

Refactor shopmate/pricing/discounts.py:
- Extract the discount strategy into a DiscountStrategy base class
- Create PercentageDiscount and BOGODiscount subclasses
- Ensure all 7 tests still pass after refactoring
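For orientation, the structure requested in Step 3 might end up looking roughly like the sketch below. Only the class names come from the prompt; the `apply` method, the use of `Decimal`, and the reading of buy-one-get-one as "the single cheapest item is free" are assumptions. Expiry and code lookup (test cases 5 and 6) are left out.

```python
from abc import ABC, abstractmethod
from decimal import Decimal

CENT = Decimal("0.01")


class DiscountStrategy(ABC):
    """Base class: each strategy turns item prices into a total to pay."""

    @abstractmethod
    def apply(self, prices: list[Decimal]) -> Decimal:
        ...


class PercentageDiscount(DiscountStrategy):
    def __init__(self, percent: Decimal) -> None:
        # Test case 7: reject impossible percentages at creation time.
        if not Decimal("0") <= percent <= Decimal("100"):
            raise ValueError("percent must be between 0 and 100")
        self.percent = percent

    def apply(self, prices: list[Decimal]) -> Decimal:
        total = sum(prices, Decimal("0"))
        discounted = total * (Decimal("100") - self.percent) / Decimal("100")
        # Test case 4: never return a negative price.
        return max(discounted, Decimal("0")).quantize(CENT)


class BOGODiscount(DiscountStrategy):
    """Assumed rule: with two or more items, the cheapest one is free."""

    def apply(self, prices: list[Decimal]) -> Decimal:
        total = sum(prices, Decimal("0"))
        if len(prices) >= 2:
            total -= min(prices)
        return total.quantize(CENT)
```

Under these assumptions, `PercentageDiscount(Decimal("10")).apply([Decimal("50")])` gives 45.00 and `BOGODiscount().apply([Decimal("20")] * 3)` gives 40.00, matching test cases 1 and 2.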

Multi-File Editing Strategies

Cascade's ability to edit multiple files in one operation is its most distinctive capability. But multi-file edits require careful prompt engineering to ensure consistency across files. Here are the strategies that produce the best results:

Strategy 1: Specify the Dependency Order

When files depend on each other, tell Cascade the order: "Create the model first, then the repository that uses it, then the service that uses the repository, then the router that uses the service." This prevents Cascade from writing a service that imports a repository with the wrong method signatures.

Strategy 2: Reference Existing Patterns

The most reliable way to get consistent code across files is to point Cascade at existing examples: "Follow the same pattern as @src/services/user_service.py and @src/repositories/user_repository.py." Cascade will replicate the structure, naming conventions, error handling patterns, and test style from your existing code.

Strategy 3: Pin Your Architectural Files

Pin your base classes, interfaces, and configuration files in the Cascade panel. When BaseRepository is pinned, Cascade is far more likely to make each new repository inherit from it. When your Pydantic base model is pinned, new models tend to follow the same field naming conventions. Still check the generated classes, because pinned context guides Cascade rather than enforcing anything.

Strategy 4: Use Type Boundaries

Define the interfaces between your files explicitly in the prompt: "The repository returns Optional[User]. The service raises UserNotFoundError if the repository returns None. The router catches UserNotFoundError and returns 404." This prevents type mismatches at file boundaries, which are the most common multi-file editing error.
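The boundary contract quoted in that prompt can be sketched in a few lines. This is a framework-free illustration: the `User` fields, the in-memory repository, and `get_user_route` (which returns a status code and body instead of using a real web framework) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    id: int
    name: str


class UserNotFoundError(Exception):
    """Raised by the service layer when a user does not exist."""


class UserRepository:
    """Data access layer: returns Optional[User], never raises for a miss."""

    def __init__(self, rows: dict[int, User]) -> None:
        self._rows = rows

    def get(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)


class UserService:
    """Business layer: converts a missing row into a domain error."""

    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo

    def get_user(self, user_id: int) -> User:
        user = self._repo.get(user_id)
        if user is None:
            raise UserNotFoundError(user_id)
        return user


def get_user_route(service: UserService, user_id: int) -> tuple[int, dict]:
    """Router stand-in: maps the domain error to a 404 response."""
    try:
        user = service.get_user(user_id)
    except UserNotFoundError:
        return 404, {"detail": "user not found"}
    return 200, {"id": user.id, "name": user.name}
```

Each layer handles exactly one translation (None to exception, exception to status code), which is the property to look for when reviewing a multi-file diff.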

Review Multi-File Diffs Carefully

Multi-file edits are powerful but risky. Always review the complete diff before accepting. Pay special attention to: (1) import statements -- are they correct and consistent? (2) function signatures -- do the caller and callee agree on parameter types? (3) error handling -- is every error raised in one file caught in the appropriate place? The most common AI-generated bugs hide at file boundaries.

Large Codebase Navigation

When your project grows beyond a few dozen files, effective navigation becomes critical. Windsurf's semantic index helps, but you need to know how to leverage it:

Use Chat for code archaeology. Before touching unfamiliar code, ask Chat to map it: "Trace the complete call chain from the /checkout endpoint to the payment provider API call. List every file and function involved." This gives you a mental model of the code before you start editing.

Use @ mentions to focus Cascade's attention. In a large codebase, Cascade's automatic context selection may not pick the most relevant files. Explicit @ mentions guarantee the right files are in context: "@src/services/payment.ts @src/providers/stripe.ts Fix the double-charge bug in the webhook handler."

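Use .windsurfrules to steer Cascade away from noise. If your project has generated files, vendored dependencies, or large data directories, list them in .windsurfrules with an explicit instruction not to read or edit them. Rules are natural-language instructions, not a hard filter, so treat them as strong guidance. To keep paths out of the index altogether, Windsurf also reads a .codeiumignore file that uses .gitignore syntax.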

Text -- .windsurfrules: Large Codebase Settings
## Codebase Navigation Rules

# Directories to ignore (generated/vendored):
- Never read or modify files in: node_modules/, dist/, build/, .next/, coverage/
- Never read or modify: *.min.js, *.bundle.js, package-lock.json
- Generated files in src/generated/ are auto-created by protobuf -- do not edit

# Key entry points (start here when exploring):
- API routes: src/api/routes/
- Business logic: src/services/
- Data access: src/repositories/
- Shared types: src/types/index.ts
- Configuration: src/config/

# Module ownership (who to ask about what):
- Payment: src/services/payment/ -- owned by payments team
- Auth: src/middleware/auth/ -- owned by platform team
- Search: src/services/search/ -- owned by discovery team

Performance Optimisation with Cascade

Cascade can be a powerful performance analysis partner, but you need to give it the right data. Here are the patterns:

Profile-Driven Optimisation: Run your profiler, paste the output into a Flow: "Here is the output of cProfile for the /search endpoint. The top 3 bottlenecks are [paste]. Optimise the code to reduce these bottlenecks. Do not change the public API. Run the test suite after each optimisation to verify correctness."
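If you have not captured profiler output before, the sketch below shows one way to produce text that can be pasted into a Flow, using only the standard library. `slow_search` is a hypothetical stand-in for the real endpoint code, and `profile_top` is a helper name invented for this example.

```python
import cProfile
import io
import pstats


def slow_search(n: int) -> int:
    """Hypothetical stand-in for the code behind a slow endpoint."""
    return sum(i * i for i in range(n))


def profile_top(func, *args, limit: int = 3) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(limit)
    return buffer.getvalue()
```

Calling `print(profile_top(slow_search, 1_000_000))` prints a short table that can go straight into the prompt in place of `[paste]`.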

Query Optimisation: Paste slow SQL queries (from your query logger or EXPLAIN output) into Chat: "This query takes 2.3 seconds on our products table (500k rows). Here is the EXPLAIN output. What indexes should we add?" Then use a Flow to implement the migration.

Algorithmic Improvements: Select a function and use Chat: "What is the time complexity of this function? Can it be improved?" Then use /refactor to implement the improvement.

Text -- Performance Optimisation Flow
# Profile-driven optimisation of a slow endpoint

The /api/products/search endpoint is taking 1.8s avg (target: 200ms).

Profiling output (top bottlenecks):
1. shopmate/services/search.py:search_products() -- 800ms (DB query, no index)
2. shopmate/services/search.py:enrich_results() -- 600ms (N+1 query for categories)
3. shopmate/api/routes/products.py:serialize_response() -- 300ms (redundant Pydantic validation)

Optimise each bottleneck:
1. Add a database index for the product search query (create Alembic migration)
2. Fix the N+1 by eager-loading categories in the initial query
3. Use Pydantic model_construct() for the serialization hot path

Constraints:
- Do not change the API response format
- Do not change the search ranking logic
- Run: pytest tests/test_search.py after each change
- After all changes, the endpoint should handle 100 concurrent requests in < 300ms
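To make bottleneck 2 concrete, here is a small sketch of the N+1 pattern and its single-query replacement. It uses raw SQL against an in-memory SQLite database instead of the ORM eager loading the prompt asks for, and the schema and sample rows are invented for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory catalogue (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (
            id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER
        );
        INSERT INTO categories VALUES (1, 'Tees'), (2, 'Hoodies');
        INSERT INTO products VALUES
            (1, 'Soft Tee', 1), (2, 'Zip Hoodie', 2), (3, 'Logo Tee', 1);
        """
    )
    return conn


def enrich_n_plus_one(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """One query for the products, then one extra query per product."""
    products = conn.execute(
        "SELECT name, category_id FROM products ORDER BY id"
    ).fetchall()
    result = []
    for name, category_id in products:
        (category,) = conn.execute(
            "SELECT name FROM categories WHERE id = ?", (category_id,)
        ).fetchone()
        result.append((name, category))
    return result


def enrich_joined(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Same rows from a single JOIN, so the query count stays at one."""
    return conn.execute(
        "SELECT p.name, c.name FROM products p "
        "JOIN categories c ON c.id = p.category_id ORDER BY p.id"
    ).fetchall()
```

Both functions return the same rows; the first issues one query per product on top of the initial one, which is what shows up as the 600ms entry in a profile like the one above.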

Iterative Refinement in Practice

One of the most effective advanced techniques is iterative refinement -- treating a Flow as a conversation where you gradually steer Cascade toward the ideal solution. This works because Cascade retains context within a Flow session.

The pattern:

  1. Start broad: "Implement a notification system for order status changes."
  2. Review the plan: Cascade shows its approach. You see it is using polling instead of a push-based transport.
  3. Refine: "Good structure, but use Server-Sent Events instead of polling. The client should receive real-time updates without refreshing."
  4. Review again: Cascade updates the plan. You notice it is not handling connection drops.
  5. Refine further: "Add automatic reconnection with exponential backoff in the client. Store missed events in Redis so they can be replayed on reconnect."
  6. Execute: When the plan matches your expectations, let Cascade implement it.

Each refinement narrows the solution space without requiring you to specify everything upfront. This is especially effective for complex features where you do not know all the requirements until you see the proposed approach.
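The reconnection refinement in step 5 rests on one small piece of logic, the backoff schedule. A minimal sketch follows, written in Python for consistency even though a browser client would implement it in JavaScript. The function name, the default values, and the choice of "full jitter" (a random delay between zero and the current ceiling) are assumptions.

```python
import random
from typing import Optional


def backoff_delays(
    attempts: int,
    base: float = 0.5,
    cap: float = 30.0,
    jitter: bool = True,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Return the wait (in seconds) before each reconnection attempt.

    The ceiling doubles on every attempt and is capped at `cap`. With
    jitter enabled, each delay is drawn uniformly from [0, ceiling] so
    that many clients do not reconnect at the same instant.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling) if jitter else ceiling)
    return delays
```

With jitter disabled, `backoff_delays(5, base=1, cap=10, jitter=False)` yields 1, 2, 4, 8 and then 10 seconds, because the fifth ceiling (16) hits the cap.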

Weak Flow Prompt

Refactor the codebase to use better patterns.

Strong Flow Prompt

Refactor src/services/user.ts to use the repository pattern. Extract all database queries into a UserRepository class. UserService should depend on a UserRepositoryInterface, not the concrete implementation. Do not change the public API of UserService. Add unit tests for both the repository and the service using mocks.
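The outcome the strong prompt asks for can be sketched as follows. The prompt targets a TypeScript file; this is a Python rendering of the same idea, and the `find_email` and `contact_email` methods are hypothetical examples, not part of any real UserService.

```python
from typing import Optional, Protocol
from unittest.mock import Mock


class UserRepositoryInterface(Protocol):
    """What the service needs from data access, and nothing more."""

    def find_email(self, user_id: int) -> Optional[str]:
        ...


class UserService:
    """Depends on the interface, so any conforming repository can be injected."""

    def __init__(self, repo: UserRepositoryInterface) -> None:
        self._repo = repo

    def contact_email(self, user_id: int) -> str:
        email = self._repo.find_email(user_id)
        if email is None:
            raise LookupError(f"no user with id {user_id}")
        return email.lower()


def test_contact_email_uses_repository() -> None:
    # The mock only exposes find_email, mirroring the interface.
    repo = Mock(spec=["find_email"])
    repo.find_email.return_value = "Ada@Example.com"
    service = UserService(repo)
    assert service.contact_email(7) == "ada@example.com"
    repo.find_email.assert_called_once_with(7)
```

Because the service never imports a concrete repository, the unit test needs no database, which is the practical payoff of the "depend on the interface" instruction.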

Power User Tips

These techniques are easy to overlook, but they can noticeably improve the Windsurf experience:

1. Use "Show your reasoning" for complex tasks. Add "Explain your reasoning before making changes" to your Flow prompt. This forces Cascade to think through the problem before writing code, which catches logical errors early. You can review the reasoning and course-correct before any files are modified.

2. Use "Do not proceed until I confirm" for high-risk changes. For changes that affect production systems, database schemas, or security-critical code, add this line to your prompt. Cascade should then present its plan and wait for your explicit approval at each step.

3. Use the "before and after" pattern for refactoring. Instead of saying "refactor this function," say "Here is the current behaviour [paste]. Here is the desired behaviour [describe]. Transform the code to produce the desired behaviour while keeping all existing tests passing." This gives Cascade a concrete target.

4. Chain small Flows instead of one giant Flow. Five focused Flows that each take 30 seconds produce better results than one sprawling Flow that takes 5 minutes. Each small Flow can be reviewed, accepted, or rejected independently. If Flow 3 of 5 goes wrong, you only lose Flow 3's changes, not the entire session.

5. Use Memories for recurring context. If you find yourself adding the same context to every Flow prompt ("All API responses must include a request_id field", "Use structlog, not print()"), add it as a Memory. Cascade will include it automatically in every future interaction.

The 80/20 Rule of AI-Assisted Development

As a rule of thumb, Cascade gets you about 80% of the way to the correct solution on the first attempt. The remaining 20% -- edge cases, nuanced business logic, performance characteristics -- requires human judgment and iterative refinement. Plan for this: budget time for review and iteration, not just prompt writing. The developers who get the best results treat Cascade output as a strong first draft, not a finished product.

Working with Generated Code

A common challenge: Cascade generates code that works but does not match your preferred style or uses patterns you would not choose. Rather than rewriting manually, use these techniques:

  • Pre-empt style issues in .windsurfrules: Add specific style rules for the patterns Cascade keeps getting wrong. "Use list comprehensions instead of filter()/map(). Use f-strings instead of .format()."
  • Post-generation /refactor: Accept the working code, then select it and use /refactor with a specific instruction: "/refactor Replace the nested for loops with itertools.product."
  • Review diff, not final state: Always review the diff, not the final file. The diff shows you exactly what changed, making it easier to spot issues than reading the entire file.
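The two style instructions quoted in the list above correspond to small, behaviour-preserving rewrites. The sketch below shows a before and after for each; the function names and the sample data shapes are made up for illustration.

```python
from itertools import product


def variants_nested(sizes: list[str], colours: list[str]) -> list[tuple[str, str]]:
    """Before: nested for loops build every size/colour pair."""
    result = []
    for size in sizes:
        for colour in colours:
            result.append((size, colour))
    return result


def variants_product(sizes: list[str], colours: list[str]) -> list[tuple[str, str]]:
    """After: itertools.product yields the same pairs in the same order."""
    return list(product(sizes, colours))


def in_stock_names_functional(items: list[dict]) -> list[str]:
    """Before: filter()/map() with lambdas."""
    return list(map(lambda i: i["name"], filter(lambda i: i["stock"] > 0, items)))


def in_stock_names_comprehension(items: list[dict]) -> list[str]:
    """After: a single list comprehension."""
    return [i["name"] for i in items if i["stock"] > 0]
```

Since each pair of functions returns identical results, an existing test suite should pass unchanged after the refactor, which makes the diff quick to review.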

ShopMate -- TDD and Refactoring Flows

Text -- TDD Flow: Review Sentiment Feature
# Test-driven: write the tests first, then the implementation

Add a review sentiment classifier to ShopMate. Write failing tests first.

Feature: classify_review_sentiment(review_text: str) -> dict
Returns: {"sentiment": "positive|negative|neutral", "score": 1-5, "key_issue": str|None}

Tests to write FIRST in tests/test_sentiment.py:
1. "Softest tee I have ever worn, perfect fit" -> sentiment=positive, score>=4
2. "Runs very small, had to return it" -> sentiment=negative, key_issue contains "sizing"
3. "It arrived fine, seems ok" -> sentiment=neutral
4. Empty string input -> raises ValueError
5. Response must be valid JSON (Claude sometimes adds preamble)

Confirm tests FAIL before implementing.

Then implement in shopmate/reviews/sentiment.py:
- Use claude-haiku (cheap for classification)
- System prompt must force JSON output -- handle parse errors
- Call via logged_create(brand_id="internal", feature="review_sentiment")
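Test 5 and the "handle parse errors" requirement can be illustrated without calling any model. The sketch below assumes the raw model reply is already available as a string; `parse_sentiment_response` is a hypothetical helper name, and the validation rules simply mirror the return shape described in the prompt.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def parse_sentiment_response(raw: str) -> dict:
    """Extract and validate the JSON object from a model reply.

    Any text before the first '{' or after the last '}' (for example a
    conversational preamble) is discarded before parsing.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model response")
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"model response is not valid JSON: {exc}") from exc
    if data.get("sentiment") not in ALLOWED_SENTIMENTS:
        raise ValueError("sentiment must be positive, negative or neutral")
    score = data.get("score")
    if not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError("score must be an integer from 1 to 5")
    return data
```

A reply such as `Here is the result: {"sentiment": "positive", "score": 5, "key_issue": null}` parses cleanly, while a reply with no JSON object raises `ValueError`, which the calling code can catch and retry or log.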

Text -- Architecture Chat: Email Delivery
# Chat mode before implementing a significant new piece of infrastructure

Given our current ShopMate architecture in @shopmate/api/main.py
and the fact that we send ~500 emails per week:

Should we:
Option A: Generate and send emails synchronously in the FastAPI request
          (simple, but slow -- email sending can take 2-3 seconds)

Option B: Use FastAPI BackgroundTasks to generate and send asynchronously
          (faster response, but harder to handle failures)

Option C: Write emails to a queue (Redis list) and have a separate worker
          process them (most resilient, but more infrastructure)

Our current setup: a single FastAPI process on a single VPS, no Redis yet.
Expected volume: max 50 emails per hour during flash sales.
Which would you recommend and why?

Advanced Prompt Engineering for Flows

Beyond the basic prompt structure, advanced users develop prompt patterns that consistently produce better output. Here are the most effective techniques:

The Negative Constraint Pattern. Tell Cascade what NOT to do. This is often more effective than telling it what to do, because it eliminates the most common failure modes: "Do not use any deprecated APIs. Do not add dependencies that are not already in requirements.txt. Do not change the database schema."

The Verification Step Pattern. End your prompt with explicit verification instructions: "After implementation, run: mypy src/ --strict, then pytest tests/ -v, then flake8 src/. Fix any issues found. All three must pass cleanly." This creates a multi-tool verification loop that catches a wider range of issues.

The Rollback Safety Pattern. For risky changes, ask Cascade to make the change reversible: "Implement the new caching layer behind a feature flag. When ENABLE_CACHE=false (the default), the code path should be identical to the current behaviour. Add a toggle endpoint: POST /admin/cache/toggle." This lets you deploy the change with the flag off and enable it gradually.
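A minimal sketch of the flag check behind such a prompt is shown below. Only the `ENABLE_CACHE` name comes from the prompt; the `load_product` function, the in-process dictionary cache, and reading the flag from an environment variable are assumptions, and the toggle endpoint is omitted.

```python
import os
from typing import Callable

# Hypothetical in-process cache; a real system might use Redis or similar.
_cache: dict[str, str] = {}


def cache_enabled() -> bool:
    """The flag defaults to off, so the old code path is the default."""
    return os.environ.get("ENABLE_CACHE", "false").strip().lower() == "true"


def load_product(product_id: str, fetch: Callable[[str], str]) -> str:
    """Return product data, consulting the cache only when the flag is on."""
    if not cache_enabled():
        # Flag off: behave exactly as before the caching layer existed.
        return fetch(product_id)
    if product_id not in _cache:
        _cache[product_id] = fetch(product_id)
    return _cache[product_id]
```

With the flag unset, every call goes to `fetch`, so deploying this code changes nothing until `ENABLE_CACHE=true` is set.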

Prompt Version Control

For complex, reusable prompts, save them as text files in your project (e.g., docs/flows/add-feature.txt). This lets you iterate on prompts over time, share them with the team, and track what worked. A well-tuned prompt that you use 20 times is worth the initial investment in crafting it.

Hands-On Exercises

Exercise 1: TDD with Cascade

Choose a utility function you need to write (a validator, a formatter, a calculator). Write a Flow that asks Cascade to create 5 specific failing tests first, confirm they fail, then implement the function to pass them. After all tests are green, start a second Flow to refactor the implementation without breaking the tests. Did the TDD approach produce better code than you would have gotten by asking Cascade to implement directly?

Exercise 2: Iterative Refinement

Start a Flow for a moderately complex feature. Let Cascade present its plan, then refine it at least 3 times before letting it execute. Track each refinement: what did you change and why? After the final implementation, compare the result to what the original plan would have produced. How much did the refinement improve the outcome?

Exercise 3: Multi-File Consistency

Create a new feature that requires at least 4 files (model, repository, service, tests). In your Flow prompt, explicitly specify the type boundaries between each file (what types are passed between layers). After Cascade generates the code, check: do the import statements work? Do the types match at every boundary? Are the method signatures consistent between caller and callee? Document any inconsistencies you find.

Exercise 4: Profile and Optimise

Find a slow endpoint or function in your project. Run a profiler or timer (cProfile for Python, console.time for JS/TS) and capture the output. Paste the profiling data into a Flow and ask Cascade to optimise the top 3 bottlenecks. Verify with tests after each optimisation. Measure the before and after performance under the same load. How much did Cascade's optimisations improve latency or throughput?

Exercise 5: Incremental Refactoring

Find the longest or most complex file in your project. Break the refactoring into 3-5 incremental Flows, each tackling one specific improvement (extract a class, split a function, introduce a pattern). Run the full test suite after each Flow. Compare this approach to trying to refactor the entire file in one Flow. Which produced fewer bugs? Which was easier to review?