Chapter 12

Field Guide for Leaders and Builders

Actionable scorecards, pilot checklists, and concrete next steps for evaluators, engineers, and builders.

We've covered theory, metrics, anti-patterns, and architecture. Now let's make it actionable.


For Engineering Leaders

When evaluating an AI coding tool, use this scorecard:

Criterion | Weight | Score (1-5) | Notes
Lives in the editor/IDE | 5x | ___ | Can developers use it without alt-tabbing?
Auto-gathers context | 5x | ___ | What context does it infer vs. ask for?
Shows diffs, not raw code | 3x | ___ | How easy is it to apply suggestions?
Measures flow metrics | 3x | ___ | Do they track TTCAA, reorientation time?
Privacy & security | 4x | ___ | What's their data policy?
Integration quality | 3x | ___ | How well does it fit our stack?

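Weighted score = Σ (criterion score × weight)

The weights sum to 23, so the raw total ranges from 23 to 115. Divide it by 115 and multiply by 100 to put it on the 100-point scale used by the bands below:

  • 80-100: Strong candidate, pilot it
  • 60-79: Decent, but significant tradeoffs
  • <60: Probably not worth the context tax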
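
The scorecard arithmetic can be sketched in a few lines. This is a minimal illustration, assuming the weights from the table above; the dictionary keys, the function name, and the choice to report a percentage of the maximum alongside the raw sum are assumptions made for this example, not part of the scorecard itself.

```python
from __future__ import annotations

# Weights copied from the scorecard table; the keys are shorthand labels
# invented for this sketch.
WEIGHTS = {
    "in_editor": 5,
    "auto_context": 5,
    "shows_diffs": 3,
    "flow_metrics": 3,
    "privacy_security": 4,
    "integration": 3,
}


def weighted_score(scores: dict[str, int]) -> tuple[int, float]:
    """Return (raw weighted sum, percentage of the maximum possible sum)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the scorecard criteria")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {value}")
    raw = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    max_raw = 5 * sum(WEIGHTS.values())  # 5 * 23 = 115
    return raw, round(100 * raw / max_raw, 1)


# Hypothetical scores for an imaginary tool.
example = {
    "in_editor": 5,
    "auto_context": 4,
    "shows_diffs": 4,
    "flow_metrics": 2,
    "privacy_security": 4,
    "integration": 3,
}
raw, pct = weighted_score(example)
print(raw, pct)  # 88 76.5
```

Reporting both numbers keeps the raw sum visible while making it easy to compare against 100-point bands.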

Pilot Checklist:

  • Select 8-10 developers (mix of experience levels)
  • Measure baseline for 1 week (time-to-commit, context switches, flow quality)
  • Run tool for 2 weeks with same developers
  • Track: TTCAA, reorientation time, developer sentiment
  • Weekly check-ins: "What's working? What's frustrating?"
  • After 2 weeks: Compare metrics to baseline
  • Decision: Roll out, extend pilot, or abandon
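
For the "compare metrics to baseline" step, one simple option is the percent change in the median of each metric. The sketch below assumes you have one number per developer (or per day) for each period; the function name and the sample figures are made up for illustration, and the median is just one reasonable choice for small, noisy samples.

```python
from __future__ import annotations

from statistics import median


def percent_change(baseline: list[float], pilot: list[float]) -> float:
    """Percent change in the median from baseline to pilot (negative = lower)."""
    base = median(baseline)
    if base == 0:
        raise ValueError("baseline median is zero; percent change is undefined")
    return round(100 * (median(pilot) - base) / base, 1)


# Hypothetical daily context-switch counts, invented for this example.
baseline_switches = [42, 38, 51, 45, 40]
pilot_switches = [30, 35, 33, 41, 28]
print(percent_change(baseline_switches, pilot_switches))  # -21.4
```

With 8-10 developers the sample is small, so treat a result like this as a signal to discuss in the weekly check-ins rather than as proof.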

Getting buy-in:

Present to leadership with this structure:

  1. The problem: "Developers spend 30-40% of their time on mechanical tasks: searching docs, writing boilerplate, debugging syntax."

  2. The promise: "AI can reduce that time significantly, but only if it respects developer flow."

  3. The risk: "Most AI tools increase context switching, which costs productivity instead of improving it."

  4. Our approach: "We'll pilot tools that minimize context friction, measure flow metrics, and only roll out if we see clear improvement."

  5. The ask: "A 2-week pilot (after a 1-week baseline) with 8-10 developers, budget to license 1-2 tools, and a commitment to measure honestly."


For Senior/Staff Engineers

How to protect your own flow:

1. Be ruthless about tool choice

If a tool makes you alt-tab constantly, stop using it. Your attention is your most valuable resource.

2. Notice when you're explaining vs. building

If you spend more time writing prompts than writing code, something's wrong.

3. Measure your own context tax

Use the calculator in Chapter 4. Track one debugging session per week. See the pattern over time.

4. Advocate for better tools

When your team evaluates AI tools, bring the Flow Break Equation. Insist on pilots with real metrics.

5. Build your own workflows

If no commercial tool fits, consider lightweight custom tools:

  • Shell scripts that gather logs + code and pipe to an API
  • Editor macros that invoke AI with rich context
  • Git hooks that offer AI-assisted commit messages
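
As a rough idea of what the first kind of script might look like, here is a minimal sketch that bundles a question, a code snippet, and the tail of a log into one structured prompt. The function names and section headings are hypothetical, and the actual API call is deliberately left out because it depends on the provider you use.

```python
def tail(text: str, max_lines: int) -> str:
    """Keep only the last max_lines lines of text."""
    if max_lines <= 0:
        return ""
    return "\n".join(text.splitlines()[-max_lines:])


def build_prompt(question: str, code: str, log: str, max_log_lines: int = 20) -> str:
    """Assemble one prompt string from the pieces a developer would otherwise paste by hand."""
    return (
        f"## Question\n{question}\n\n"
        f"## Code\n{code}\n\n"
        f"## Recent log output (last {max_log_lines} lines at most)\n"
        f"{tail(log, max_log_lines)}\n"
    )


# Hypothetical inputs; in a real script these would come from files or stdin.
sample_log = "\n".join(f"line {i}" for i in range(1, 6))
prompt = build_prompt("Why does this fail?", "def f(): pass", sample_log, max_log_lines=2)
print(prompt)
```

The resulting string can then be sent to whichever API your team has approved.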

For Tool Builders

Your development checklist:

Phase 1: Foundation

  • Choose target IDE (VS Code, JetBrains, etc.)
  • Build basic plugin: keyboard shortcut → AI call → show response
  • Implement context gathering: current file, selection, imports
  • Implement diff-based responses (not raw code blocks)
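
The diff-based response item can be prototyped with Python's standard `difflib` module before any editor integration exists. This sketch assumes the model returns the full suggested text of a file; the function name and the `a/` and `b/` path prefixes are illustrative choices.

```python
import difflib


def suggestion_as_diff(original: str, suggested: str, path: str) -> str:
    """Render a suggested rewrite as a unified diff against the original text."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        suggested.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical before/after contents for a small file.
before = "def add(a, b):\n    return a - b\n"
after = "def add(a, b):\n    return a + b\n"
print(suggestion_as_diff(before, after, "calc.py"))
```

An empty result means the suggestion changes nothing, which is worth detecting before showing anything to the user.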

Phase 2: Context Expansion

  • Add runtime context: test results, logs, errors
  • Add project context: git diff, config files, dependencies
  • Implement context synthesis: structured prompts, relevance ranking
  • Add context visibility: show users what context is being used
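
One way to approach relevance ranking and context visibility together is to score each piece of context, keep the highest-scoring pieces that fit a size budget, and expose the labels of what was kept. The sketch below is only an illustration: the `ContextItem` type, the relevance scores, and the character budget are all hypothetical, and how relevance is computed is left open.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ContextItem:
    label: str        # shown to the user, e.g. "failing test"
    text: str         # the content that would go into the prompt
    relevance: float  # hypothetical score; higher means more relevant


def select_context(items: list[ContextItem], budget_chars: int) -> list[ContextItem]:
    """Greedily keep the most relevant items that still fit within the budget."""
    chosen: list[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.relevance, reverse=True):
        if used + len(item.text) <= budget_chars:
            chosen.append(item)
            used += len(item.text)
    return chosen


def render_prompt(chosen: list[ContextItem]) -> str:
    """Join the chosen items into labeled sections."""
    return "\n\n".join(f"[{item.label}]\n{item.text}" for item in chosen)


# Hypothetical context items with made-up scores.
items = [
    ContextItem("current file", "x" * 50, 0.7),
    ContextItem("failing test", "y" * 10, 0.9),
    ContextItem("dependencies", "z" * 30, 0.2),
]
chosen = select_context(items, budget_chars=45)
print([item.label for item in chosen])  # ['failing test', 'dependencies']
```

Showing the list of labels in the UI is a cheap way to let users see, and correct, what the tool is sending.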

Phase 3: Polish

  • Optimize context caching (AST, file structure)
  • Add inline diff preview with syntax highlighting
  • One-click apply + undo
  • Handle multi-file changes
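
For the caching item, a common pattern is to key derived data on a hash of the file's contents so that unchanged files are never re-parsed. The sketch below uses Python's own `ast` module as a stand-in for whatever parser your target language needs; the `OutlineCache` class and its `misses` counter are hypothetical names for this example.

```python
from __future__ import annotations

import ast
import hashlib


class OutlineCache:
    """Cache a file outline (function and class names) keyed by content hash."""

    def __init__(self) -> None:
        self._cache: dict[str, list[str]] = {}
        self.misses = 0  # number of times the source actually had to be parsed

    def outline(self, source: str) -> list[str]:
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            tree = ast.parse(source)
            self._cache[key] = [
                node.name
                for node in ast.walk(tree)
                if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
            ]
        return self._cache[key]


cache = OutlineCache()
source = "class A:\n    def m(self):\n        pass\n\ndef f():\n    pass\n"
first = cache.outline(source)
second = cache.outline(source)  # served from the cache, no re-parse
print(sorted(first), cache.misses)
```

Hashing contents rather than relying on modification times keeps the cache correct across branch switches and reverts.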

Phase 4: Metrics

  • Instrument telemetry (with privacy safeguards)
  • Track: TTCAA, reorientation time, context provision ratio
  • Build internal dashboard for metrics
  • A/B test different context gathering strategies
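
For A/B testing context strategies, each user should land in the same variant every time without the tool storing an assignment table. A minimal sketch, assuming a string user identifier and hypothetical experiment and variant names, is to hash the two together and take the result modulo the number of variants.

```python
import hashlib
from typing import Sequence


def assign_variant(user_id: str, experiment: str, variants: Sequence[str]) -> str:
    """Deterministically map a user to one variant of an experiment."""
    if not variants:
        raise ValueError("at least one variant is required")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# Hypothetical experiment comparing two context gathering strategies.
variants = ("file_only", "file_plus_tests")
print(assign_variant("dev-123", "context-strategy-v1", variants))
```

Including the experiment name in the hash means a user's bucket in one experiment does not determine their bucket in the next.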

Phase 5: Iterate

  • User research: watch developers use the tool
  • Count context switches during sessions
  • Identify and fix top friction points
  • Repeat
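
If a session recording gives you an ordered list of which application had focus, counting context switches can be automated. The sketch below assumes one particular definition, namely that a context switch is any move from the editor to another application; the function name and the app labels are hypothetical.

```python
from typing import Iterable


def count_context_switches(focus_events: Iterable[str], editor: str = "editor") -> int:
    """Count how many times focus moved from the editor to any other app."""
    switches = 0
    previous = None
    for app in focus_events:
        if previous == editor and app != editor:
            switches += 1
        previous = app
    return switches


# Hypothetical focus log from one observed session.
session = ["editor", "browser", "editor", "editor", "chat", "browser", "editor"]
print(count_context_switches(session))  # 2
```

Moves between two non-editor apps are not counted here; adjust the rule if your team defines a switch differently.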

Anti-checklist (don't do these):

  • ❌ Build a beautiful chat UI first
  • ❌ Focus on model selection before context architecture
  • ❌ Require developers to explain context you can infer
  • ❌ Return raw code blocks without diffs
  • ❌ Measure only acceptance rate and ignore flow metrics
  • ❌ Launch without a pilot and telemetry

Recommended Reading & Tools

On Developer Flow:

  • Flow by Mihaly Csikszentmihalyi (the original research on flow states)
  • Deep Work by Cal Newport (on protecting focus)
  • The Pragmatic Programmer by Hunt & Thomas (timeless advice on dev productivity)

On Context and Complexity:

  • A Philosophy of Software Design by John Ousterhout (on managing complexity)
  • Working Effectively with Legacy Code by Michael Feathers (context is everything in legacy systems)

Tools for measuring flow:

  • RescueTime - tracks app usage, can show context switches
  • Toggl Track - manual time tracking with good analytics
  • WakaTime - IDE plugin that tracks coding time by file/project

AI Coding Tools (with in-situ features):

  • GitHub Copilot - inline suggestions, context-aware
  • Cursor - AI-first editor with good context gathering
  • Codeium - free, inline, multi-IDE support
  • Tabnine - team context learning

(Note: This isn't an endorsement. Evaluate each with the scorecard above.)