Chapter 5

The Architecture of Context

Four components of context-aware AI: Collectors, Synthesizer, Model, and Interaction Layer.

Now that we understand the problem and can measure it, let's talk about solutions.

But first, a critical insight: This isn't just a UI problem.

You can't fix context tax by making chat prettier. You need a different architecture, one that gathers context programmatically instead of asking humans to type it in.

Four Components of Context-Aware AI

An in-situ tool (like Miguel's) has four main pieces:

[Context Collectors] → [Context Synthesizer] → [Model] → [Interaction Layer]

Let's break down each part.


1. Context Collectors

These components watch your environment and gather relevant signals automatically:

File Context

  • The current file
  • Related files (imports, importers)
  • Recently edited files
  • Files mentioned in recent commits

Code Structure Context

  • Abstract syntax tree for the selected region
  • Function signatures in scope
  • Type definitions
  • Documentation comments

Execution Context

  • Last test run results
  • Recent error messages and stack traces
  • Log snippets for the last N minutes
  • Debugger state (if attached)

Project Context

  • Git diff for current branch
  • Recent commit messages
  • CI/CD status
  • Configuration files (package.json, requirements.txt, etc.)

Environment Context

  • Language version
  • Framework versions
  • Database type/version
  • Environment variables

The key is selectivity. You don't dump the entire repository into every prompt. You gather a focused slice that matches the question.
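To make selectivity concrete, here is a minimal sketch of a collector registry. Everything in it is hypothetical: the Signal type, the TRIGGERS word lists, and the rule that the file collector always runs are assumptions for illustration, not a description of any particular tool. A real implementation would likely use richer relevance signals than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Signal:
    kind: str     # which collector family produced it, e.g. "execution"
    label: str    # short human-readable name, e.g. "last test run"
    content: str  # the raw text gathered from the environment


Collector = Callable[[], list[Signal]]

# Hypothetical trigger words per collector family (an assumption, not a standard).
TRIGGERS: dict[str, set[str]] = {
    "execution": {"timeout", "error", "fail", "fails", "crash", "slow"},
    "project": {"ci", "commit", "branch", "merge", "deploy"},
    "environment": {"version", "upgrade", "env", "config"},
}


def collect(question: str, collectors: dict[str, Collector]) -> list[Signal]:
    """Always run the file collector; run the others only when the question hints at them."""
    words = set(question.lower().split())
    signals: list[Signal] = []
    for kind, collector in collectors.items():
        if kind == "file" or words & TRIGGERS.get(kind, set()):
            signals.extend(collector())
    return signals
```

With this sketch, a question like "why timeout" pulls in file and execution signals but leaves the project collectors alone.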


2. Context Synthesizer

Raw signals aren't yet a prompt. The synthesizer structures them so the model can use them effectively.

Instead of blindly concatenating, it:

Labels sections semantically

Problem: [What the developer asked]
Current Code: [The function they're in]
Recent Error: [Last stack trace]
Related Code: [Imported modules]
Environment: [Versions, config]
Constraints: [From comments or docs]
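One way to produce labeled sections like these is a small renderer that emits them in a fixed order and skips any that are empty. This is only a sketch: the function name render_prompt and the two-space indentation are assumptions, and the section labels are copied from the template above.

```python
# Section labels in the order they should appear in the prompt.
SECTION_ORDER = [
    "Problem",
    "Current Code",
    "Recent Error",
    "Related Code",
    "Environment",
    "Constraints",
]


def render_prompt(sections: dict[str, str]) -> str:
    """Render non-empty sections as 'Label:' followed by an indented body."""
    parts = []
    for label in SECTION_ORDER:
        body = sections.get(label, "").strip()
        if not body:
            continue  # leave out sections that have nothing to say
        indented = "\n".join("  " + line for line in body.splitlines())
        parts.append(f"{label}:\n{indented}")
    return "\n\n".join(parts)
```

A fixed order means the model sees the question first, regardless of which collector happened to finish first.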

Filters and prioritizes

  • Removes irrelevant log lines
  • Ranks files by relevance
  • Truncates long outputs intelligently
  • Highlights changed lines in diffs
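Two of these steps, dropping irrelevant log lines and truncating long output, can be sketched in a few lines. The keyword pattern and the keep-the-head-and-tail strategy are assumptions chosen for illustration; "intelligent" truncation in a real tool could be considerably smarter.

```python
import re

# Hypothetical notion of "relevant": lines that mention a failure of some kind.
RELEVANT = re.compile(r"error|timeout|exception|exhausted|failed", re.IGNORECASE)


def filter_log_lines(lines: list[str]) -> list[str]:
    """Keep only log lines that match the relevance pattern."""
    return [line for line in lines if RELEVANT.search(line)]


def truncate_middle(lines: list[str], max_lines: int) -> list[str]:
    """Keep the start and end of long output and mark how much was omitted."""
    if len(lines) <= max_lines:
        return list(lines)
    if max_lines < 3:
        return list(lines[:max_lines])  # no room for a marker plus head and tail
    head = (max_lines - 1) // 2
    tail = max_lines - 1 - head
    omitted = len(lines) - head - tail
    return lines[:head] + [f"... [{omitted} lines omitted] ..."] + lines[-tail:]
```

Keeping both ends is a common compromise for stack traces and test output, where the first lines name the failure and the last lines show where it surfaced.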

Normalizes and annotates

  • Adds file paths to code snippets
  • Includes line numbers
  • Notes when code is from a generated file
  • Flags deprecated patterns
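The first three annotations are mostly string formatting. Here is a rough sketch; the annotate_snippet name, the "path:start-end" header format, and the marker strings used to guess that a file is generated are all assumptions.

```python
def annotate_snippet(
    path: str,
    source: str,
    start: int,
    end: int,
    generated_markers: tuple[str, ...] = ("@generated", "DO NOT EDIT"),
) -> str:
    """Return lines start..end (1-indexed, inclusive) with a path header and line numbers."""
    lines = source.splitlines()
    selected = lines[start - 1:end]
    header = f"{path}:{start}-{end}"
    # Heuristic: treat the file as generated if it contains a known marker string.
    if any(marker in source for marker in generated_markers):
        header += " [generated file]"
    width = len(str(end))  # right-align line numbers to the widest one
    body = [f"{n:>{width}} | {text}" for n, text in enumerate(selected, start=start)]
    return "\n".join([header, *body])
```

Line numbers in the snippet let the model refer to "line 45" in its answer, which the interaction layer can then map back to the editor.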

Example transformation:

Developer types: "why timeout"

Context Synthesizer builds:

Problem: Developer asks "why timeout"

Current File: tests/integration/user_service_test.py:45
Code:
  def test_create_user_with_profile():
      user = create_user(email="test@example.com")
      assert user.profile is not None  # ← This line times out in CI

Recent Error (CI):
  Timeout after 30s waiting for database query
  Connection pool exhausted: 10/10 in use

Environment Diff:
  Local: PostgreSQL 15.2, pool_size=20
  CI: PostgreSQL 14.2, pool_size=10 (.github/workflows/test.yml:28)

Recent Changes (git diff):
  + Added user.profile relationship in models.py:89
  + Uses jsonb_exists_any() function (requires PG 14.4+)

The developer wrote two words. The synthesizer provided the full diagnostic context.

This is where most of the value lives. A mediocre model with great context synthesis often beats a great model with poor context.
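The Environment Diff section in the example is a good illustration of cheap synthesis. Assuming the collectors have already reduced each environment to a flat dictionary of settings (a hypothetical shape), the diff itself could be as simple as:

```python
def env_diff(local: dict[str, str], ci: dict[str, str]) -> list[str]:
    """List only the settings whose values differ between the two environments."""
    lines = []
    for key in sorted(local.keys() | ci.keys()):
        local_value = local.get(key, "<unset>")
        ci_value = ci.get(key, "<unset>")
        if local_value != ci_value:
            lines.append(f"{key}: local={local_value}, ci={ci_value}")
    return lines
```

Settings that match are left out on purpose: the model only needs to see what is different.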


3. Model Integration

This is the part everyone focuses on, so I'll keep it brief.

The model receives structured context and generates a response. Ideally:

  • It's fast enough to feel interactive (<5s)
  • It's good enough to understand technical nuance
  • It has a large enough context window to use all the gathered signals

But model quality matters less than you'd think if the architecture is right. A slightly dumber model with perfect context will often beat a genius model that has to guess.
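The context window point has a practical consequence: when the gathered signals don't all fit, something has to decide what gets dropped. Below is a minimal sketch of one approach, assuming sections arrive already sorted by priority. The four-characters-per-token estimate is a rough assumption; a real tool would use the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def fit_to_budget(
    sections: list[tuple[str, str]], max_tokens: int
) -> list[tuple[str, str]]:
    """Keep (label, body) sections in priority order, skipping any that would not fit."""
    kept = []
    used = 0
    for label, body in sections:
        cost = estimate_tokens(body)
        if used + cost > max_tokens:
            continue  # too big for the remaining budget; try the next, smaller section
        kept.append((label, body))
        used += cost
    return kept
```

Skipping rather than stopping means one oversized log doesn't crowd out every lower-priority section behind it.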


4. Interaction Layer

This is what the developer sees and touches.

Invocation

  • Keybinding or command that works inline
  • No need to leave the current file
  • Can be as simple as selecting code + pressing a hotkey

Display

  • Small panel or overlay, not a full separate window
  • Shows the question and response
  • Feels transient, not permanent
  • Secondary to the code, not competing with it

Application

  • Responses appear as diffs, not as plain text to copy
  • One-click apply (with ability to edit first)
  • Clear indication of what will change
  • Easy undo if the suggestion is wrong
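The apply-and-undo loop can be sketched with Python's standard difflib module. The Buffer class here is a hypothetical stand-in for an editor buffer; a real editor plugin would use the editor's own edit and undo APIs instead.

```python
import difflib


def make_diff(path: str, original: str, proposed: str) -> str:
    """Render a proposed change as a unified diff, so the developer sees what will change."""
    return "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            proposed.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


class Buffer:
    """Hypothetical editor buffer with one-step apply and undo."""

    def __init__(self, text: str) -> None:
        self.text = text
        self._undo_stack: list[str] = []

    def apply(self, proposed: str) -> None:
        self._undo_stack.append(self.text)  # remember the old text before replacing it
        self.text = proposed

    def undo(self) -> None:
        if self._undo_stack:
            self.text = self._undo_stack.pop()
```

Showing the diff before calling apply covers "clear indication of what will change", and the undo stack covers the case where the suggestion turns out to be wrong.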

History

  • Minimal: maybe last 2-3 interactions for this file
  • Ephemeral: disappears when you switch files
  • Not trying to be a chat app
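History this small needs very little machinery. As a sketch (the EphemeralHistory class and its method names are hypothetical), a bounded deque that clears itself on file switch is enough:

```python
from collections import deque
from typing import Optional


class EphemeralHistory:
    """Keeps the last few interactions for the active file and forgets them on switch."""

    def __init__(self, max_items: int = 3) -> None:
        self._active_file: Optional[str] = None
        self._items: deque[tuple[str, str]] = deque(maxlen=max_items)

    def switch_to(self, path: str) -> None:
        if path != self._active_file:
            self._active_file = path
            self._items.clear()  # history is tied to the file, not the session

    def record(self, path: str, question: str, response: str) -> None:
        self.switch_to(path)
        self._items.append((question, response))  # oldest entry falls off automatically

    def recent(self) -> list[tuple[str, str]]:
        return list(self._items)
```

Because the deque has a maximum length, there is no cleanup logic to write: old interactions simply fall off the end.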

The goal: stay in the editor, stay in the code, minimize mode switching.


Why This Architecture Matters

Compare the architectures:

Chat-First Tools (Sarah's workflow):

Developer brain → manual text entry → model → text output → manual application

Human does: serialization, context gathering, integration.

Context-First Tools (Miguel's workflow):

Developer brain → simple query → [auto context gather] → model → structured diff → one-click apply

Computer does: serialization, context gathering, integration.
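Put together, the context-first flow is a straight pipeline through the four components. The sketch below wires them up as plain callables; the function and parameter names are hypothetical, and each stage is passed in so it can be replaced with a stub or a real implementation.

```python
from typing import Any, Callable


def run_pipeline(
    question: str,
    collect: Callable[[str], list[str]],
    synthesize: Callable[[str, list[str]], str],
    model: Callable[[str], str],
    present: Callable[[str], Any],
) -> Any:
    """Collectors -> Synthesizer -> Model -> Interaction Layer, for one question."""
    signals = collect(question)             # gather context automatically
    prompt = synthesize(question, signals)  # structure it for the model
    response = model(prompt)                # generate a suggestion
    return present(response)                # hand it to the editor as a diff
```

The developer's only input is the question string. Everything else is the tool's job.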

We've inverted the responsibility. The tool serves the human, not the other way around.