Now that we understand the problem and can measure it, let's talk about solutions.
But first, a critical insight: This isn't just a UI problem.
You can't fix context tax by making chat prettier. You need a different architecture, one that gathers context programmatically instead of asking humans to type it in.
Four Components of Context-Aware AI
An in-situ tool (like Miguel's) has four main pieces:
[Context Collectors] → [Context Synthesizer] → [Model] → [Interaction Layer]
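To make the flow concrete, here is a minimal sketch of the four stages as plain functions chained together. Everything in it is hypothetical: `Signal`, `synthesize`, `run_pipeline`, and the stub collector and model are names I made up for illustration, and a real tool would call an actual model and render the result in the editor.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Signal:
    """One piece of gathered context, e.g. a stack trace or a file excerpt."""
    label: str
    text: str


# Hypothetical type aliases for two of the stages.
Collector = Callable[[], list[Signal]]
Model = Callable[[str], str]


def synthesize(question: str, signals: list[Signal]) -> str:
    """Turn raw signals into one labelled prompt (deliberately simplistic)."""
    parts = [f"Problem: {question}"]
    parts += [f"{s.label}:\n{s.text}" for s in signals]
    return "\n\n".join(parts)


def run_pipeline(question: str, collectors: list[Collector], model: Model) -> str:
    """Collectors -> synthesizer -> model; the caller renders the result."""
    signals = [s for collect in collectors for s in collect()]
    prompt = synthesize(question, signals)
    return model(prompt)


# Stand-ins so the sketch runs without an editor or a real model.
def last_error() -> list[Signal]:
    return [Signal("Recent Error", "Timeout after 30s waiting for database query")]


def echo_model(prompt: str) -> str:
    return f"(model saw {len(prompt.splitlines())} lines of context)"


print(run_pipeline("why timeout", [last_error], echo_model))
```

The point of the shape is that the developer only supplies `question`; everything else arrives through the collectors.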
Let's break down each part.
1. Context Collectors
These components watch your environment and gather relevant signals automatically:
File Context
- The current file
- Related files (imports, importers)
- Recently edited files
- Files mentioned in recent commits
Code Structure Context
- Abstract syntax tree for the selected region
- Function signatures in scope
- Type definitions
- Documentation comments
Execution Context
- Last test run results
- Recent error messages and stack traces
- Log snippets from the last N minutes
- Debugger state (if attached)
Project Context
- Git diff for current branch
- Recent commit messages
- CI/CD status
- Configuration files (package.json, requirements.txt, etc.)
Environment Context
- Language version
- Framework versions
- Database type/version
- Environment variables
The key is selectivity. You don't dump the entire repository into every prompt. You gather a focused slice that matches the question.
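As an illustration, two of these collectors can be sketched with Python's standard `ast` module, plus a crude router that decides which collectors to run for a given question. This is a sketch under assumptions, not how any particular tool works: the function names, the signal names, and the keyword lists in `wanted_signals` are all invented here.

```python
import ast


def collect_imports(source: str) -> list[str]:
    """Return the sorted, de-duplicated module names a Python file imports."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)


def collect_signatures(source: str) -> list[str]:
    """Return 'name(arg, ...)' for each function (positional args only)."""
    signatures = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            signatures.append(f"{node.name}({args})")
    return signatures


def wanted_signals(question: str) -> set[str]:
    """Crude keyword routing (an assumption): pick which collectors to run."""
    q = question.lower()
    wanted = {"current_file"}
    if any(word in q for word in ("error", "fail", "timeout", "crash")):
        wanted |= {"recent_errors", "test_results"}
    if any(word in q for word in ("import", "module", "depend")):
        wanted |= {"related_files", "config_files"}
    return wanted
```

Keyword routing is the simplest thing that could work; the idea it demonstrates is only that a question about a timeout should pull in errors and test results, not the whole repository.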
2. Context Synthesizer
Raw signals aren't yet a prompt. The synthesizer structures them so the model can use them effectively.
Instead of blindly concatenating, it:
Labels sections semantically
Problem: [What the developer asked]
Current Code: [The function they're in]
Recent Error: [Last stack trace]
Related Code: [Imported modules]
Environment: [Versions, config]
Constraints: [From comments or docs]
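A minimal sketch of that labelling step might look like the following. The section names come from the template above; the function name `render_sections`, the fixed ordering, and the choice to drop empty sections are my assumptions.

```python
SECTION_ORDER = [
    "Problem",
    "Current Code",
    "Recent Error",
    "Related Code",
    "Environment",
    "Constraints",
]


def render_sections(sections: dict[str, str]) -> str:
    """Render non-empty sections as 'Label:' blocks in a fixed order."""
    blocks = []
    for label in SECTION_ORDER:
        body = sections.get(label, "").strip()
        if body:  # skip sections with nothing useful to say
            blocks.append(f"{label}:\n{body}")
    return "\n\n".join(blocks)
```

Dropping empty sections keeps the prompt from carrying headings with nothing under them.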
Filters and prioritizes
- Removes irrelevant log lines
- Ranks files by relevance
- Truncates long outputs intelligently
- Highlights changed lines in diffs
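Three of these steps are easy to sketch. The heuristics below are assumptions chosen for brevity: "relevant" means a log line matches a handful of failure keywords, "intelligent" truncation means keeping the head and tail of a long output, and file relevance is a naive count of question words found in the file. A real synthesizer would likely use something richer.

```python
import re

# Assumed notion of "relevant": lines that mention a failure.
RELEVANT = re.compile(r"error|exception|timeout|traceback|failed", re.IGNORECASE)


def filter_log(lines: list[str]) -> list[str]:
    """Keep only log lines that look related to a failure."""
    return [line for line in lines if RELEVANT.search(line)]


def truncate_middle(lines: list[str], head: int = 3, tail: int = 3) -> list[str]:
    """Keep the first and last lines of a long output and mark the cut."""
    if len(lines) <= head + tail:
        return lines
    omitted = len(lines) - head - tail
    marker = f"... [{omitted} lines omitted] ..."
    return lines[:head] + [marker] + lines[len(lines) - tail:]


def rank_files(question: str, files: dict[str, str]) -> list[str]:
    """Order file paths by how many question words their text contains."""
    words = set(re.findall(r"\w+", question.lower()))

    def score(path: str) -> int:
        text = files[path].lower()
        return sum(1 for word in words if word in text)

    return sorted(files, key=score, reverse=True)
```

Keeping both ends of a long output is a deliberate choice: stack traces tend to put the entry point at one end and the actual failure at the other.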
Normalizes and annotates
- Adds file paths to code snippets
- Includes line numbers
- Notes when code is from a generated file
- Flags deprecated patterns
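For instance, attaching a path and line numbers to a snippet takes only a few lines. The header format and the name `annotate_snippet` are hypothetical; the only thing that matters is that the model can tell where each line came from.

```python
def annotate_snippet(
    path: str, source: str, start: int, end: int, generated: bool = False
) -> str:
    """Prefix a 1-indexed, inclusive line range with its path and line numbers."""
    lines = source.splitlines()[start - 1:end]
    width = len(str(end))  # pad numbers so the code stays aligned
    header = f"# {path}:{start}-{end}"
    if generated:
        header += " (generated file, do not edit by hand)"
    body = [f"{n:>{width}} | {text}" for n, text in enumerate(lines, start=start)]
    return "\n".join([header, *body])
```

With line numbers in the prompt, a response can point at an exact line in `models.py` rather than describing the location in words.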
Example transformation:
Developer types: "why timeout"
Context Synthesizer builds:
Problem: Developer asks "why timeout"
Current File: tests/integration/user_service_test.py:45
Code:
def test_create_user_with_profile():
    user = create_user(email="test@example.com")
    assert user.profile is not None  # ← This line times out in CI
Recent Error (CI):
Timeout after 30s waiting for database query
Connection pool exhausted: 10/10 in use
Environment Diff:
Local: PostgreSQL 15.2, pool_size=20
CI: PostgreSQL 14.2, pool_size=10 (.github/workflows/test.yml:28)
Recent Changes (git diff):
+ Added user.profile relationship in models.py:89
+ Uses jsonb_exists_any() function (requires PG 14.4+)
The developer wrote two words. The synthesizer supplied the full diagnostic context.
This is where most of the value lives. A mediocre model with great context synthesis often beats a great model with poor context.
3. Model Integration
This is the part everyone focuses on, so I'll keep it brief.
The model receives structured context and generates a response. Ideally:
- It's fast enough to feel interactive (<5s)
- It's good enough to understand technical nuance
- It has a large enough context window to use all the gathered signals
But model quality matters less than you'd think if the architecture is right. A weaker model that sees the actual error, code, and environment will often beat a stronger one that has to guess.
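The context-window constraint has a practical side: when the gathered signals don't all fit, something has to decide what gets dropped. Here is a minimal sketch of one way to do that, a greedy pack by priority. The 4-characters-per-token estimate is a rough rule of thumb I'm assuming rather than a real tokenizer, and the names and priorities are made up.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_sections(sections: list[tuple[int, str, str]], budget: int) -> list[str]:
    """Greedily keep the highest-priority sections that fit in the budget.

    Each section is (priority, label, text); lower numbers matter more.
    Returns the labels of the sections that were kept.
    """
    kept: list[str] = []
    used = 0
    for _priority, label, text in sorted(sections, key=lambda s: s[0]):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(label)
            used += cost
    return kept
```

In a real integration you would swap `estimate_tokens` for the model's own tokenizer, but the priority ordering is the part that carries the idea.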
4. Interaction Layer
This is what the developer sees and touches.
Invocation
- Keybinding or command that works inline
- No need to leave the current file
- Can be as simple as selecting code + pressing a hotkey
Display
- Small panel or overlay, not a full separate window
- Shows the question and response
- Feels transient, not permanent
- Secondary to the code, not competing with it
Application
- Responses appear as diffs, not as plain text to copy
- One-click apply (with ability to edit first)
- Clear indication of what will change
- Easy undo if the suggestion is wrong
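The diff-then-apply flow can be sketched with Python's standard `difflib`. The `Buffer` class is a hypothetical stand-in for an editor buffer; a real plugin would go through the editor's own edit and undo APIs instead.

```python
import difflib


def make_diff(path: str, before: str, after: str) -> str:
    """Return a unified diff showing what the suggestion would change."""
    return "".join(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


class Buffer:
    """Hypothetical stand-in for an editor buffer with an undo stack."""

    def __init__(self, text: str) -> None:
        self.text = text
        self._undo: list[str] = []

    def apply(self, new_text: str) -> None:
        """Replace the buffer contents, remembering the previous version."""
        self._undo.append(self.text)
        self.text = new_text

    def undo(self) -> None:
        """Restore the version from before the last apply, if there is one."""
        if self._undo:
            self.text = self._undo.pop()
```

Showing the output of `make_diff` before calling `apply` gives the developer the "clear indication of what will change", and `undo` makes a wrong suggestion cheap to reverse.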
History
- Minimal: maybe the last 2-3 interactions for this file
- Ephemeral: disappears when you switch files
- Not trying to be a chat app
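A bounded, per-file history like this is a few lines with `collections.deque`. The class name and methods are hypothetical, and this sketch treats "switching files" as recording an interaction against a different path.

```python
from collections import deque


class FileHistory:
    """Keeps the last few interactions for the active file only."""

    def __init__(self, max_items: int = 3) -> None:
        self._path = None
        # deque(maxlen=...) silently discards the oldest entry when full.
        self._items = deque(maxlen=max_items)

    def record(self, path: str, question: str, answer: str) -> None:
        """Store an interaction; switching files clears what came before."""
        if path != self._path:
            self._path = path
            self._items.clear()
        self._items.append((question, answer))

    def recent(self, path: str) -> list[tuple[str, str]]:
        """Return the stored interactions, but only for the active file."""
        return list(self._items) if path == self._path else []
```

There is deliberately no persistence and no search here; the history exists to support a quick follow-up question, nothing more.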
The goal: stay in the editor, stay in the code, minimize mode switching.
Why This Architecture Matters
Compare the architectures:
Chat-First Tools (Sarah's workflow):
Developer brain → manual text entry → model → text output → manual application
Human does: serialization, context gathering, integration.
Context-First Tools (Miguel's workflow):
Developer brain → simple query → [auto context gather] → model → structured diff → one-click apply
Computer does: serialization, context gathering, integration.
We've inverted the responsibility. The tool serves the human, not the other way around.