Context Engineering Part 4: Teaching AI to Take Smart Notes
I'm a Software Engineer with 7+ years of experience designing, implementing, and debugging software, including backend services, automation tools, and mobile SDKs. I love building things, and I'm currently building lessentext.com.
Instead of dropping old messages entirely, what if we could compress them into a summary that preserves the essential information while using far fewer tokens?
This is the promise of summarization-based context management.
The Key Idea: Compress Without Losing Meaning
Token savings: ~90% reduction
Context preserved: key facts retained
The Summarization Flow
Implementation
class SummarizingContextManager:
    def __init__(self, max_tokens=4000, threshold=0.8):
        self.max_tokens = max_tokens
        self.threshold = threshold  # Summarize at 80% capacity
        self.messages = []
        self.summaries = []

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})
        if self._usage_ratio() > self.threshold:
            self._summarize_older_messages()

    def _usage_ratio(self):
        # Rough estimate: ~4 characters per token
        used = sum(len(m["content"]) // 4 for m in self.summaries + self.messages)
        return used / self.max_tokens

    def _summarize_older_messages(self):
        # Keep the 3 most recent messages verbatim
        recent = self.messages[-3:]
        older = self.messages[:-3]
        if not older:
            return
        # Compress everything older into a summary using the LLM
        summary = self._call_llm_for_summary(older)
        self.summaries.append({
            "role": "system",
            "content": f"Previous conversation summary: {summary}"
        })
        # Replace the old messages with the summary
        self.messages = recent

    def get_full_context(self):
        return self.summaries + self.messages

    def _call_llm_for_summary(self, messages):
        conversation = "\n".join(
            f"{m['role']}: {m['content']}" for m in messages
        )
        prompt = f"""Summarize this conversation concisely. Preserve:
- User's goals and requirements
- Technical decisions made
- Problems and solutions discussed
- Constraints and preferences

Conversation:
{conversation}"""
        return llm_api_call(prompt)  # llm_api_call: your LLM client wrapper
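To see the manager in action, here is a minimal, self-contained sketch of the same class with `llm_api_call` stubbed out so it runs without an API key. The stub, the small `max_tokens` value, and the characters-per-token estimate are assumptions for illustration only.

```python
def llm_api_call(prompt):
    # Stub: a real implementation would call your LLM provider here
    return "user is building an inventory app; chose PostgreSQL"

class SummarizingContextManager:
    def __init__(self, max_tokens=50, threshold=0.8):
        self.max_tokens = max_tokens
        self.threshold = threshold
        self.messages = []
        self.summaries = []

    def _usage_ratio(self):
        # Rough estimate: ~4 characters per token
        used = sum(len(m["content"]) // 4 for m in self.summaries + self.messages)
        return used / self.max_tokens

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})
        if self._usage_ratio() > self.threshold:
            self._summarize_older_messages()

    def _summarize_older_messages(self):
        recent, older = self.messages[-3:], self.messages[:-3]
        if not older:
            return
        summary = llm_api_call("Summarize these messages ...")
        self.summaries.append({"role": "system",
                               "content": f"Previous conversation summary: {summary}"})
        self.messages = recent

    def get_full_context(self):
        return self.summaries + self.messages

mgr = SummarizingContextManager(max_tokens=50)
for i in range(6):
    mgr.add_message("user", f"Message {i}: some fairly long content about the project")
context = mgr.get_full_context()
# context now holds system-role summaries first, then the 3 most recent messages
```

With a tiny 50-token budget, summarization triggers repeatedly: each pass folds the oldest message into a summary while the three newest stay verbatim.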
Summarization Strategies
1. Incremental Summarization
Summarize in small chunks rather than all at once:
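The article leaves this step as prose, so here is a hedged sketch of one way to do it: a rolling summary, where each new chunk of old messages is folded into the existing summary instead of re-summarizing the whole history. The `summarize` stub stands in for a real LLM call; all names here are illustrative.

```python
def summarize(text):
    # Stub: a real implementation would send `text` to an LLM
    return f"[summary of {len(text)} chars]"

def incremental_summarize(prev_summary, old_chunk):
    """Fold a small chunk of old messages into the running summary."""
    chunk_text = "\n".join(f"{m['role']}: {m['content']}" for m in old_chunk)
    return summarize(
        f"Existing summary:\n{prev_summary}\n\nNew messages:\n{chunk_text}"
    )

# Each call compresses only the newest chunk, never the whole history:
summary = ""
history = [{"role": "user", "content": f"msg {i}"} for i in range(9)]
for start in range(0, len(history), 3):  # chunks of 3 messages
    summary = incremental_summarize(summary, history[start:start + 3])
```

Because each LLM call sees only one chunk plus the previous summary, the cost per summarization stays roughly constant no matter how long the conversation gets.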
2. Structured Summarization
Use a structured template instead of free-form summaries:
def create_structured_summary(messages):
    conversation = "\n".join(
        f"{m['role']}: {m['content']}" for m in messages
    )
    prompt = """Create a structured summary with these sections:

GOALS: What the user wants to accomplish
TECH_STACK: Technologies and tools mentioned
DECISIONS: Important decisions made
PROBLEMS: Issues discussed and their status
CONSTRAINTS: Limitations or requirements

Conversation:
{conversation}"""
    return llm_api_call(prompt.format(conversation=conversation))
Output:
SUMMARY:
GOALS: Build inventory management system with real-time tracking
TECH_STACK: React, TypeScript, Django, PostgreSQL, Redis, AWS ECS
DECISIONS: Using WebSocket for real-time, Redis for caching
PROBLEMS: WebSocket connections dropping — investigating nginx config
CONSTRAINTS: Must handle 10K concurrent users, deploy by Q3
3. Hierarchical Summarization
Multiple levels of detail for different needs:
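A hedged sketch of the idea, with illustrative names: keep summaries at several granularities, and when a finer level fills up, roll it into the level above. The `summarize` stub stands in for an LLM call.

```python
def summarize(texts):
    # Stub: a real implementation would ask an LLM to compress these texts
    return " | ".join(t[:20] for t in texts)

class HierarchicalSummary:
    """Level 0: raw messages; level 1: chunk summaries; level 2: summary of summaries."""

    def __init__(self, chunk_size=3):
        self.chunk_size = chunk_size
        self.levels = [[], [], []]

    def add(self, message):
        self.levels[0].append(message)
        # When a level fills up, compress it into the level above
        for lvl in (0, 1):
            if len(self.levels[lvl]) >= self.chunk_size:
                self.levels[lvl + 1].append(summarize(self.levels[lvl]))
                self.levels[lvl] = []

    def context(self, detail="fine"):
        # "coarse": only top-level summaries; "fine": every level, oldest first
        if detail == "coarse":
            return self.levels[2]
        return self.levels[2] + self.levels[1] + self.levels[0]

h = HierarchicalSummary()
for i in range(9):
    h.add(f"message {i}")
coarse = h.context("coarse")
```

A caller that needs a quick orientation can request the coarse view, while one that needs recent detail can take the fine view; both come from the same structure.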
The Trade-offs: What Gets Lost in Summarization?
The Deepest Problem: Decision-Rationale Separation
There's a specific failure mode of summarization that's worse than general information loss:
This is decision-rationale separation—when compression preserves what was decided but loses why it was decided. The AI then feels free to override the decision because it doesn't see the constraint behind it.
This is the most dangerous failure in AI-assisted software development because:
The user doesn't know the rationale was lost (it's invisible)
The AI sounds confident (it doesn't know it's contradicting anything)
The result is wrong code, architectural inconsistencies, or compliance violations
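One mitigation, offered here as a suggestion rather than an established technique: make the summary prompt record every decision as a (decision, rationale) pair, so the constraint travels with the choice. The function and prompt below are illustrative, with the LLM stubbed out.

```python
SUMMARY_PROMPT = """Summarize this conversation. Record EVERY decision as a
(decision, rationale) pair; never a decision alone. Use this format:

DECISION: <what was chosen>
BECAUSE: <the constraint or reason behind it>

Conversation:
{conversation}"""

def summarize_with_rationale(messages, llm):
    conversation = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    return llm(SUMMARY_PROMPT.format(conversation=conversation))

# With a stubbed LLM, the plumbing can be exercised end to end:
result = summarize_with_rationale(
    [{"role": "user", "content": "Use Postgres, we need transactions"}],
    llm=lambda prompt: "DECISION: Postgres\nBECAUSE: transactional guarantees required",
)
```

A summary shaped this way cannot carry a decision forward without its rationale, which directly targets the separation failure described above.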
Beyond Summarization
Summarization is a massive improvement over simply dropping messages, but it has an inherent ceiling: the compression is lossy. Some information will always be lost.
What if, instead of trying to fit everything into the window, we could reach outside it?
This brings us to modern context engineering: RAG, tool use, and memory systems.
Read Part 5: RAG, Tools, and the Context Engineering Stack to learn how modern approaches extend beyond the context window.
#AI #Summarization #ContextEngineering #LLM
