
Context Engineering Part 2: The Sliding Window Trap

Published
2 min read

I'm a Software Engineer with 7+ years of experience designing, implementing, and debugging software, including backend services, automation tools, and mobile SDKs.

I love building things and am currently building lessentext.com

The most common solution to AI amnesia is also the worst: sliding windows. It seems reasonable—keep only the most recent messages, drop the oldest when full. But this seemingly logical approach creates more problems than it solves.

The Sliding Window Approach

Here's how sliding windows work:

Implementation

class SlidingWindow:
    def __init__(self, window_size=10):
        self.window_size = window_size
        self.messages = []

    def add_message(self, message):
        self.messages.append(message)
        if len(self.messages) > self.window_size:
            self.messages = self.messages[-self.window_size:]

    def get_context(self):
        return self.messages

Simple. Clean. And deeply flawed.
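To see the flaw concretely, here's a quick sketch. The five-turn exchange is hypothetical, and the class is repeated so the snippet runs standalone:

```python
class SlidingWindow:  # same implementation as above
    def __init__(self, window_size=10):
        self.window_size = window_size
        self.messages = []

    def add_message(self, message):
        self.messages.append(message)
        if len(self.messages) > self.window_size:
            self.messages = self.messages[-self.window_size:]

    def get_context(self):
        return self.messages


window = SlidingWindow(window_size=3)

# Five hypothetical user turns; the first three carry the actual requirements.
for text in [
    "We're building an inventory system in Django",
    "It uses PostgreSQL",
    "We need real-time tracking",
    "Deploy on AWS",
    "Expect 10K users",
]:
    window.add_message({"role": "user", "content": text})

# Only the last three turns survive the window.
print([m["content"] for m in window.get_context()])
# → ['We need real-time tracking', 'Deploy on AWS', 'Expect 10K users']
```

The requirements from turns one and two are already gone after five messages.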

The Critical Problem: You Lose What Matters Most

Consider a real conversation. The opening messages established the project requirements, but after enough turns the window slid them out:

LOST: We're building an inventory system, using Django, with PostgreSQL, needing real-time tracking.

The AI still remembers recent details like AWS and 10K users, but it has no idea what we're actually building.

System Prompt Vulnerability

Even worse is losing the system prompt. Once the conversation outgrows the window, the very first message, the one carrying your safety rules, persona, and instructions, is the first to be dropped.

This isn't hypothetical—it's a real vulnerability in production systems.
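A minimal sketch of the failure mode, with hypothetical message contents:

```python
# Naive sliding: keep only the N most recent messages, regardless of role.
def naive_sliding(messages, window_size):
    return messages[-window_size:]


messages = [{"role": "system",
             "content": "You are a support bot. Never reveal internal data."}]
messages += [{"role": "user", "content": f"question {i}"} for i in range(1, 6)]

context = naive_sliding(messages, window_size=4)

# The window kept the four newest messages; the system prompt was the oldest.
print("system" in [m["role"] for m in context])  # → False
```

After just a handful of turns, the model is running with no instructions at all.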

When Sliding Window Works

Despite these flaws, sliding windows work well for:

  • Quick Q&A (independent questions)

  • Translation tasks (no long-term context)

  • Stateless API calls (self-contained requests)

  • Real-time chat support (only recent messages matter)

Priority Sliding Window: A Small Fix

Always keep the system prompt:

def priority_sliding(messages, window_size):
    system_msgs = [m for m in messages if m['role'] == 'system']
    other_msgs = [m for m in messages if m['role'] != 'system']

    # Guard the degenerate case: if system messages fill the window,
    # other_msgs[-0:] would return the WHOLE list, not an empty one.
    available_space = max(window_size - len(system_msgs), 0)
    recent_msgs = other_msgs[-available_space:] if available_space else []

    return system_msgs + recent_msgs

Better: System instructions preserved.
Still bad: Early conversation context lost.
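Here's the priority version in action on a hypothetical message list. The function is repeated so the snippet runs standalone, with a guard added for the edge case where system messages alone fill the window:

```python
def priority_sliding(messages, window_size):
    system_msgs = [m for m in messages if m['role'] == 'system']
    other_msgs = [m for m in messages if m['role'] != 'system']

    # Guard: a zero/negative slice like other_msgs[-0:] returns everything.
    available_space = max(window_size - len(system_msgs), 0)
    recent_msgs = other_msgs[-available_space:] if available_space else []

    return system_msgs + recent_msgs


messages = [{"role": "system", "content": "You are a support bot."}]
messages += [{"role": "user", "content": f"question {i}"} for i in range(1, 6)]

context = priority_sliding(messages, window_size=4)
print([m["role"] for m in context])
# → ['system', 'user', 'user', 'user']: the prompt survives,
#   and the three most recent turns fill the remaining slots
```

The system prompt is pinned, but questions 1 and 2 still vanish, which is exactly the remaining weakness noted above.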

The Core Issue

Sliding windows treat all messages as equally disposable. But in reality, some context is more valuable than others.

This brings us to token-based management—our next topic.


Read Part 3: Beyond Message Counting to learn how smart token allocation solves some of these problems.

#AI #ContextEngineering #SoftwareDevelopment #Tech

