I write a weekly newsletter covering what I've found actually works with AI coding tools, Go, and building products. If you want the configs, costs and workflows I use daily, it's worth subscribing.
Join the newsletter - it's free
I noticed something unusual in the Cont3xt signups last week. About a dozen new accounts had email addresses from Agent Mail - an email platform built specifically for AI agents. Not developers experimenting with agents, not companies evaluating the product. Agents. Signing up autonomously, presumably because some human had pointed them at a task that involved managing team context and they'd found their way to the product. Coupled with my own experiments running OpenClaw - which reached 220,000 GitHub stars before its creator joined OpenAI and handed it to an open-source foundation - I'd been thinking hard about what it actually means to build software for agents as first-class users rather than as an afterthought to human workflows. I wrote about building a personal AI operations centre with OpenClaw a few weeks ago, and that hands-on experience is what made the agent context problem feel concrete rather than theoretical.
The result is a significant set of changes to Cont3xt: self-registration for agents, a new discoverability convention, a three-layer security model, and a governance layer that lets agents propose changes to team knowledge without being able to publish them unilaterally.
The Problem With How Agents Handle Context Right Now
Most of the discussion around autonomous agents focuses on what they can do - the capabilities, the autonomy, the length of the task chains they can execute. What gets less attention is what happens when they do things wrong, and specifically why. The answer, almost always, is missing context, not model capability.
When you're working with AI coding tools in a human-in-the-loop setup - Claude Code, Cursor, whatever you prefer - context failures are annoying but recoverable. You catch the wrong assumption before it lands. The agent suggests something that doesn't match how your team does things, you correct it, you move on. I've written about this in the context of why AI isn't making developers as productive as people think - the 63% of developers who report that AI tools lack crucial organisational context are mostly working in setups where a human is still catching the errors.
Autonomous agents break that completely. When an agent runs a 40-step task overnight, or monitors your inbox and files things away continuously, there's no human checking each decision, and a wrong assumption at step 3 propagates through everything that follows, silently, confidently, and often irreversibly by the time anyone notices. The METR study from July 2025 found that experienced developers were 19% slower using AI tools on complex real-world codebases, with the root cause being AI lacking tacit project knowledge - the kind that lives in ADRs, in team decisions, in the reasoning behind why the codebase looks the way it does. Remove the human from the loop entirely and the slowdown becomes invisible. The agent just does the wrong thing.
This is what Cont3xt was originally built to solve for human developers - giving AI coding tools a shared, authoritative knowledge layer that reflects how the team actually works. The agent signups made me realise I needed to make it work properly for agents too, and that meant rethinking the integration surface from scratch.
How Agents Were Using Cont3xt (And Where It Fell Short)
The original MCP integration worked well for human developers using Claude Code and similar tools. A context request would come in with a file path, the system would match it against relevant rules, ADRs, and team decisions, and return whatever was most relevant to the current task. Simple, effective, fast.
Agents have a different problem. A PR review agent isn't looking at one file - it's looking at fifteen simultaneously. A documentation agent has no file context at all. An inbox management agent is operating entirely in a domain that has nothing to do with the codebase. The file-path-based relevance model doesn't map cleanly onto these use cases, and there was no clean way for an agent to register itself, authenticate, or discover what the Cont3xt instance could do for it. A human developer reads the docs and sets up the MCP server. An agent needs to be able to do the equivalent autonomously, from a cold start, with minimal human involvement after the initial configuration.
Self-Registration via Team Tokens
The registration flow I've settled on is deliberately minimal. A team admin generates a registration token once - one token per team, rotatable independently of any individual agent's credentials. Any agent configured with that token and the team ID can POST to the registration endpoint and receive back a scoped API key: one call, no redirect, no OAuth dance, no human in the loop beyond the initial setup.
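To make the exchange concrete, here's a minimal sketch of the server-side token swap. The field names, status codes, and key prefix are my assumptions for illustration, not the real Cont3xt API:

```python
import secrets

def handle_registration(team_id: str, registration_token: str,
                        valid_tokens: dict) -> dict:
    """Exchange a team registration token for a scoped, per-agent API key.
    valid_tokens maps team_id -> current registration token (hypothetical)."""
    if valid_tokens.get(team_id) != registration_token:
        return {"status": 403, "error": "invalid registration token"}
    # Each agent gets its own key, revocable independently of the team token.
    agent_key = "agt_" + secrets.token_hex(16)
    return {"status": 201, "apiKey": agent_key, "teamId": team_id}
```

The important property is in the data model, not the code: the team token only gates registration, so rotating it never invalidates keys that have already been issued.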
The reason for the token model rather than something more elaborate is operational simplicity. You want to be able to configure an OpenClaw agent or a Claude Code workflow with two values in a config file and have it be fully operational from that point forward. You also want to be able to rotate the registration token - if it leaks, or if you want to prevent new agents registering - without affecting any of the agents that have already received their own keys. Rotating the team token doesn't touch existing agent API keys, and revoking a specific agent's key doesn't affect any other agent.
Each registered agent gets its own identity in the system, and the activity log shows which agent fetched what context, when, and in what volume. Knowing that your PR review agent fetched the auth rules 47 times last week tells you something about where the complexity is, and also gives you confidence that the agent is actually using the context it's been given.
The request format for agents accepts a taskDescription field alongside the traditional file path. An agent reviewing a PR can pass a plain English description - "reviewing authentication implementation in PR #247" - and the relevance matching works off that rather than requiring a file path. The two approaches compose: agents that do have file context can pass both, and the system uses whichever signal is more useful.
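The composition rule is simple enough to sketch. Field names here match the taskDescription convention described above, but the exact request schema is an assumption:

```python
def build_context_request(task_description=None, file_path=None):
    """Compose a context request from whichever relevance signals the agent
    has. Either signal alone is enough; passing both lets the server pick
    whichever is more useful."""
    if task_description is None and file_path is None:
        raise ValueError("need a taskDescription, a filePath, or both")
    request = {}
    if file_path is not None:
        request["filePath"] = file_path
    if task_description is not None:
        request["taskDescription"] = task_description
    return request
```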
skill.md: Standardising Agent Discoverability
Cont3xt now serves a skill.md file at a predictable URL, addressed directly to AI agents. The convention itself isn't novel - anyone building for agents will recognise the pattern from CLAUDE.md and similar files - but I wanted to standardise it for Cont3xt specifically, so that an agent told to "set up Cont3xt" can fetch a single URL and find everything it needs: step-by-step registration with curl examples, how to fetch rules, how to search context, how to use the MCP server, how to propose changes and poll for outcomes, a full API reference table, and a troubleshooting guide. There's also a /.well-known/agent.json endpoint for machine-readable discovery, for agents that prefer structured data over markdown.
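For a sense of what cold-start discovery looks like from the agent's side, here's a sketch. The JSON shape below is invented to illustrate the pattern - the real /.well-known/agent.json fields may differ:

```python
import json

# Hypothetical contents of /.well-known/agent.json; illustrative only.
AGENT_JSON = """
{
  "name": "Cont3xt",
  "skill": "https://cont3xt.dev/skill.md",
  "registration": {"method": "POST", "path": "/api/agents/register"}
}
"""

def discover(doc: str) -> dict:
    """Cold-start discovery: parse the well-known document and pull out
    what an agent needs to bootstrap itself without human help."""
    data = json.loads(doc)
    return {"skill_url": data["skill"],
            "register_path": data["registration"]["path"]}
```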
Agents need a discoverable, machine-readable front door, and most tools don't have one. If an agent can't find the instructions autonomously, a human has to paste them into the agent's context every time, which defeats the purpose of building for agents as first-class users in the first place.
Governance: Agents Can Propose, Humans Approve
The write interception layer is the part I spent the most time on. Agents should be able to contribute to team knowledge - an agent that reviews enough PRs will notice patterns, recurring violations of the same conventions, new approaches that keep appearing, gaps in the documented rules, and that signal is genuinely valuable. At the same time, an agent that can publish directly to the team's canonical knowledge base is a significant governance risk. The Cisco security research on OpenClaw found a third-party skill performing data exfiltration via write access to the agent's memory layer. The specific attack surface is different, but the principle is the same: write access is dangerous.
The solution is a middleware layer that intercepts write requests from agent API keys before they reach the handler. A GET passes through normally - agents get full read access. A POST, PUT, or DELETE from an agent key gets captured, stored as a proposal with the agent's identity attached, and returns a 202 Accepted with a proposal ID. The underlying CRUD handlers don't know agents exist and the existing code is completely untouched.
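The interception logic can be sketched in a few lines. This is a stand-in, not the production middleware - the caller shape and storage are simplified, but the branching matches the behaviour described above:

```python
import uuid

proposals = {}  # proposal_id -> captured write, held for human review

def intercept_agent_writes(handler):
    """Wrap a CRUD handler so agent writes become proposals instead of
    executing. The wrapped handler never learns agents exist."""
    def wrapped(method, path, body, caller):
        if caller.get("type") == "agent" and method in {"POST", "PUT", "DELETE"}:
            proposal_id = str(uuid.uuid4())
            proposals[proposal_id] = {"method": method, "path": path,
                                      "body": body, "agent": caller["id"]}
            return 202, {"proposalId": proposal_id}  # 202 Accepted
        # GETs and human callers pass straight through.
        return handler(method, path, body, caller)
    return wrapped
```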
A human reviews the proposal in the dashboard and can edit, approve, or reject it. On approval, the system executes the original operation inside the approval transaction - not just a status flag flip, it runs the actual database write with the proposed data, with row-affected checks to catch anything that's gone stale in the meantime. The resulting entity ID gets recorded on the proposal for traceability, and the agent ID gets attached to the created or updated entity as attribution. The agent is opening a pull request, not pushing to main.
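The approval step can be sketched like this. Here execute_write is a hypothetical stand-in for the transactional database write; the return shape is an assumption:

```python
def approve_proposal(proposal, execute_write):
    """Replay the captured operation as part of approval. execute_write
    returns (rows_affected, entity_id); zero rows affected means the
    target went stale since the proposal was filed."""
    rows, entity_id = execute_write(proposal["method"], proposal["path"],
                                    proposal["body"])
    if rows == 0:
        return {"status": "stale", "entityId": None}
    # Record the resulting entity for traceability, attributed to the agent.
    return {"status": "approved", "entityId": entity_id,
            "attributedTo": proposal["agent"]}
```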
One constraint worth noting: an agent cannot approve its own proposals, manage other agents, access billing or team settings, or rotate its own API key. The route restriction layer maintains an allowlist of path prefixes that agent keys can hit, and anything outside that list gets a 403 before it reaches any handler.
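The route restriction itself is a prefix check in front of everything else. The specific prefixes below are invented for illustration - the real allowlist isn't published:

```python
# Hypothetical allowlist; the real set of prefixes will differ.
AGENT_PATH_ALLOWLIST = ("/api/context", "/api/rules", "/api/proposals")

def agent_route_allowed(path: str, caller_type: str) -> bool:
    """Reject anything an agent key tries outside the allowlist before it
    reaches a handler. Human keys are unaffected by this layer."""
    if caller_type != "agent":
        return True
    return path.startswith(AGENT_PATH_ALLOWLIST)
```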
What's Still Being Worked Out
Context window management is the biggest open question. Cont3xt caches responses in Redis with a five-minute TTL, and there's a lightweight staleness endpoint that lets long-running agents poll cheaply to check whether relevant rules have changed since their last fetch. But the volume of context returned needs careful tuning - too little and the agent misses relevant rules, too much and you're burning context window on material that isn't useful for the current task. I haven't run enough real-world agent workloads against it yet to know where the right defaults sit.
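The caching and staleness behaviour described above roughly works like this. This in-memory sketch stands in for the Redis layer; the version-number staleness check is my simplification of whatever the real endpoint compares:

```python
import time

class ContextCache:
    """Minimal sketch of a TTL cache plus a cheap staleness check, standing
    in for the Redis layer with its five-minute TTL."""
    def __init__(self, ttl=300):
        self.ttl = ttl
        self.store = {}  # key -> (value, fetched_at, version)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(key)
        if entry and now - entry[1] < self.ttl:
            return entry[0]
        return None  # expired or never fetched

    def put(self, key, value, version, now=None):
        now = time.time() if now is None else now
        self.store[key] = (value, now, version)

    def stale(self, key, current_version):
        """What the lightweight staleness endpoint answers: have the
        relevant rules changed since this key was last fetched?"""
        entry = self.store.get(key)
        return entry is None or entry[2] < current_version
```

A long-running agent can poll stale() cheaply and only pay for a full context fetch when something has actually changed.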
Agent compliance is the other issue. Claude Code agents tend to be reasonably diligent about using the tools available to them when given clear instructions. OpenClaw-style agents are more variable - they'll sometimes skip a context fetch they should be making, particularly mid-task when the setup instructions are further up in the conversation history. The practical fix is putting the fetch instructions directly in the agent's SOUL.md or equivalent persistent memory so the behaviour is baked in, and it works, but it means the integration is more fragile than I'd like until agent tooling gets better at maintaining consistent behaviour across long-running tasks.
I've written before about building the VS Code extension for Cont3xt and the challenge of making context delivery feel natural rather than disruptive. The agent integration has the same underlying problem - the best context fetch is one the agent makes without thinking about it, at exactly the moment it's relevant, and getting there requires both better tooling on the agent side and better relevance filtering on the Cont3xt side.
Where This Goes
The Agent Mail signups were a small signal but a clear one. Agents are already finding their way to tools they need, and that's going to accelerate as autonomous agent setups become more common. The teams getting the most from autonomous agents are the ones treating agent setup the same way they treat new hire onboarding. You wouldn't drop a new engineer into a complex codebase with no documentation, no ADRs, and no way to ask questions about how things are done - and running an autonomous agent without a shared knowledge layer beneath it is exactly that.
If you're running autonomous agents and hitting the context problem, Cont3xt.dev has agent registration available now. The skill.md is at cont3xt.dev/skill.md - an agent can get itself set up from there without any further instruction.
