Building Memory for Claude Desktop: A Quick MCP Server Implementation

I've been using Claude Desktop for months, and one thing has constantly frustrated me compared to ChatGPT - the complete lack of memory between conversations. You mention the three specific projects you're working on, then start a new conversation about something else, and suddenly you're back to square one, explaining it all again. Meanwhile, ChatGPT just remembers these details automatically.

When Claude announced Integrations recently, I thought "now's the time to solve this properly". With MCP support freshly landed, this seemed like the perfect moment to build something useful.

The Problem with Context

Claude has Projects, which should theoretically solve this, but there's a fundamental issue: you can't update project artifacts. Once an artifact is in your project knowledge, it's frozen. The simple solution would have been to maintain global or project-specific artifacts that get updated with new information. Since that's not possible, I decided to build my own memory system.

The core frustration was simple: I didn't want to repeat myself across conversations. If I'm discussing a technical challenge in one chat and then asking about project timelines in another, the context should carry forward naturally.

A Couple of Hours Well Spent

I initially thought this would be straightforward - and it turned out to be exactly that. The Go ecosystem has several MCP libraries, and I used the most mature one I could find: github.com/mark3labs/mcp-go. It's not the official library, but it's solid and well-maintained.
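
For a sense of what the library gives you, here's a minimal sketch of a stdio MCP server with a single tool, following the pattern from the mcp-go README. The tool name and handler body are illustrative, not the actual Remember Me implementation:

```go
package main

import (
	"context"
	"fmt"

	"github.com/mark3labs/mcp-go/mcp"
	"github.com/mark3labs/mcp-go/server"
)

func main() {
	// A stdio-based MCP server; Claude Desktop runs it as a subprocess.
	s := server.NewMCPServer("remember-me", "0.1.0")

	// Declare a tool with a typed parameter schema.
	storeTool := mcp.NewTool("store_memory",
		mcp.WithDescription("Persist a piece of information for future conversations"),
		mcp.WithString("content", mcp.Required(), mcp.Description("The text to remember")),
	)

	// Register the handler that Claude's tool calls get routed to.
	s.AddTool(storeTool, func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
		content, err := req.RequireString("content")
		if err != nil {
			return mcp.NewToolResultError(err.Error()), nil
		}
		// The real implementation would embed the content and write it to Postgres.
		return mcp.NewToolResultText(fmt.Sprintf("remembered: %q", content)), nil
	})

	if err := server.ServeStdio(s); err != nil {
		fmt.Println(err)
	}
}
```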

The entire implementation took just a couple of hours using Claude Code exclusively. I started with a detailed PRD document, had Claude implement it, then reviewed and tested until everything worked. The development process was remarkably smooth - though I suspect I burned through quite a few API requests. Fortunately, being on the max plan meant it was all part of the package.

The main technical challenge was Go's context.Context cancellation. When Claude Desktop cancelled a tool call or returned early, the cancellation propagated down the chain to the OpenAI embedding calls, causing all sorts of issues with storing and retrieving memories correctly. The solution was to give the embedding operations completely separate contexts. It took a few tries to get right, but nothing too complex.
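
If you hit the same problem, the fix looks roughly like this. Since Go 1.21, context.WithoutCancel gives you a context that keeps the parent's values but ignores its cancellation; the Embedder interface below is a hypothetical stand-in for the OpenAI client, not my actual code:

```go
import (
	"context"
	"time"
)

// Embedder is a hypothetical stand-in for an OpenAI embeddings client.
type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

// embedDetached runs the embedding call on its own lifetime, so a cancelled
// tool call can't abort it halfway through.
func embedDetached(ctx context.Context, e Embedder, content string) ([]float32, error) {
	// Keep the parent's values but drop its cancellation (Go 1.21+),
	// then bound the call with our own timeout instead.
	embedCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), 30*time.Second)
	defer cancel()

	return e.Embed(embedCtx, content)
}
```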

How It Works

The Remember Me MCP Server does exactly what you'd expect: it gives Claude Desktop persistent memory across conversations. It uses PostgreSQL with pgvector for storing memories with vector embeddings, enabling both keyword and semantic search.
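
As a rough sketch, the semantic half of that search boils down to a single pgvector query ordering rows by cosine distance. The table and column names here are my guesses at a plausible schema, not the repo's actual one:

```go
import (
	"context"
	"database/sql"
)

// searchMemories runs a nearest-neighbour lookup with pgvector's
// cosine-distance operator (<=>). Schema names are illustrative.
func searchMemories(ctx context.Context, db *sql.DB, queryEmbedding string, limit int) (*sql.Rows, error) {
	// pgvector accepts a vector literal like '[0.1,0.2,...]'; the
	// pgvector-go package can also bind a []float32 directly.
	return db.QueryContext(ctx, `
		SELECT content, type, category
		FROM memories
		ORDER BY embedding <=> $1::vector
		LIMIT $2`, queryEmbedding, limit)
}
```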

The system automatically detects when something should be remembered - whether it's a fact about your projects, a preference you've mentioned, or contextual information from your conversations. It categorises memories by type (fact, conversation, context, preference) and category (personal, project, business).
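
In Go terms the taxonomy is small enough to express as two string enums. The identifier names are my sketch; the string values are the ones listed above:

```go
// MemoryType describes what kind of thing was remembered.
type MemoryType string

const (
	TypeFact         MemoryType = "fact"
	TypeConversation MemoryType = "conversation"
	TypeContext      MemoryType = "context"
	TypePreference   MemoryType = "preference"
)

// MemoryCategory describes which area of life a memory belongs to.
type MemoryCategory string

const (
	CategoryPersonal MemoryCategory = "personal"
	CategoryProject  MemoryCategory = "project"
	CategoryBusiness MemoryCategory = "business"
)
```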

What's particularly satisfying is watching it work automatically. As I'm having this conversation about the project, it's storing relevant memories and looking up facts without me having to think about it. The system handles the complexity of deciding what's worth remembering and what can be forgotten.

Current Limitations

I should be honest about the current state: it's a bit slow. Response times are typically around two seconds, but can be longer when you're storing multiple facts or asking about several things simultaneously. The system makes multiple calls to build a complete picture, which adds up.

I suspect this is less about the server itself being slow and more about the number of round trips between Claude Desktop and the MCP server. The actual service performs well, but the back-and-forth communication has overhead. This might just be the nature of MCP servers at the moment.

The Bigger Picture

This is very much an experiment rather than a permanent solution. My real hope is that Claude integrates memories natively soon, making tools like this unnecessary. But until that happens - whether it's in three months or twelve - this bridges the gap effectively.

Releasing it now feels right: it's exactly the kind of practical tool that demonstrates what's possible when you can extend Claude's capabilities directly.

For anyone interested in the technical details, the full implementation is available on GitHub. The setup is straightforward if you're comfortable with Go and PostgreSQL/Docker, and the documentation covers everything from quick installation to production deployment.
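
Wiring it into Claude Desktop follows the standard MCP pattern: an entry in claude_desktop_config.json pointing at the binary. The path and environment variables below are placeholders; the repo's README has the real ones:

```json
{
  "mcpServers": {
    "remember-me": {
      "command": "/usr/local/bin/remember-me-mcp",
      "env": {
        "DATABASE_URL": "postgres://localhost:5432/memories",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```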

What's Next

I'm genuinely curious to see how this develops. The beauty of building these tools is that you learn by using them. Already, I'm noticing patterns in how I interact with Claude that I hadn't considered before.

More broadly, this feels like the beginning of a new phase for AI assistants. The ability to maintain context and memory across conversations fundamentally changes how you use these tools. It's no longer about optimising individual prompts, but about building a continuous working relationship.

The caching optimisations I've written about before definitely influenced the architecture here, and the approach to building AI tools feels increasingly relevant as these capabilities become more accessible.

Whether this specific implementation has legs or gets replaced by something better doesn't really matter. What matters is that we're finally getting the tools to build the AI interactions we actually want, rather than just accepting what's available.