Claude Just Solved Their Biggest Problem
Believe it or not, Claude has solved another major AI problem that we didn't even realize we had. As you know, Anthropic introduced MCP back in 2024, and it completely changed the way we use AI. But their MCP system is far from perfect, and even they knew that, which is why they released a paper discussing the issue. In this video, I'll take you through that paper, explain the problems, show how the paper solves them, and show how to implement the solutions.

Even though Anthropic was the one who released the MCP system, why do they consider MCPs problematic? If you've used MCPs before, you know that they exist as tool calls in the context window. Each MCP has its own tools that remain exposed to the model in the context even before you perform any task with the agent. Even though the messages in this example occupy 0% of the context, the MCPs have already consumed about 10% of the total context window, because MCP tools stay in the context whether you use them or not. And that was with just three MCPs connected; imagine how much context usage grows as you connect more. This is exactly the problem Anthropic highlights: tool definitions overload the context. And it's not just the definitions. Results from MCP calls also remain in the context window, consuming additional tokens. In this way, both the MCP tool definitions and the tool call results bloat the context window, gradually causing problems and taking up more space than the actual user messages and their content.

Now that you understand the problem, let's look at the solution Anthropic presented in the paper. They suggest that instead of exposing MCPs as separate tool calls, we can treat the MCP server as code. Rather than letting tool definitions bloat the context window, we can leverage Claude's powerful coding capabilities and have it call whatever MCP you're using through a code-level API instead of relying on the MCP server directly. This way, instead of loading an MCP directly into the context window, you can have all MCPs as separate code files, each representing a specific MCP tool. Then, whenever the agent needs a particular tool, it references these files instead of the MCP tools in the context window. This method of using MCPs as code is highly effective because it keeps the context window from being occupied by tool definitions, allows for better context management, and leaves more space for the actual work in the project. I won't go into full implementation details here, but if you want a dedicated video on the implementation, comment down below.

To elaborate, Anthropic's idea is that instead of having separate MCP servers, we can have a servers folder that contains all the MCP servers. Within that folder, each file represents a specific function, and an index file acts as a base that lets Claude know what tools are available in that MCP server; a minimal sketch of this layout follows below. Cloudflare had actually already presented this exact idea and called it Code Mode: they proposed converting all MCP tools into a TypeScript API that the LLM writes code against. But if you're still not clear on why this is a big problem, and why we shouldn't just keep using MCPs the way we were before, Anthropic listed the benefits of this new solution.
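To make that folder structure concrete, here's a minimal TypeScript sketch in the spirit of Anthropic's proposal. The file names, the Google Sheets example, and the `callMCPTool` helper are all illustrative assumptions, not details from the paper:

```typescript
// Hypothetical layout (illustrative, not from the paper):
//
// servers/
//   google-sheets/
//     getRows.ts     <- one file per tool
//     appendRow.ts
//     index.ts       <- tells Claude which tools exist
//
// --- servers/google-sheets/getRows.ts ---
// A thin wrapper that forwards to the underlying MCP tool call.
// `callMCPTool` is an assumed harness helper, not a real API.
declare function callMCPTool(name: string, args: object): Promise<any>;

export interface Row { [column: string]: string | number }

export async function getRows(spreadsheetId: string): Promise<Row[]> {
  return callMCPTool("google_sheets__get_rows", { spreadsheetId });
}

// --- servers/google-sheets/index.ts ---
// Re-exports every tool, so the agent discovers what's available by
// reading this one small file and loads tool bodies only on demand:
// export { getRows } from "./getRows";
// export { appendRow } from "./appendRow";
```

The point is that only index.ts needs to be read up front; every tool body stays on disk until the agent actually needs it.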
The first benefit of this solution is progressive disclosure, or slowly revealing only the information required in the context window. We're using the agent's ability to navigate the file structure, understand the purpose of a file from its name, and load only what is required into context. That's why each MCP is represented as a folder, and within it are multiple files, each corresponding to a specific tool. To coordinate all these files, we have an index.ts file that exposes all the tools along with their definitions. Essentially, this index.ts acts as a guide for Claude, helping it see which tools are available and how to call them. The main advantage is that whenever a particular MCP is needed, Claude references this index.ts and loads only the specific tool file into memory, revealing its context as required instead of loading all tools at once.

The second benefit highlighted in the paper is context-efficient tool results. As I mentioned earlier, all tool call results are normally exposed in the context window, which can bloat it. With this approach, even if an MCP tool call produces a huge response, you can use code to transform, aggregate, and expose only the required portion of the data. For example, if you want the agent to know what kind of data is in your Google Sheet, an MCP tool call would return all 10,000 rows, which only bloats the context window. Instead, you can send only the first five rows to the model for review (a sketch of this filtering pattern appears after these benefits). This keeps the context window from being unnecessarily bloated.

The next benefit highlighted in the paper is more powerful, context-efficient control flow. When using MCPs, tool calls must wait for each other sequentially, and the model makes these decisions. Anthropic emphasized that instead of having the model decide what to do and when to do it, Claude can manage this through logic within the code, reducing hallucinations and context window overload. Code can handle execution effectively using conditional statements, saving time since the model doesn't need to chain tool calls; the code manages the logic, including checks and flow control, instead of the model.

When working with data, privacy is often a major concern: you don't want the model to access sensitive information in your database. So the next benefit this approach provides is privacy-preserving operations. With MCP as code execution, the agent only sees what the code logs or returns. For example, if your code is written to log only specific outputs, the agent only has access to that information, and sensitive data remains hidden from the context window. If you want your MCP to keep your data private, you can design the code to return generic or placeholder values instead of the actual database content. This way, the agent can perform its tasks without ever accessing sensitive data, preserving your privacy.

The final benefit is state persistence and skills. With the memory tool, the agent can save working code and intermediate results as files, allowing it to maintain state across operations and pick up execution exactly where it left off. This approach closely resembles the Claude Skills feature released earlier. Both Skills and MCP as code use a file-system-based structure where capabilities are stored as files that Claude can discover and load on demand. The difference is that Skills are broader and can contain multiple resources documented in a SKILL.md file, while MCP as code focuses specifically on wrapping MCP server tools as executable code.
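Before moving on, here's what the result filtering from the second benefit might look like in practice. It reuses the hypothetical `getRows` wrapper from the earlier sketch; the five-row cutoff and the logging format are my own assumptions:

```typescript
import { getRows } from "./servers/google-sheets";

// Hypothetical agent-written script: fetch everything inside the
// sandbox, but surface only a small preview to the model's context.
async function previewSheet(spreadsheetId: string): Promise<void> {
  const rows = await getRows(spreadsheetId); // could be 10,000 rows

  // Only what gets logged or returned ever reaches the model.
  const preview = rows.slice(0, 5);
  console.log(`Total rows: ${rows.length} (showing first 5)`);
  console.log(JSON.stringify(preview, null, 2));
}

previewSheet("SHEET_ID_GOES_HERE");
```

The 10,000-row payload stays inside the execution environment; the model only ever sees five rows plus a count.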
You can also combine both approaches: when your agent writes code that successfully uses MCP tools, you can save that code and add a skill file to document it. This creates a reusable skill that the agent can reference in future tasks, building a library of capabilities over time and making development much more efficient.

While this idea is truly impressive given all the benefits it offers, there is also a challenge: running agent-generated code requires a secure execution environment with proper sandboxing, resource limits, and monitoring. So although this approach reduces token costs and lowers latency, it also introduces technical complexity that has to be managed; in effect, the solution asks you to trade token costs against infrastructure complexity.

To conclude, MCP as code can solve many of the problems we've been facing. While these ideas may feel new, they actually are not; they're standard practices that software engineers have been using for years. The key difference is that we're now extending these practices to AI agents, helping them work the way real development environments do.

Before you go, last month we ran our biggest server event yet: the AI Labs hackathon. We were absolutely blown away, receiving nearly 40 incredible projects. Before we get to the top three, we have to recognize a project that was pure innovation and deserved an honorable mention. That goes to one of our members for his project, Convo Lang, an ambitious AI-native programming language that seamlessly mixes prompting and procedural code. It's built to simplify creating complex AI agents, making things like tool calling, RAG, and custom reasoning easy to manage in one consistent language. It's a huge, incredibly impressive undertaking. Fantastic work.

Now, let's get into the top projects. Landing in third place is Emergency Contact Finder. This project tackles a super important real-world problem: it lets users create a unique QR code they can attach to anything, and when someone scans it, that person is immediately redirected to a designated emergency contact saved in the user's account. Simple, but brilliant for peace of mind.

Coming in at second place is Core Notes, a performance-first, offline-capable desktop app designed as a productivity hub with four different modes for different users. The standout is the entrepreneur mode, which packs features like AI idea generation, pitch structuring, and integrated financial tracking, all built to help founders move faster.

And now our first place winner. Let's have a big round of applause for Ignasia Sparkfinder, a powerful platform that automatically finds and validates product opportunities, or "sparks," from sources like Reddit. Using multi-model AI, it analyzes data and provides detailed scores for pain level, market viability, and confidence, effectively giving product builders a huge head start.

And that's it for our winners. A massive thank you to everyone who submitted a project; you made this hackathon our best yet. Congratulations again to our top four. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the Super Thanks button below. As always, thank you for watching, and I'll see you in the next one.
Summary
Anthropic introduced a solution to the context bloat caused by MCPs: treating them as code, which lets agents load only the necessary tools on demand and improves context efficiency, privacy, and control flow.
Key Points
- Anthropic identified that MCP tool definitions and results bloat the context window, reducing available space for user messages.
- They proposed a solution where MCPs are represented as code files in a folder structure, accessed via an index file.
- Instead of loading all MCPs at once, agents load only the required tool file when needed, enabling progressive disclosure.
- Tool call results can be filtered or aggregated before being exposed, preventing context bloat from large outputs.
- Using code for control flow allows better logic management, reducing hallucinations and sequential waiting between tool calls.
- This approach enables privacy-preserving operations by controlling what data is returned to the model.
- It supports state persistence and skills by saving code and results as files, enabling reuse across tasks.
- The method applies long-standing software engineering practices to AI agents, making them more efficient and scalable.
- Implementation requires secure sandboxing to run agent-generated code safely.
- The approach parallels Cloudflare's earlier 'Code Mode' and aligns with existing software development patterns.
Key Takeaways
- Replace MCPs with code files to reduce context window usage by loading only needed tools on demand.
- Use an index file to manage available tools and enable agents to discover and access them dynamically.
- Filter or aggregate tool call results before exposing them to the model to prevent context bloat.
- Implement logic in code rather than relying on the model to manage complex control flows.
- Design code to return only non-sensitive data to preserve privacy and avoid exposing confidential information.
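To ground the last two takeaways, here's one more hedged sketch of code-managed control flow with privacy masking; the `callMCPTool` helper, the user fields, and the masking rule are all hypothetical:

```typescript
// Assumed harness helper, as in the earlier sketches.
declare function callMCPTool(name: string, args: object): Promise<any>;

interface User { id: string; plan: string; email: string }

// Hypothetical wrapper around an MCP database tool.
async function getUser(id: string): Promise<User> {
  return callMCPTool("db__get_user", { id });
}

// Control flow lives in code: the branching happens here, so the
// model never sees intermediate results or has to chain tool calls.
async function reportPlan(userId: string): Promise<void> {
  const user = await getUser(userId);

  // Privacy: expose only non-sensitive fields; mask the email so
  // the raw address never enters the model's context.
  const masked = user.email.replace(/^[^@]+/, "***");
  if (user.plan === "enterprise") {
    console.log(`User ${user.id} (${masked}) gets priority support.`);
  } else {
    console.log(`User ${user.id} (${masked}) is on the ${user.plan} plan.`);
  }
}

reportPlan("USER_ID_GOES_HERE");
```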