Vercel Just Revealed Claude Code's Greatest Advantage
Ever since models started getting powerful, people have been building really cool products, integrating models into them, and solving lots of problems for us. But these systems consume a lot of tokens, especially if you're integrating a model through an API. The solution is much simpler than you might think. The best architecture isn't some extreme pipeline or heavily scaled tuning, but an old philosophy at the heart of Unix-based systems: everything is a file. Now, I know Unix wasn't talking about model costs, it was talking about devices and files, but surprisingly enough, the exact same principle solves this high-cost problem. And this is exactly what a software engineer at Vercel talks about.

Before we explore why files are the solution, let's understand a few things about how these models actually work. Models have been trained on massive amounts of code. This is exactly why they're so good at understanding code, directory structures, and the native bash commands developers use to navigate files and find what they need. When an agent uses grep and ls, it's not doing something new; it's doing something it already knows how to do, just in a more controlled way. And this approach isn't limited to code: agents can navigate any directory containing anything, code or not, because they're already comfortable with these commands and understand file systems.

Whenever an agent needs something, it looks around the file system using native bash commands like ls and find. Once it locates the right file, it searches for relevant content within it using pattern matching with grep and cat. Only a small, relevant slice of information is sent to the model while the rest stays out of memory, keeping the context window clean. That means we're not burning tokens on irrelevant data the model doesn't need. Using this approach, the agent returns a structured output. The pattern worked so well that Vercel ended up open-sourcing a bash tool built specifically around it, giving agents the ability to explore file systems the same way a developer would.

When building large language model systems, there are two common ways of getting the right information to the model: either through a detailed system prompt, hoping the agent actually follows it, or by loading a lot of data into a vector database and using semantic search to extract it. Each approach has limitations. System prompts have a limited token window, which caps how much information we can send to the model at a time. To handle larger data sets, we use semantic search, which finds information by matching meanings to the query. But vector search is built for semantic similarity rather than exact lookups: it returns chunks of data that match the general context of the query, not necessarily the specific value we're looking for, and it leaves extracting the right content from all those chunks to the model itself.

File systems offer a different approach. With a file system, the structure actually maps to your domain: the relationships encoded in the folder hierarchy mirror the relationships in your data. You don't have to flatten those relationships into model-readable vector chunks, which helps avoid losing connections that are usually dropped in semantic search. These hierarchical connections are preserved naturally, maintaining the organizational logic that already exists in your data.
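To make this concrete, here's a minimal sketch, in TypeScript, of the kind of command runner you could hand an agent: it whitelists the four read-only commands, confines every path-like argument to a single documents directory, and truncates the output so only a small slice ever reaches the model. Everything here (the DOCS_ROOT path, the runAgentCommand name, the 4,000-character cap) is my own illustration, not Vercel's open-sourced tool.

```ts
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import path from 'node:path';

const exec = promisify(execFile);

// Hypothetical root: in the video's setup this is the company-documents folder.
const DOCS_ROOT = path.resolve('./documents');

// Only read-only navigation and search commands are exposed to the agent.
const ALLOWED = new Set(['ls', 'find', 'grep', 'cat']);

export async function runAgentCommand(cmd: string, args: string[]): Promise<string> {
  if (!ALLOWED.has(cmd)) throw new Error(`Command not allowed: ${cmd}`);

  // Reject any argument that resolves to a path outside the documents folder.
  for (const arg of args) {
    if (arg.startsWith('-')) continue; // flags such as -ri or -l are fine
    const resolved = path.resolve(DOCS_ROOT, arg);
    if (resolved !== DOCS_ROOT && !resolved.startsWith(DOCS_ROOT + path.sep)) {
      throw new Error(`Path escapes the sandbox: ${arg}`);
    }
  }

  // Run inside the docs folder so relative paths stay confined,
  // and cap the output so one huge file can't flood the context window.
  const { stdout } = await exec(cmd, args, { cwd: DOCS_ROOT });
  return stdout.slice(0, 4000);
}
```

An in-process check like this is only a first layer; we'll come back to the stronger sandbox options later.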
Another advantage is that retrieval is precise, because grep and the other bash tools return exact matches. Unlike vector search, which returns every chunk that loosely matches the query and leaves the model to decide which one to use, you get only the value you asked for. Context stays minimal because the agent receives the specific chunk it needs and the other chunks never enter memory, so it stays aligned and focused on the exact piece of information without getting lost in unrelated data.

Now, this idea isn't something you're unfamiliar with. It's already used inside Claude Code and other CLI agents, which use bash functions to narrow down findings with pattern matching. We've been using the file system and Claude Code's capabilities for research purposes for every idea we evaluate. We usually pass any software tool we come across through a pipeline with multiple phases, each with its own evaluation criteria the research must satisfy. All of this is defined in a markdown file containing the requirements and objectives of the tool we're testing, how to write the final document, and the information required for each phase. We also give Claude a few sample documents that act as a guide for style matching, and the final document is saved in a research-results folder. To steer the research, we have a claude.md file explaining how to pass the idea through each phase one by one, ultimately producing a research document that meets all our checks.

Whenever we have anything to research, I just go to Claude and tell it the idea or the tool to research. It runs the idea through the six-phase validation process: first understanding the tool or idea, then passing it through each phase one by one. Once the idea has cleared all the phases, Claude generates a final report we can read to verify whether the idea has potential. This file-system approach saves us a lot of time by automating a research process we would otherwise have to do step by step. If you want to try this pipeline for your own use case, you can get a ready-to-use template for building a research pipeline like ours in our recently launched community, AI Labs Pro. For this video and all the previous ones, you get ready-to-use templates, prompts, and all the commands and skills you can plug directly into your projects. If you found value in what we do and want to support the channel, this is the best way to do it. Links in the description.

I was going through Vercel's case study, in which they explain how to build a sales-summary agent using this architecture. They've also open-sourced it, but it gave me a really interesting idea I wanted to try on my own. I was building a company-policy project where I had a lot of company data as JSON, markdown, and txt files, all separated by department. Normally I would have implemented this with a vector database like Chroma, but I decided to give this tool a shot. On the back end, I pointed the agent at the folder containing the company's documents and gave it access to the ls, cat, grep, and find commands, along with a guide on how to use the tool and when to use each command. I used the Gemini 2.5 Flash model, provided it with Vercel's bash tool, and gave it the path to the documents inside the tool.
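If you want to sketch a similar agent yourself, the wiring could look roughly like this using the Vercel AI SDK's generateText and tool helpers with a Gemini model. The option names below match AI SDK v4 (parameters, maxSteps; newer releases renamed some of these), and runAgentCommand is the hypothetical sandboxed runner from the earlier sketch, not Vercel's actual bash tool.

```ts
import { generateText, tool } from 'ai';
import { google } from '@ai-sdk/google';
import { z } from 'zod';
import { runAgentCommand } from './run-agent-command'; // the sketch from earlier

const result = await generateText({
  model: google('gemini-2.5-flash'),
  system:
    'You answer questions about company policy. Use ls to list documents, ' +
    'find to locate files, grep to search for keywords, and cat to read small files. ' +
    'Quote only the relevant lines in your answer.',
  prompt: 'How many paid leave days do employees get per year?',
  maxSteps: 8, // let the model chain ls -> grep -> cat before answering
  tools: {
    bash: tool({
      description:
        'Run a read-only command (ls, find, grep, cat) inside the documents folder.',
      parameters: z.object({
        command: z.enum(['ls', 'find', 'grep', 'cat']),
        args: z.array(z.string()).describe('Arguments, e.g. ["-ri", "leave policy", "."]'),
      }),
      execute: async ({ command, args }) => {
        console.log('tool call:', command, args); // log usage, as in the video
        return runAgentCommand(command, args);
      },
    }),
  },
});

console.log(result.text);
```

Logging each tool call, as above, is how you can watch the agent do what the video describes next: ls first to see what's there, then grep for the relevant terms.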
And so when I tested the agent by asking questions about the data, it answered based on the exact content of the company's policies, including the handbook and the leave-policy documents. To verify how it was working, I logged its tool usage in the terminal: the agent first used the ls command to see what documents were available, then used grep with pattern matching to look for "off days" or similar terms. That handful of commands handled the query and gave results with the same level of accuracy as a RAG system would. If you want the source code for this project, you can find it in our community, where you can download it and try it out for yourself.

Now, the first question that came to my mind while going through this tool was: is it really safe to equip agents to execute commands on the server? We literally saw a vulnerability in React Server Components this past December that scored a 10.0, the highest on the scale, and it involved code being executed on the server. So this is a really powerful and potentially dangerous capability to give agents. Why did I actually trust this tool, then? Because it runs in a sandbox with isolation: it only accesses the specific directory we provide and doesn't modify anything else. In the article, they also mention that the agent explores the files without access to the production system, so your production code remains safe even if the agent tries to run harmful commands on the server.

It provides two types of isolation. The first is an in-memory environment, in which the bash tool runs scripts only on the files it has access to, just like we did when creating our agent. The second is a full sandbox environment offering virtual-machine isolation via the Vercel Sandbox. We can choose either based on our needs: the in-memory approach is lighter and faster for simple use cases, while full VM isolation is better when you need stronger security guarantees.

Even though this approach is really good for saving costs per model call, it's not the right approach for all kinds of problems. It's definitely not ideal if you need to match the meaning of words, because bash tools do exact matching; as we saw when we ran our agent, it used specific keywords to locate the required data. It's also not suitable for unorganized file structures, where the agent would struggle through multiple tool calls; a structure the agent can easily navigate is much better. My personal suggestion: use the bash tool when you have highly structured data and your requests are mostly clear in terms of what you want, and use RAG when you care more about the meaning of what's written in the files or when your queries are likely to be messy.

Before we wrap up, here's a word from our sponsor, Brilliant. The best engineers don't just know syntax; they break down problems from first principles. That's why we've partnered with Brilliant. Their philosophy is that you learn best by doing, so they prioritize active problem solving and get you hands-on with concepts instead of just memorizing. For example, in their course How AI Works, you don't just read, you manipulate the actual logic. You'll get hands-on with technicalities like calculating loss in the loss space and visualizing interpolation, building a deep intuition you just can't get from a video lecture. Through their interactive technical courses, you get the most effective way to truly master the concepts we talk about.
You'll also get 20% off an annual premium subscription, unlocking their entire catalog of math, data, and CS courses and giving you a complete road map to upskill. Click the link in the description or scan the QR code on your screen to claim your free 30-day trial. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the Super Thanks button below. As always, thank you for watching, and I'll see you in the next one.
Summary
The video explains how using file systems and native bash commands like ls, grep, and find can significantly reduce token usage and improve efficiency in AI agents, offering a more precise and cost-effective alternative to vector databases for structured data retrieval.
Key Points
- The main thesis is that file systems provide a powerful, low-cost architecture for AI agents by leveraging structured data and native command-line tools.
- Agents use commands like ls, grep, and find to navigate file systems and retrieve exact information, minimizing irrelevant data sent to the model.
- This approach preserves hierarchical relationships between files and avoids the semantic ambiguity of vector search.
- The method reduces token consumption by only sending relevant data to the model, keeping context windows clean.
- Vercel open-sourced a bash tool that enables agents to explore file systems like developers do, improving accuracy and efficiency.
- The approach is ideal for structured data where queries are precise and relationships between files are meaningful.
- It's not suitable for unstructured data or meaning-based searches, where vector databases are better.
- Security is addressed through sandboxing and isolation, preventing harmful actions on production systems.
- The technique is already used in tools like Claude Code and CLI agents for research and data retrieval.
- The video provides a template for building a research pipeline using this approach, available in the AI Labs Pro community.
Key Takeaways
- Use file systems and native bash commands (ls, grep, find) to efficiently retrieve specific data for AI agents.
- Implement a structured file system to preserve relationships and enable precise data retrieval.
- Reduce model token usage by only sending relevant data to the model, avoiding context bloat.
- Choose this approach for structured data and exact queries; use vector databases for meaning-based or unstructured searches.
- Use sandboxed environments to safely run agents that execute commands on files without risking system security.