
Bash is bad for agents
AI Summary
The current landscape of AI agents interacting with our systems is a significant leap from the past. Previously, AI models could only generate text: users had to manually copy and paste commands into their terminals and rely on cumbersome compression techniques to give the model context about their codebase. Today, tools like Cursor, Claude Code, and T3 Code enable AI agents to interact with our systems directly, primarily through bash commands. These agents can read code, apply changes, and execute various operations, leveraging the ubiquity of bash for system control.
However, the speaker argues that relying solely on bash is insufficient for the future of AI agents. While bash has been a crucial stepping stone, it represents just one component of a more advanced interaction model. The core of an AI's capability lies in its ability to generate text, acting as a sophisticated autocomplete system. This process is heavily influenced by its "context window"—the amount of information it can process at once. The way text is tokenized, broken down into smaller units, significantly impacts how well the model understands and processes information. Newer tokenization methods are more efficient, especially for code, by grouping related elements, thus reducing the number of tokens and improving the model's performance.
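The effect of tokenization granularity can be illustrated with a toy comparison. This is not a real model tokenizer; it is a minimal sketch contrasting a character-level split with a coarser split that keeps identifiers whole, roughly the direction modern code-aware vocabularies take:

```typescript
// Illustrative sketch only: comparing a naive character-level split against
// a coarser word/symbol split, to show why grouping related elements
// shrinks the number of tokens a model has to process.
const code = `function add(a: number, b: number) { return a + b; }`;

// Character-level "tokenization": one token per character.
const charTokens = [...code];

// Coarser "tokenization": keep identifiers and numbers whole, treat each
// punctuation mark as a single token, and group runs of whitespace.
const coarseTokens = code.match(/\w+|[^\s\w]|\s+/g) ?? [];

console.log(charTokens.length);   // many tokens
console.log(coarseTokens.length); // far fewer tokens for the same text
```

The same source text yields far fewer coarse tokens than characters, which is the efficiency gain the summary describes.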
A major challenge arises when trying to provide AI models with extensive context, such as an entire codebase. While intuitively one might think dumping the entire codebase into the context window would yield the best results, this is counterproductive. Large context windows lead to models becoming "dumber" and less effective due to the sheer volume of data to process. This is why tools that indiscriminately ingest entire codebases are detrimental, leading to increased costs, slower responses, and degraded output quality. The speaker criticizes such approaches, likening them to the worst possible way to code with AI.
Instead of overwhelming models with all available information, the key is to enable them to find the specific context they need. This is where bash commands become valuable, as they allow models to query and retrieve relevant information. The speaker highlights the deterministic nature of bash commands, which, unlike the inherent non-determinism of AI generation, provide consistent results. For instance, a bash command to search for a specific piece of code will reliably return the same result each time. This contrasts with AI, where the same prompt can yield slightly different outputs. The more tokens an AI processes, the closer its output drifts toward random chance, whereas precise bash commands offer a predictable path.
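That determinism can be made concrete with a small sketch. The `files` map below is a stand-in for a real filesystem, and `grep` mimics what a shell command like `grep -rn` does: scan content for a pattern and return the same matches every time, with no sampling involved:

```typescript
// A minimal sketch of deterministic retrieval: the same query over the same
// codebase always returns the same matches, unlike sampled model output.
// `files` is an illustrative stand-in for a real filesystem.
const files: Record<string, string> = {
  "auth.ts": "export function login(user: string) { /* ... */ }",
  "db.ts": "export function connect(url: string) { /* ... */ }",
};

// Roughly what a recursive grep does: scan every file for a pattern.
function grep(pattern: string): string[] {
  return Object.entries(files)
    .filter(([, content]) => content.includes(pattern))
    .map(([name]) => name);
}

// Every call returns the identical result.
console.log(grep("login")); // ["auth.ts"]
console.log(grep("login")); // ["auth.ts"] again, always
```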
The argument against bash as the ultimate solution stems from its limitations as an execution layer. While effective for retrieving context and applying changes, bash lacks standardization and sophisticated control mechanisms. Challenges arise in managing user states across different AI tools, sharing approval methods for commands, and implementing granular permission controls (e.g., auto-approving read-only operations while requiring approval for destructive actions). Bash, being a command-line interpreter, doesn't inherently support these structured interactions. The absence of standards for identifying destructive commands or managing permissions means each tool must reinvent these solutions, bloating context and hindering efficiency.
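The granular permission controls the section says bash lacks could look something like the following sketch. The command lists and the `classify` helper are hypothetical, invented for illustration; no such standard exists today, which is exactly the gap being described:

```typescript
// Hypothetical sketch of granular permission controls: auto-approve
// read-only commands, require human approval for destructive or unknown
// ones. The command lists are illustrative, not a standard.
type Verdict = "auto-approve" | "needs-approval";

const READ_ONLY = new Set(["ls", "cat", "grep", "find", "head", "tail"]);

function classify(command: string): Verdict {
  const binary = command.trim().split(/\s+/)[0];
  if (READ_ONLY.has(binary)) return "auto-approve";
  // Destructive commands (rm, mv, dd, ...) and anything unrecognized
  // fall through to explicit approval.
  return "needs-approval";
}

console.log(classify("grep -rn TODO src"));   // auto-approve
console.log(classify("rm -rf node_modules")); // needs-approval
```

Because bash has no shared convention like this, each agent tool currently re-implements its own version of it, which is the context bloat the speaker objects to.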
The future points towards a more structured and typed execution environment. TypeScript is presented as a promising alternative. Its ability to be executed in various isolated environments, such as V8 isolates, Node.js, or Cloudflare Workers, offers a safe and portable way for AI agents to interact with systems. This approach allows for better resource management and security, preventing agents from affecting other users' data or the underlying system. Cloudflare's "Code Mode" demonstrated this by converting API interfaces into TypeScript SDKs, enabling models to write code that interacts with these services more efficiently and with less context. This method significantly reduces token usage and improves response speed and accuracy by allowing code to perform filtering and operations deterministically.
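The idea can be sketched in the spirit of that "Code Mode" approach: instead of feeding a raw API response into the model's context, expose a typed SDK function and let the agent-written code filter deterministically. The `Issue` shape and the data below are hypothetical stand-ins, not a real service's API:

```typescript
// Sketch of the "let code do the filtering" idea. `listIssues` is a
// stand-in for a typed SDK call that would normally hit a real service.
interface Issue {
  id: number;
  title: string;
  open: boolean;
}

function listIssues(): Issue[] {
  return [
    { id: 1, title: "Fix login bug", open: true },
    { id: 2, title: "Update docs", open: false },
    { id: 3, title: "Crash on startup", open: true },
  ];
}

// The agent-written code filters deterministically, so only this small
// final result ever needs to enter the model's context window.
function openIssueTitles(): string[] {
  return listIssues()
    .filter((issue) => issue.open)
    .map((issue) => issue.title);
}

console.log(openIssueTitles()); // ["Fix login bug", "Crash on startup"]
```

The token savings come from the fact that the full issue list never passes through the model; only the filtered titles do.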
The speaker also discusses projects like "just Bash" and "just JS," which aim to provide virtualized bash or JavaScript/TypeScript execution environments. These solutions allow AI agents to interact with a simulated system, preventing unintended consequences and enhancing security. The idea of a TypeScript file configuring an agent's environment is particularly appealing, offering portability and team collaboration benefits.
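A TypeScript agent configuration might look something like the sketch below. The `AgentConfig` shape and every field in it are invented for illustration; no existing tool defines this exact format, but it shows why a typed, shareable config file is appealing:

```typescript
// Hypothetical sketch of a TypeScript file configuring an agent's
// environment. The `AgentConfig` type and its fields are invented for
// illustration, not any tool's real schema.
interface AgentConfig {
  model: string;
  sandbox: "v8-isolate" | "node" | "worker";
  // Commands the agent may run without asking.
  autoApprove: string[];
  // Paths the virtualized filesystem exposes; nothing else is visible.
  mount: string[];
}

const config: AgentConfig = {
  model: "some-model",
  sandbox: "v8-isolate",
  autoApprove: ["ls", "cat", "grep"],
  mount: ["./src", "./package.json"],
};

// Because this is ordinary TypeScript, it can be type-checked, versioned,
// and shared across a team like any other source file.
console.log(config.sandbox);
```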
Ultimately, the future of AI agent interaction with systems requires moving beyond the limitations of bash. The speaker emphasizes the need for typed environments with clear inputs and outputs, the ability to proxy calls, and cost-effective, portable, and well-isolated execution layers. While the exact solutions are still emerging, the direction points towards languages like TypeScript and secure execution sandboxes. The current era is characterized by rapid experimentation and innovation, with opportunities for individuals to contribute to shaping how AI agents will interact with our digital world. The speaker encourages developers to explore these emerging tools, identify problems, and contribute to building the future of AI-powered systems.