
Anthropic just…wait what
AI Summary
The primary driver behind the recent changes and partnerships in the AI landscape, particularly those involving Anthropic and xAI, is a critical compute bottleneck. Theo's analysis, which points to compute rather than pricing power as the limiting factor, is borne out by Anthropic's struggle to meet unprecedented demand for its models, especially Claude. That surge blew past even Anthropic's ambitious projection of 10x annual growth, reaching roughly 80x in the first quarter alone and putting immense pressure on their compute resources.
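For scale, a quick back-of-the-envelope comparison of those two growth figures (assuming the projected annual rate compounds smoothly across quarters, which is a simplification):

```python
# Compare Anthropic's projected growth against observed demand,
# using the figures quoted above: ~10x annual projection vs. ~80x
# observed in a single quarter.
projected_annual = 10.0
observed_quarterly = 80.0

# A 10x year, compounded smoothly, implies roughly 10^(1/4) per quarter.
implied_quarterly = projected_annual ** (1 / 4)  # about 1.78x

print(f"Implied quarterly growth under the 10x plan: {implied_quarterly:.2f}x")
print(f"Observed quarterly growth: {observed_quarterly:.0f}x")
print(f"Demand outpaced the plan by ~{observed_quarterly / implied_quarterly:.0f}x per quarter")
```

Even without extrapolating forward, a single 80x quarter is more than 40 times the quarterly pace a 10x year implies.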
Anthropic's compute problem stems from a conservative approach to purchasing GPUs. While OpenAI aggressively acquired compute in anticipation of its scaling needs, Anthropic's leadership was more cautious, aiming to avoid overcommitment, and ended up with a shortfall. This has forced Anthropic to diversify its compute providers, running on a mix of Amazon's Trainium chips, Google's TPUs, and Microsoft Azure, all distinct from the NVIDIA GPUs that researchers favor for the CUDA ecosystem. The hypothesis is that Anthropic is offloading inference to non-NVIDIA hardware to free up NVIDIA capacity for research. Even that, however, hasn't been enough to meet demand.
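The offloading hypothesis amounts to a workload router. A minimal sketch of the idea follows; the backend names and routing policy are illustrative stand-ins, not Anthropic's actual scheduler:

```python
# Illustrative sketch: route inference to non-NVIDIA accelerators while
# reserving NVIDIA/CUDA capacity for research. Backend names and the
# routing policy are hypothetical, not Anthropic's real infrastructure.
BACKENDS = {
    "trainium": {"vendor": "amazon", "nvidia": False},
    "tpu": {"vendor": "google", "nvidia": False},
    "azure": {"vendor": "microsoft", "nvidia": False},
    "cuda-cluster": {"vendor": "nvidia", "nvidia": True},
}

def route(workload: str) -> str:
    """Send research jobs to the CUDA cluster; push inference elsewhere."""
    if workload == "research":
        return "cuda-cluster"
    # A real system would load-balance (round-robin, cheapest-first, etc.);
    # first non-NVIDIA match keeps the sketch simple.
    return next(name for name, b in BACKENDS.items() if not b["nvidia"])

print(route("research"))   # cuda-cluster
print(route("inference"))  # trainium
```

The point of the policy is simply that every inference request served off-NVIDIA is a CUDA GPU left free for researchers.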
The recent partnership between Anthropic and SpaceX/xAI, despite a history of animosity, underscores the severity of Anthropic's compute shortage. The alliance gives Anthropic access to SpaceX's substantial NVIDIA capacity, specifically the largely underutilized Colossus 1 data center. The deal provides Anthropic with over 300 megawatts of power and 220,000 NVIDIA GPUs, effectively most of Colossus 1, which was originally intended for xAI's Grok inference. The implication is that xAI, with its limited Grok user base, has an abundance of idle compute.
The partnership also sheds light on xAI's strategic objectives, particularly its bid for Cursor. xAI possesses significant compute power but lacks the data needed to train advanced models, especially for coding tasks. Twitter's data, while extensive, is too biased and noisy for nuanced model training. Acquiring Cursor is seen as a move to secure a vast corpus of high-quality data generated from user interactions with coding models, data that is invaluable for reinforcement learning and improving model behavior. Cursor, conversely, has the data but lacks the compute. The figures cited, $10 billion for Cursor's data alone or $60 billion if the entire company proves valuable, highlight the size of this data gap.
Anthropic's earlier ban on xAI using its models within Cursor was a direct attempt to keep xAI from accumulating this valuable training data. That move strained Anthropic's relationship with Cursor, and it likely precipitated Elon Musk's decision to partner with Anthropic, a strategic maneuver aimed at countering OpenAI.
The compute crisis has also hit end users of Anthropic's products, particularly Claude Code. Recent changes, such as the potential removal of Claude Code from the Pro plan, were aimed less at increasing revenue than at freeing up compute. Conversely, the introduction of higher usage limits for Claude and doubled 5-hour rate limits for Claude Code reflects the new compute capacity from the SpaceX deal. The benefits are nuanced, though: users with short, bursty coding sessions will see improvements, while those running parallel agents or tools that keep compute busy continuously may not, especially if they were already hitting weekly limits.
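The bursty-versus-continuous distinction falls out of how rolling-window limits behave. A generic sketch (not Claude Code's actual accounting):

```python
from collections import deque

class RollingWindowLimiter:
    """Generic rolling-window usage limiter (illustrative only; not
    Anthropic's implementation). Usage events expire once they fall
    outside the window, so bursty sessions recover capacity quickly
    while continuous usage stays pinned at the cap."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # timestamps of recorded usage

    def allow(self, now: float) -> bool:
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return True
        return False

# Doubling `limit` immediately helps a bursty user...
limiter = RollingWindowLimiter(limit=2, window_seconds=5 * 3600)
print(limiter.allow(0.0), limiter.allow(1.0), limiter.allow(2.0))
# prints: True True False
# ...but a separate weekly cap would still bind a continuous user.
```

A user who fills the 5-hour window and then stops regains headroom as events expire; a parallel-agent workflow that never stops simply hits the next cap up.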
API rate limits for enterprise customers have also increased substantially, particularly for input tokens, which had been the bottleneck limiting concurrent requests. Tier 1 input limits have jumped from 30,000 to a much higher (unspecified) figure, and output limits have been boosted significantly across tiers. This expansion matters most to enterprise clients who need robust, scalable API access.
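Higher tiers reduce, but don't eliminate, rate-limit rejections, so API clients typically pair them with exponential backoff. A generic client-side sketch; the `request` callable and `RateLimitError` are placeholders, not the Anthropic SDK:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response from an API."""

def call_with_backoff(request, max_retries=5, base_delay=0.01):
    """Retry a rate-limited call with exponential backoff plus jitter.
    `request` is any zero-argument callable that raises RateLimitError
    when throttled; this is a generic client pattern, not a specific
    vendor SDK."""
    for attempt in range(max_retries):
        try:
            return request()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Double the delay each attempt; jitter avoids thundering herds.
            delay = base_delay * (2 ** attempt) * (1 + 0.1 * random.random())
            time.sleep(delay)

# Example: a request that is throttled twice before succeeding.
attempts = {"n": 0}
def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

print(call_with_backoff(flaky_request))  # ok
```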
Model performance also varies noticeably across cloud providers. Amazon consistently delivers the fastest inference, often exceeding 80 tokens per second, while Anthropic's own hosting and Google's TPUs are solid but slower. Curiously, Anthropic models perform nearly identically on Azure and on Anthropic's official hosting, prompting speculation that Azure may be proxying requests to Anthropic's own infrastructure rather than hosting the models directly; other providers running models on their own NVIDIA hardware show clearly distinguishable performance profiles.
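Throughput comparisons like these come from timing streamed tokens. A minimal measurement sketch; the stream here is simulated rather than a real provider endpoint:

```python
import time

def tokens_per_second(token_stream):
    """Decode throughput: tokens received divided by elapsed wall-clock
    time. Works on any iterable of tokens (e.g. a streaming response)."""
    start = time.perf_counter()
    count = sum(1 for _ in token_stream)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")

def fake_stream(n=100, per_token_delay=0.001):
    """Simulated stream: n tokens at ~1 ms apiece, standing in for a
    real provider's streaming response."""
    for i in range(n):
        time.sleep(per_token_delay)
        yield f"tok{i}"

print(f"{tokens_per_second(fake_stream()):.0f} tokens/sec")
```

In a real benchmark the first token's latency (time-to-first-token) is usually reported separately from the steady-state decode rate measured here.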
The competitive landscape is intensifying, and OpenAI poses the most significant threat. OpenAI's impending support for AWS erodes an advantage Anthropic previously held through its Amazon Bedrock integration, and OpenAI's progress on code generation with Codex is challenging Anthropic's historical dominance in that area.
The Anthropic-xAI partnership, despite the companies' past conflicts, is a testament to the strategic imperative of securing compute. xAI has compute and is now moving to acquire data through Cursor. Anthropic has world-class researchers and data but critically lacks compute. OpenAI stands as the only player currently holding all three essentials: research, data, and compute. The desperation to compete, particularly against OpenAI, is driving these unconventional alliances, in which past grievances are set aside for strategic advantage.
Elon Musk's statement about leasing Colossus 1 to Anthropic while SpaceX moves training to Colossus 2 is significant. Colossus 1, a massive data center with 442 megawatts of power and 280,000 H100 equivalents, was primarily allocated to Grok inference. Given Grok's limited user base, a substantial portion of that capacity, over 300 megawatts and 220,000 GPUs, has now been leased to Anthropic. This suggests xAI needs only a small fraction of Colossus 1 for its own inference, underscoring how underutilized the compute was. Colossus 2, an even larger facility with 1.5 gigawatts of power and 1.4 million H100 equivalents, is expected to bolster SpaceX's compute capabilities further.
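The lease proportions can be checked directly from the figures quoted above:

```python
# Colossus 1 as quoted: 442 MW and 280,000 H100 equivalents in total;
# ~300 MW and 220,000 GPUs leased to Anthropic.
total_mw, total_gpus = 442, 280_000
leased_mw, leased_gpus = 300, 220_000

gpu_share = leased_gpus / total_gpus    # about 0.79
power_share = leased_mw / total_mw      # about 0.68

print(f"GPUs leased:  {gpu_share:.0%} of Colossus 1")
print(f"Power leased: {power_share:.0%} of Colossus 1")
# xAI retains only ~60,000 GPUs (~21%) for Grok inference, which is
# consistent with the underutilization claim above.
print(f"Retained GPUs: {total_gpus - leased_gpus:,}")
```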
The core insight is that compute is a long-lead-time acquisition. Companies like Musk's, which invested early in compute before widespread user demand materialized, are now positioned to leverage the surplus. Conversely, Anthropic, riding unexpected exponential growth, faces a compute deficit. That dynamic creates a "match made in heaven": desperation for compute drives unlikely partnerships, all in pursuit of closing gaps and competing in a rapidly evolving AI arena. The willingness to set aside personal animosities in order to take on OpenAI underscores the high stakes and the critical role of compute in achieving AI dominance.