
They Stole AI With 24,000 Fake Accounts
AI Summary
Anthropic recently published a provocative blog post accusing three prominent Chinese AI companies—Deepseek, Moonshot, and Miniax—of conducting a massive, coordinated campaign to "steal" their AI capabilities. This operation involved approximately 24,000 fake accounts and 16 million exchanges. Anthropic has labeled this an "industrial-scale distillation attack" and framed it as a significant national security concern. To understand the gravity of these claims, one must first understand what model distillation is, how these specific companies allegedly carried it out, and the broader legal and ethical controversies surrounding the AI industry.
Model distillation is a technique used to create smaller, faster AI models by using a larger, more powerful model as a "teacher." In a standard training process, a company like Anthropic might spend hundreds of millions of dollars and several months training a massive model like Claude Opus. This requires scraping the entire internet, utilizing vast libraries of books, and employing massive GPU clusters. Distillation offers a shortcut. Instead of starting from scratch, a "student" model sends prompts to the "teacher" model and records the responses. Modern "thinking" models often provide a "chain of thought," showing the logic used to reach a conclusion. By capturing millions of these prompt-response pairs, the student model learns to mimic the teacher's reasoning and behavior at a fraction of the cost and time. While companies like Anthropic and OpenAI use distillation internally to create smaller versions of their own models—such as Claude Haiku or GPT-4o mini—the controversy arises when one company performs this process on a competitor’s model without permission.
According to Anthropic’s report from February 2026, the three Chinese firms used proxy networks and fraudulent accounts to bypass access restrictions. Deepseek allegedly engaged in over 150,000 exchanges, specifically targeting Claude’s reasoning capabilities. Notably, Anthropic claims Deepseek used Claude to generate "censorship-safe" responses for politically sensitive queries, effectively training their own models to steer conversations away from topics restricted by the Chinese government, such as authoritarianism or dissident leaders.
Moonshot AI was linked to 3.4 million exchanges, focusing on agentic reasoning, coding, and computer vision. Anthropic noted that Moonshot used hundreds of fraudulent accounts, but investigators were able to match some activity to the public profiles of senior Moonshot staff. The largest offender, Miniax, accounted for over 13 million exchanges. Anthropic observed that Miniax was so aggressive that when Anthropic released a new model, Miniax redirected half of its traffic to capture the new system's capabilities within just 24 hours.
However, there is a significant layer of irony in Anthropic's accusations. The transcript points out that Anthropic, like most major AI labs, built its models by scraping data from the internet without the explicit permission of the original creators. Anthropic has faced its own legal battles regarding this practice; in September 2025, the company settled a first-of-its-kind copyright suit with authors for $1.5 billion, paying roughly $3,000 for each of the 500,000 books used in its training data. They were also sued by Reddit for violating terms of service to scrape user data. This creates a "glass house" scenario: Anthropic is complaining that others are using its outputs without permission, yet its own models were built using data it did not initially have the rights to use. While some argue that scraping public data is "fair use" while distillation is a violation of "terms of service," the transcript suggests that the distinction is thin, as both involve taking others' hard work to build a competing product.
The implications of this conflict are three-fold. First, there is a major safety concern. When models are distilled, the rigorous safety guardrails and filters built into the original teacher model are often stripped away, leading to future generations of AI that may lack essential ethical protections. Second, there is a geopolitical dimension. Anthropic is using these attacks as evidence that the U.S. should maintain or tighten export controls on compute power. They argue that because Chinese labs are resorting to stealing capabilities rather than innovating independently, current restrictions are working. Finally, this situation highlights a massive legal gray area. Currently, the U.S. Copyright Office maintains that AI-generated content cannot be copyrighted because it lacks human authorship. Furthermore, most AI companies' terms of service state that the user owns the output. This raises the question: if the output isn't copyrightable and is owned by the user, is distillation actually "theft" in a legal sense, or simply a violation of a platform's rules?
Ultimately, the AI industry is currently operating on a foundation of "take first, ask permission never." While Anthropic’s claims of a systematic attack may be legitimate, they are seeking sympathy in an industry where everyone—including OpenAI, Meta, and Google—has been accused of similar data-scraping practices. The situation remains a messy conflict of power and legal maneuvering, leaving it unclear where the line for intellectual property will eventually be drawn in the age of artificial intelligence.