
GitHub has a fake star problem…
GitHub fundamentally changed how we think about software: it gave developers a platform for hosting source code, fostering contributions, and building open-source communities. Now it faces a significant crisis of trust. The star count, a metric widely used to gauge a project's trustworthiness, adoption, and overall popularity, has been compromised by widespread fake stars. This is particularly alarming for venture capitalists (VCs) who invest millions partly on the strength of star counts, a simple signal that can no longer be trusted.
Awesome Agents, an investigative group, has uncovered what they believe to be over 6 million fake stars on GitHub, linked to a VC funding pipeline that uses this manipulated popularity as proof of traction. This situation exemplifies Goodhart's law: "when a measure becomes a target, it ceases to be a good measure." The implications are profound for open source as a whole and for how VCs approach investments in open-source technology. The fear is that this manipulation will accelerate capital leaving the open-source world.
The problem is evident in charts showing new projects, like "Open Claw," rapidly surpassing established ones like React and Linux in star counts, often with an unnaturally vertical growth curve. This shift is partly driven by a new demographic of "normies" on GitHub who are less interested in code and more interested in ready-to-run applications, exemplified by Reddit posts demanding .exe files instead of installation commands. This broader user base gives new projects a larger potential reach, attracting VCs looking for breakout trends, especially in the AI space. These VCs, often lacking deep technical understanding, rely on superficial metrics like GitHub stars, making them susceptible to manipulation.
Awesome Agents' investigation claims that GitHub stars can be bought for as little as 6 cents each. The math is stark: manufacturing the median 2,850 stars needed for a seed round costs $85-$285, while hitting the 4,980 stars for a Series A round costs $1,000-$4,500. Considering these rounds can raise $1 million to $10 million, the return on investment for star manipulation is astronomical (3,500x to 117,000x). Thousands of repositories are exploiting this system.
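The arithmetic behind those ROI figures can be reproduced directly. A minimal sketch, using the median star counts cited above and the 3–10 cents per star pricing of disposable accounts described later in the piece:

```python
def star_campaign_roi(stars_needed, price_low, price_high, raise_low, raise_high):
    """Return (cost range, (worst, best) ROI multiples) for a fake-star campaign."""
    cost_low = stars_needed * price_low    # cheapest disposable-account tier
    cost_high = stars_needed * price_high
    roi_worst = raise_low / cost_high      # smallest raise, priciest stars
    roi_best = raise_high / cost_low       # biggest raise, cheapest stars
    return (cost_low, cost_high), (roi_worst, roi_best)

# Median seed-round star count per Redpoint's data: 2,850.
# Seed rounds here assumed to raise $1M-$10M, stars at $0.03-$0.10 each.
costs, rois = star_campaign_roi(2850, 0.03, 0.10, 1_000_000, 10_000_000)
# costs is roughly ($85, $285); rois roughly (3,500x, 117,000x)
```

Even at the premium end of the star market, the cost of manufacturing seed-round traction is a rounding error next to the capital it can unlock.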
This shadow economy is mature and professionalized, operating in plain sight. It was initially exposed by a peer-reviewed study from researchers at Carnegie Mellon, North Carolina State University, and Socket. Their tool, StarScout, analyzed 20 terabytes of GitHub metadata and identified approximately 6 million suspected fake stars across 18,600 repositories and 301,000 accounts between 2019 and 2024. The problem accelerated dramatically in 2024: by July, over 16% of all repositories with 50+ stars had been involved in fake star campaigns, up from nearly zero before 2022. GitHub itself effectively recognized these as illegitimate, since 90% of flagged repositories and 57% of flagged accounts had been deleted by January 2025.
AI and Large Language Model (LLM) repositories emerged as the largest non-malicious category receiving fake stars, surpassing blockchain and cryptocurrency projects. This includes academic paper repositories and LLM-related startups. Critically, 78 repositories with detected fake star campaigns appeared on GitHub trending, proving that purchasing stars can game discovery.
Dagster's 2023 research further confirmed this by purchasing stars from two vendors. A premium vendor, GitHub24, charged €85 per 100 stars and delivered reliably. A budget service sold 1,000 stars for $64, though only 75% persisted. This ecosystem includes dedicated websites, freelance platforms, exchange networks, and underground channels. Different tiers of fake accounts exist, from disposable ones (3-10 cents per star) to premium aged accounts (80-90 cents per star) with slower, more natural delivery. Fiverr also hosts active gigs for GitHub promotion. Star exchange platforms like GitHub Starmate and Safe Star Exchange allow mutual starring through credit systems.
Beyond stars, tools like fake git history, commitbot, and committer fabricate GitHub contribution graphs. Pre-built GitHub profiles with 5-year commit histories and Arctic Code Vault badges sell for around $5,000 on Telegram. Vendors offer replacement guarantees for non-drop stars that evade GitHub's detection. Social Plug claims 3.1 million stars delivered to 53,000 clients, offering an API for programmatic purchasing. WeChat groups with over a thousand members process 20+ repositories, generating $3.4 million to $4.4 million annually in promoter profits.
GitHub's internal moderation capabilities are severely lacking. The platform, despite aiming to be a community hub, offers no comparable tools to manage spam, ban users, or control communication channels, unlike platforms such as Twitch. This absence of moderation tools makes it unsurprising that star-spamming platforms thrive, as GitHub primarily functions as a code hosting service rather than a well-managed community platform.
Analysis of fake stargazers reveals distinct patterns. Organic repositories are starred by seasoned developers with public repositories and followers. Manipulated repositories, particularly in the blockchain and AI sectors, show a high percentage of "ghost accounts" with zero public repositories, zero followers, and no bio, even if the accounts are aged. For example, Union Labs, ranked number one in Runa Capital's ROSS index, had 47% suspected fake stars, with 32.7% zero-repo accounts and 52% zero-follower accounts.
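The ghost-account signal is straightforward to operationalize once stargazer profiles have been collected. A sketch, assuming the profiles are dicts whose field names (`public_repos`, `followers`, `bio`) follow the GitHub REST API's user object; the classification rule mirrors the zero-repo/zero-follower/no-bio pattern described above:

```python
def ghost_share(profiles):
    """Fraction of stargazer profiles that look like 'ghost' accounts:
    zero public repositories, zero followers, and no bio."""
    if not profiles:
        return 0.0
    ghosts = sum(
        1 for p in profiles
        if p.get("public_repos", 0) == 0
        and p.get("followers", 0) == 0
        and not p.get("bio")  # None or empty string both count as missing
    )
    return ghosts / len(profiles)

# Illustrative sample: two seasoned developers, two ghosts.
stargazers = [
    {"public_repos": 12, "followers": 34, "bio": "Backend dev"},
    {"public_repos": 0, "followers": 0, "bio": None},
    {"public_repos": 0, "followers": 0, "bio": ""},
    {"public_repos": 3, "followers": 0, "bio": None},
]
```

Note that account age alone is not exculpatory: as the Union Labs example shows, aged accounts can still be ghosts, which is why the rule keys on activity and social signals rather than creation date.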
Key metrics for detection include the fork-to-star ratio and the watcher-to-star ratio. Healthy projects like Flask have a fork-to-star ratio of around 0.235 (235 forks per thousand stars). Manipulated projects, like those in the blockchain world, show far lower ratios (e.g., Shared at 0.022, or 22 forks per thousand stars). Similarly, the watcher-to-star ratio, which indicates how many people actively watch for updates, is much lower in manipulated projects. For instance, "Free Domain" has a ratio of 0.001, meaning only one person watches for every thousand who starred the repo.
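Both ratios can be computed from a single repository record. A sketch, assuming field names from the GitHub REST API repository object (`stargazers_count`, `forks_count`, and `subscribers_count`, which is what the API calls watchers); the 0.05 cutoff is the researchers' rule of thumb cited later for projects above 10,000 stars:

```python
def engagement_ratios(stargazers_count, forks_count, subscribers_count):
    """Return (fork-to-star, watcher-to-star) ratios for a repository."""
    stars = max(stargazers_count, 1)  # avoid division by zero on empty repos
    return forks_count / stars, subscribers_count / stars

def looks_suspicious(stargazers_count, forks_count):
    """Flag large repos whose fork-to-star ratio falls below 0.05."""
    fork_ratio, _ = engagement_ratios(stargazers_count, forks_count, 0)
    return stargazers_count >= 10_000 and fork_ratio < 0.05

# A Flask-like profile (ratio ~0.235) passes; a 0.022-ratio repo is flagged.
```

The heuristic is deliberately one-sided: a low ratio warrants scrutiny, but a healthy ratio does not by itself prove organic adoption, since forks and watchers can in principle be bought too.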
Open-source AI projects like Raga AI and Open IFM also show clear manipulation signals, with high percentages of zero-follower and ghost accounts, and low median account ages for stargazers. Langflow, while initially flagged, showed improved metrics, but its low fork-to-star ratio still suggests less genuine adoption. Hermes Agent, despite some accusations, appears relatively organic, with a higher median age for stargazers and better fork-to-star ratios.
The connection between GitHub stars and startup funding is explicit. Jordan Segall of Redpoint Ventures documented that the median GitHub star count was 2,850 at seed financing and 4,980 at Series A. VCs use internal scraping programs to identify fast-growing GitHub projects, with stars being the most common metric. Runa Capital's ROSS index, which ranks open-source startups by star growth, is widely cited by VCs. GitHub itself, through its GitHub Fund, invests in open-source companies partly based on platform traction. However, some examples cited as star-to-funding pipelines, like Lovable and Browserless, are misleading: their funding was driven by exceptional revenue growth or high demand for their technology, not GitHub stars alone. Investors involved in these rounds confirm that GitHub stars were rarely a primary discussion point.
The incentive loop is self-reinforcing: VCs use stars, startups manipulate them, VCs see inflated traction, leading more VCs to track stars, and more startups to manipulate them. The AI sector's vulnerability is particularly high due to extreme hype, crypto-adjacent funding models, and an ecosystem prone to fabricated personas. GitHub's enforcement is reactive and asymmetric, removing flagged repositories but leaving many fake accounts intact, thus preserving the "labor force" of the fake star economy.
Researchers recommend GitHub adopt weighted popularity metrics based on network centrality rather than raw star counts, which would undermine the fake star economy. VCs should track unique monthly contributor activity, package downloads, issue quality, contributor retention, community discussion depth, and usage telemetry instead. A fork-to-star ratio below 0.05 for projects with over 10,000 stars warrants scrutiny.
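One way to read "weighted popularity metrics based on network centrality" is to weight each star by the stargazer's own standing rather than counting every star as one. A toy sketch, where a log-follower weight stands in for a real centrality measure (such as PageRank over the follow graph); the weighting function is my illustrative assumption, not the researchers' proposal:

```python
import math

def weighted_stars(stargazer_followers):
    """Each star counts log2(1 + followers) instead of 1, so stars from
    well-connected accounts dominate and zero-follower ghosts count nothing."""
    return sum(math.log2(1 + f) for f in stargazer_followers)

# 1,000 purchased ghost stars (0 followers each) contribute a weight of 0,
# while 50 stars from accounts with 127 followers contribute 50 * 7 = 350.
ghost_campaign = weighted_stars([0] * 1000)
organic_niche = weighted_stars([127] * 50)
```

Under any scheme of this shape, a bulk-bought campaign of ghost accounts moves the metric barely at all, which is precisely why it would undermine the economics of the fake star market.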
This problem extends beyond GitHub. NPM downloads are trivially inflatable, and VS Code marketplace extensions are similarly vulnerable to fake installs. Social media platforms like X (Twitter) are used to amplify artificial GitHub virality through engagement pods and AI-generated content spam.
The legal ramifications are becoming clearer. The FTC's consumer review rule (effective October 21, 2024) prohibits selling or buying fake indicators of social media influence for commercial purposes, with penalties up to $53,000 per violation. The SEC has also charged CEOs with wire fraud and securities fraud for inflating metrics to deceive investors, with potential prison sentences. While no one has been charged specifically for fake GitHub stars yet, it may only be a matter of time.
This problem is also rampant in the tech YouTube space, where creators inflate viewership numbers to secure sponsorships. Many channels show millions of views with disproportionately low likes and comments, a telltale sign of fake viewership. This deception harms legitimate creators who follow ethical disclosure practices and makes it harder for them to land brand deals, since sponsors are misled by inflated metrics.
GitHub's acceptable use policies prohibit inauthentic interactions, but enforcement is weak and reactive. They remove flagged repositories but leave many fake accounts intact. GitHub has not published transparent reports or detection methods. Until GitHub implements structural changes like weighted popularity metrics, account-level reputation scoring, or transparent enforcement, the gap between star counts and genuine developer adoption will continue to widen. The "star economy" is a $50 problem with a $50 million consequence, and until platforms, investors, and regulators catch up, the market will continue to pay the price.