Overview
A Chinese proxy market selling Claude API access at roughly 10% of list price has surfaced. On the surface it looks like simple price arbitrage; one layer down it is a pipeline that bundles quality degradation with prompt data theft. What makes the case worth dwelling on is something else entirely — it is the sharpest illustration yet that answering the demand to “guarantee model performance” requires shifting the unit of conversation from mathematics to economics.
The structure of the case — what sat beneath the “it is cheap” signal
According to reporting by the Korea Management Journal, Claude API access was being resold on channels like GitHub, Telegram, and Taobao at about 90% off list price. The discount does not come from a legitimate supply chain. It comes from mass-generated free-trial accounts, subscriptions opened with stolen credit cards, a single Max-tier account ($200/month) subdivided across many users, and — the most insidious mechanism — model substitution: the user believes they are calling Claude Opus but actually receives responses from cheaper Haiku or an open-weight model.
The key number comes from an analysis of 17 proxy services by the CISPA Helmholtz Center for Information Security. The official API scored about 84% accuracy on a medical benchmark; routed through a proxy, that fell to roughly 37%. Same price tag, same API shape, less than half the real performance.
And then the deeper layer — data theft. Proxy operators collect users’ prompts, model responses, and chain-of-thought reasoning traces, and repackage them as training datasets. Oxford China Policy Institute researcher Zhilan Chen calls this the “API Proxy Economy.” Anthropic reported detecting roughly 24,000 fraudulent accounts that generated more than 16 million queries in February 2026, and has accused DeepSeek of using thousands of fraudulent accounts to generate millions of conversations with Claude to train its own models.
Why a mathematical guarantee was never possible
The demand to “guarantee model performance 100%” feels intuitively reasonable. But LLM output is inherently stochastic. Temperature sampling, context dependence, the residual probability of hallucination — no single model can mathematically prove an accuracy of 1.0 on arbitrary input. A benchmark score is an estimate over a distribution, not a warranty. A 90% on MMLU means “on this dataset’s distribution, roughly one in ten is wrong,” not “your next question will be right.”
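The point that a benchmark score is an estimate over a distribution, not a warranty, can be made concrete with a quick interval calculation. A minimal sketch, assuming a hypothetical 1,000-question benchmark; the sample size is invented for illustration:

```python
import math

def wilson_interval(correct: int, total: int, z: float = 1.96):
    """95% Wilson score confidence interval for an observed accuracy."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return center - half, center + half

# A "90% benchmark score" observed on a hypothetical 1,000-question set:
lo, hi = wilson_interval(900, 1000)
print(f"point estimate 0.900, 95% CI ({lo:.3f}, {hi:.3f})")
# The interval describes uncertainty over the dataset's distribution;
# it says nothing about whether *your next query* will be answered correctly.
```

Even the interval only bounds average behavior on that distribution; no amount of sampling turns it into a per-query guarantee.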
This case weaponizes exactly that gap. Proxy users believed they bought an 84% model and received a 37% one, and had no way to measure the difference themselves. Any attempt to define “performance” mathematically and then have it guaranteed breaks down in two places. First, the object of the guarantee (the whole distribution) is not what the user cares about (my next query). Second, once the model is swapped somewhere mid-supply chain, the number the user measures is itself no longer trustworthy. Mathematics works on the model card; it does not work on the supply chain between the model card and the user.
What changes when the unit becomes economics
If mathematics asks “how accurate is this model,” economics asks “who loses how much when this model is trusted and turns out wrong, and how is that risk priced.” That question fits the 90% discount case far better.
The discount, read as expected value. A price at 10% of list is not a free lunch — it is one variable in an expected-value calculation. Against the 90% in saved cost sits the cost of decision errors from accuracy cut in half, the strategic loss of prompts flowing into a competitor’s training set, and the industrial-espionage risk of source code, API keys, and credentials being exposed to unverified servers. In the language of economics, the “90% discount” is not a price — it is a debt that defers hidden costs into the future.
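Read literally, the paragraph above is an expected-value calculation. A toy version, with every number invented for illustration (monthly query volume, per-error damage, and leak cost are assumptions; the error rates echo the article's 84% vs. 37% accuracy figures):

```python
def monthly_expected_cost(api_fee: float,
                          queries: int,
                          error_rate: float,
                          cost_per_error: float,
                          leak_prob: float = 0.0,
                          leak_cost: float = 0.0) -> float:
    """Fee + expected decision-error cost + expected data-leak cost."""
    return api_fee + queries * error_rate * cost_per_error + leak_prob * leak_cost

# Illustrative only: 10,000 queries/month, $5 of downstream damage per wrong answer.
official = monthly_expected_cost(api_fee=2000, queries=10_000,
                                 error_rate=0.16, cost_per_error=5)   # ~84% accuracy
proxy    = monthly_expected_cost(api_fee=200, queries=10_000,
                                 error_rate=0.63, cost_per_error=5,   # ~37% accuracy
                                 leak_prob=1.0, leak_cost=5_000)      # prompts assumed exfiltrated
print(official, proxy)  # the 90%-cheaper fee yields the higher expected cost
```

Under these assumed numbers the proxy's total expected cost is several times the official one, despite the 90% discount on the fee line. The point is not the specific figures but the shape of the calculation: the fee is one term among three.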
Information asymmetry and the lemon market. The proxy market is a textbook re-run of George Akerlof’s market for lemons. The seller knows whether they are shipping Opus or Haiku; the buyer does not. When quality cannot be verified, the market competes on price alone and good quality is driven out. The remedy is the one Akerlof prescribed — signaling and verification: official-API certifications like SOC 2, auditable logs, and contracts.
The SLA as a translator. A service-level agreement is precisely the tool that performs this translation. An SLA does not promise “100% correct.” Instead it defines availability, response time, and quality targets as measurable objectives, and specifies financial consequences — refunds, termination rights — for violations. It converts an abstract “performance guarantee” into a concrete, enforceable economic commitment. The fact that the model can be probabilistically wrong is left intact; the contract pins down who carries that risk and how it is compensated.
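What an SLA clause does mechanically can be sketched in a few lines. The availability tiers and credit percentages below are invented for illustration, not any provider's actual terms:

```python
def sla_credit(measured_availability: float, monthly_fee: float) -> float:
    """Map a measured availability to a service credit (hypothetical tiers)."""
    tiers = [           # (availability floor, fraction of fee refunded)
        (0.9999, 0.00), # target met: no credit owed
        (0.999,  0.10),
        (0.99,   0.25),
        (0.0,    1.00), # severe breach: full refund
    ]
    for floor, credit in tiers:
        if measured_availability >= floor:
            return monthly_fee * credit
    return monthly_fee

print(sla_credit(0.995, 1000))  # 250.0: the risk is priced, not denied
```

Nothing in this function claims the service cannot fail; it only makes failure cost the seller something specific. That is the whole translation the section describes.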
Implications for production AI
This case is more than a fraud story. For any team running AI in production, it forces three shifts in practice.
First, supply-chain provenance comes before the model card. No benchmark score means anything without a guarantee that the model is actually that model. In a world where model-extraction attacks and substitution are possible, “which model is it” matters less than “did this response come down the path I contracted for” — and the latter has to be verified first.
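One low-tech way teams approximate "did this response come down the path I contracted for" is periodic probing with known-answer prompts. This is a hypothetical sketch, not a robust fingerprinting method; `call_model` stands in for whatever client function you actually use, and the probes and threshold are assumptions:

```python
# Probes with stable, checkable answers. A substituted or heavily
# degraded model will start failing these at a noticeably higher rate.
PROBES = [
    ("What is 17 * 23? Answer with the number only.", "391"),
    ("Spell 'provenance' backwards.", "ecnanevorp"),
]

def probe_pass_rate(call_model, samples_per_probe: int = 5) -> float:
    """Fraction of probe calls whose response contains the expected answer."""
    hits = total = 0
    for prompt, expected in PROBES:
        for _ in range(samples_per_probe):
            if expected in call_model(prompt):
                hits += 1
            total += 1
    return hits / total

# Usage sketch: alert when the rate drops below a baseline you measured
# against the official endpoint.
# if probe_pass_rate(my_client) < 0.9: alert("possible model substitution")
```

Probing only detects gross substitution, not subtle degradation, which is exactly why it complements rather than replaces contractual provenance guarantees.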
Second, denominate your reliability budget in money. Compute internally “if this workflow is wrong 5% of the time, how much do we lose,” and the choice of which model, which price, which SLA stops being an article of faith and becomes arithmetic. When the list price of a first-party provider like Anthropic, OpenAI, or Google looks expensive, what that price includes is not just tokens but a provenance guarantee and a no-exfiltration promise.
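That arithmetic also yields a break-even point: how much accuracy a cheaper option can sacrifice before its discount is wiped out. All numbers below are illustrative assumptions:

```python
def breakeven_error_rate(fee_cheap: float, fee_expensive: float,
                         base_error_rate: float,
                         queries: int, cost_per_error: float) -> float:
    """Error rate at which the cheap option's total monthly cost equals the
    expensive option's, given the expensive option's error rate."""
    savings = fee_expensive - fee_cheap
    return base_error_rate + savings / (queries * cost_per_error)

# Saving $1,800/month on fees, 10,000 queries/month, $5 per wrong answer:
print(breakeven_error_rate(200, 2000, 0.05, 10_000, 5))  # ≈ 0.086
```

Under these assumptions, any option that is wrong more than about 8.6% of the time costs more in total than the expensive one, before counting data-leak risk at all. A proxy at 37% accuracy is not near the break-even line; it is nowhere on the map.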
Third, data leakage is not a one-time cost but a strategic asset transfer. When prompts and reasoning chains feed a competitor’s training run, that is not a single breach — it is a permanent transfer of capability via knowledge distillation. In the language of economics it is less a one-off loss than capital flight.
Insights
The real lesson of the 90% discount Claude case is not the common sense that “cheap things are cheap for a reason.” It is that the problem of model reliability has no answer as long as it stays in the domain of mathematical proof. LLMs are stochastic, benchmarks are estimates over a distribution, and the supply chain is territory the model card does not cover. The demand to “guarantee 100%” is mathematically unfulfillable forever. So the mature answer is to change the unit of the guarantee — from accuracy as a mathematical quantity to expected value, information asymmetry, and contractible risk as economic ones.
This shift is not a concession of defeat; it is a change of tools. Economics has tools for handling uncertainty far older than mathematical proof — insurance, contracts, signaling, reputation, audits. Treat quality and provenance the way an SLA treats availability, and you can accept that “the model can be wrong” while still pinning down “who carries that risk, and at what price.” That is exactly why the 90% discount price tag is dangerous — it looks like a mathematically attractive number, but economically it is a contract that pushes unmeasured debt into the future. The question a production-AI team should be asking next quarter is not “which model is most accurate” but “what is our reliability budget, and from whom and under what contract are we buying it.”
References
Primary reporting on the case
- Korea Management Journal — reporting on the identity of the 90% discount Claude proxy; the primary source this post is built on
- CISPA Helmholtz Center for Information Security — the German information-security institute that analyzed performance degradation across 17 proxy services
- Anthropic — Claude’s provider and the source of the fraudulent-account detection report
- DeepSeek (Wikipedia) — the Chinese AI company Anthropic accused of unauthorized use of Claude conversation data
Background — evaluation and reliability
- Large language model · Hallucination (AI)
- Benchmark (computing) · MMLU
- Stochastic process · Softmax / temperature
- Model extraction · Knowledge distillation
Background — the economics of risk
- Expected value — the frame for reading a discount as one variable in an EV calculation
- The Market for Lemons · Information asymmetry — how unverifiable quality collapses a market
- Service-level agreement — the tool that translates an abstract performance guarantee into an economic contract
- Industrial espionage · Capital flight — viewing data leakage as a strategic asset transfer
- MLOps · SOC 2 — practical tooling for supply-chain provenance verification