Press Kit: AI API Token Price Decay 2022-2026
Dataset summary
- Scope: 12 frontier model families across 32 priced versions (GPT-3.5, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini, GPT-5; Claude 1, Claude 2, Claude 3 Haiku/Sonnet/Opus, Claude 3.5 Sonnet/Haiku, Claude 4 Opus/Sonnet/Haiku; Gemini 1.0 Pro, 1.5 Pro, 1.5 Flash, 2.0 Flash; Llama 2/3/3.1; DeepSeek V2/V3; Mistral Large 2)
- Time range: 2022-11 to 2026-05 (full 42-month window)
- Geographic scope: Global (USD list pricing on standard tier)
- Method: First-party scrape of provider pricing pages on or near the effective date, with archive.org Wayback Machine snapshots cited where the live page no longer reflects the historical price
- License: CC-BY 4.0; full per-version dataset available as machine-readable JSON with primary source URL plus archive URL on every row
- Lead author: Vincent, Nesyona Research
Five quotable findings
"The cost of a standard 1,000-input, 500-output AI task fell roughly one thousand times on the frontier-mini tier between March 2023 and February 2025. GPT-4 32K cost nine cents per task; Gemini 2.0 Flash now costs about two-hundredths of a cent. The decay rate is fastest at the bottom of the price ladder, not the top."
Attribute to: Vincent, Nesyona Research, in Nesyona's 2026 AI API Token Price Decay study
"Claude Opus has not changed price in 26 months. Anthropic priced Opus 3 at fifteen dollars input and seventy-five dollars output per million tokens in March 2024 and shipped Claude 4 Opus at the identical numbers in May 2025. Across the same window, every other frontier flagship has cut list price at least once. Anthropic chose to ship a faster model at the same price rather than cut price on the flagship."
Attribute to: Vincent, Nesyona Research
"Gemini Flash sets the price floor every release. At the May 2024 launch, the next cheapest closed-weight option was three times more expensive. Google is the only frontier-model vendor that owns its full TPU stack and can amortize silicon across consumer Search workloads, which lets Flash list price approach marginal compute cost rather than carrying a sales-cycle margin."
Attribute to: Vincent, Nesyona Research
"Open-weight hosted pricing leads each closed-weight cost cut by roughly one quarter. DeepSeek V2 priced at fourteen cents per million input tokens in May 2024. Two months later GPT-4o mini matched at fifteen cents. Together AI and Fireworks publish open-weight pricing the week the weights become public, which establishes a marginal-cost reference rate that closed-weight providers then chase at the next product launch."
Attribute to: Vincent, Nesyona Research
"Output is now eight times more expensive than input on the GPT-5 tier. GPT-3.5 was one-to-one. The widening reflects compute economics: prefill is parallelizable across the prompt while generation is sequential. Builders running output-heavy workloads (writing, reasoning, code generation) face a meaningfully smaller cost cut than the headline thousand-x decay implies."
Attribute to: Vincent, Nesyona Research
Suggested coverage angles
This study lands at the intersection of AI infrastructure economics, vendor strategy, and developer tooling. Suggested angles for journalists:
- Vendor strategy: Anthropic Opus sticky-pricing as a deliberate positioning bet (Stratechery, The Information).
- Compute economics: Why output is now 8x input price, and what that means for code-generation agents (Latent Space, Pragmatic Engineer).
- Open-weight catch-up: DeepSeek V2 leading GPT-4o mini by 73 days, structural rather than coincidental (TechCrunch, The Information).
- The 1000x frame: Per-task cost decay numbers suitable for Stat-of-AI follow-on coverage and Stanford HAI AI Index citation.
- Cultural note: The "free tier inflation" angle for Garbage Day; the "what falling AI prices mean for incumbent SaaS" angle for Marginal Revolution.
Suggested press targets
- Kyle Wiggers, TechCrunch (AI infrastructure beat)
- Stephanie Palazzolo, The Information (frontier-model business strategy)
- Ben Thompson, Stratechery (Anthropic positioning, Google TPU economics)
- swyx, Latent Space (developer-facing implications)
- Gergely Orosz, The Pragmatic Engineer (engineer cost implications)
- Ryan Broderick, Garbage Day (cultural and consumer-facing AI cost angle)
- Tyler Cowen, Marginal Revolution (economic implications, productivity)
- Stanford HAI communications (AI Index methodology cross-reference)
- Nathan Benaich, Air Street Capital / Stat-of-AI (annual report citation)
Downloads
- Full study (HTML): nesyona.com/research/ai-token-price-decay-2026
- Open dataset (JSON): data.json [CC-BY 4.0, 32 versions, primary URL + archive URL per row]
- Methodology PDF: [PDF-PENDING] (available on request via media contact below)
- Full chart pack (PNG + SVG): [CHART-PACK-PENDING] (available on request)
- Companion study: AI Tools Statistics 2026
Embed the headline chart
Free to embed under CC-BY 4.0 with link attribution to the study page:
<iframe src="https://nesyona.com/research/ai-token-price-decay-2026/embed/decay-curve.html" width="100%" height="400" frameborder="0" loading="lazy" title="AI API token price decay 2022 to 2026, Nesyona Research"> </iframe>
Media contact
Vincent, Nesyona Research
Email: [email protected] [PRESS-EMAIL-PENDING]
Available for: written quotes (24 hour turnaround), data drill-down requests, podcast or video interview, on-record commentary on AI API pricing trends, vendor strategy reads, and developer-cost implications.
Response time: typically same-day during ET business hours.
About Nesyona
Nesyona is an independent AI tool economy research site at nesyona.com. We track real pricing, free-tier longevity, capability boundaries, and cost-per-outcome across the AI product stack. Nesyona is part of the DeepSynthesis Lattice, a constellation of niche research sites covering ecommerce, finance, education, health, and AI verticals.