Nesyona Research // Press Kit

Press Kit: AI API Token Price Decay 2022-2026

Published 2026-05-09 · Embargo: none, available for immediate publication

Dataset summary

Scope: 12 frontier model families across 32 priced versions (GPT-3.5, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini, GPT-5; Claude 1, Claude 2, Claude 3 Haiku/Sonnet/Opus, Claude 3.5 Sonnet/Haiku, Claude 4 Opus/Sonnet/Haiku; Gemini 1.0 Pro, 1.5 Pro, 1.5 Flash, 2.0 Flash; Llama 2/3/3.1; DeepSeek V2/V3; Mistral Large 2)
Time range: 2022-11 to 2026-05 (full 42-month window)
Geographic scope: Global (USD list pricing on standard tier)
Method: First-party scrape of provider pricing pages on or near the effective date, with archive.org Wayback Machine snapshots cited where the live page no longer reflects the historical price
License: CC-BY 4.0; full per-version dataset available as machine-readable JSON with primary source URL plus archive URL on every row
Lead author: Vincent, Nesyona Research

Five quotable findings

"The cost of a standard 1,000-input, 500-output AI task fell roughly one thousand times on the frontier-mini tier between March 2023 and February 2025. GPT-4 32K cost nine cents per task; Gemini 2.0 Flash now costs about two-hundredths of a cent. The decay rate is fastest at the bottom of the price ladder, not the top."

Attribute to: Vincent, Nesyona Research, in Nesyona's 2026 AI API Token Price Decay study
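
For readers who want to reproduce per-task figures, the arithmetic can be sketched as follows. This is a minimal illustration, not the study's own code: the function name `task_cost_usd` is hypothetical, and the worked example uses the Claude Opus list prices quoted in this press kit (fifteen dollars input and seventy-five dollars output per million tokens) applied to the standard 1,000-input, 500-output task.

```python
def task_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in USD of one task, given list prices in USD per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Standard task from the press kit: 1,000 input tokens, 500 output tokens.
# Prices are the Claude Opus list prices quoted in the findings (USD per million tokens).
opus_cost = task_cost_usd(1_000, 500, input_price_per_m=15.0, output_price_per_m=75.0)
print(f"Claude Opus, standard task: ${opus_cost:.4f}")
```

At those list prices the standard task costs a little over five cents, with the 500 output tokens accounting for more of the bill than the 1,000 input tokens.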

"Claude Opus has not changed price in 26 months. Anthropic priced Opus 3 at fifteen dollars input and seventy-five dollars output per million tokens in March 2024 and shipped Claude 4 Opus at the identical numbers in May 2025. Across the same window, every other frontier flagship has cut list price at least once. Anthropic chose to ship a faster model at the same price rather than cut price on the flagship."

Attribute to: Vincent, Nesyona Research

"Gemini Flash sets the price floor every release. At the May 2024 launch, the next cheapest closed-weight option was three times more expensive. Google is the only frontier-model vendor that owns its full TPU stack and can amortize silicon across consumer Search workloads, which lets Flash list price approach marginal compute cost rather than carrying a sales-cycle margin."

Attribute to: Vincent, Nesyona Research

"Open-weight hosted pricing leads each closed-weight cost cut by roughly one quarter. DeepSeek V2 priced at fourteen cents per million input tokens in May 2024. Two months later GPT-4o mini matched at fifteen cents. Together AI and Fireworks publish open-weight pricing the week the weights become public, which establishes a marginal-cost reference rate that closed-weight providers then chase at the next product launch."

Attribute to: Vincent, Nesyona Research
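
The 73-day lead cited under the coverage angles is a plain calendar difference. The sketch below assumes launch dates of 2024-05-06 for DeepSeek V2 and 2024-07-18 for GPT-4o mini; the press kit gives only the months, so treat both dates as assumptions to be checked against the dataset's source URLs.

```python
from datetime import date

# Assumed launch dates (not stated in this press kit; verify against the dataset rows).
deepseek_v2_launch = date(2024, 5, 6)
gpt4o_mini_launch = date(2024, 7, 18)

# Subtracting two dates yields a timedelta; .days is the whole-day gap.
lead_days = (gpt4o_mini_launch - deepseek_v2_launch).days
print(f"Open-weight lead: {lead_days} days")
```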

"Output is now eight times more expensive than input on the GPT-5 tier. GPT-3.5 was one-to-one. The widening reflects compute economics: prefill is parallelizable across the prompt while generation is sequential. Builders running output-heavy workloads (writing, reasoning, code generation) face a meaningfully smaller cost cut than the headline thousand-x decay implies."

Attribute to: Vincent, Nesyona Research
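
A rough way to see why output-heavy workloads benefit less: compute the share of a task's bill that comes from output tokens at a given output-to-input price ratio. The sketch below is illustrative only, and the helper name `output_cost_share` is hypothetical. It uses the standard 1,000-input, 500-output task and the two ratios quoted above (one-to-one and eight-to-one).

```python
def output_cost_share(input_tokens, output_tokens, output_to_input_ratio):
    """Fraction of a task's cost attributable to output tokens.

    Prices are expressed relative to the input price, so only the ratio matters.
    """
    input_units = input_tokens
    output_units = output_tokens * output_to_input_ratio
    return output_units / (input_units + output_units)


share_at_parity = output_cost_share(1_000, 500, 1)  # one-to-one pricing
share_at_8x = output_cost_share(1_000, 500, 8)      # eight-to-one pricing
print(f"Output share at 1:1 pricing: {share_at_parity:.0%}")
print(f"Output share at 8:1 pricing: {share_at_8x:.0%}")
```

On the standard task, output moves from about a third of the bill at one-to-one pricing to about four-fifths at eight-to-one, so the output price dominates what a builder actually pays.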

Suggested coverage angles

This study lands at the intersection of AI infrastructure economics, vendor strategy, and developer tooling. Suggested angles for journalists:

Vendor strategy: Anthropic Opus sticky-pricing as a deliberate positioning bet (Stratechery, The Information).
Compute economics: Why output is now 8x input price, and what that means for code-generation agents (Latent Space, Pragmatic Engineer).
Open-weight catch-up: DeepSeek V2 led GPT-4o mini by 73 days, a structural pattern rather than a coincidence (TechCrunch, The Information).
The 1000x frame: Per-task cost decay numbers suitable for State of AI Report follow-on coverage and Stanford HAI AI Index citation.
Cultural note: The "free tier inflation" angle for Garbage Day; the "what falling AI prices mean for incumbent SaaS" angle for Marginal Revolution.

Suggested press targets

Kyle Wiggers, TechCrunch (AI infrastructure beat)
Stephanie Palazzolo, The Information (frontier-model business strategy)
Ben Thompson, Stratechery (Anthropic positioning, Google TPU economics)
swyx, Latent Space (developer-facing implications)
Gergely Orosz, The Pragmatic Engineer (engineer cost implications)
Ryan Broderick, Garbage Day (cultural and consumer-facing AI cost angle)
Tyler Cowen, Marginal Revolution (economic implications, productivity)
Stanford HAI communications (AI Index methodology cross-reference)
Nathan Benaich, Air Street Capital / State of AI Report (annual report citation)

Downloads

Full study (HTML): nesyona.com/research/ai-token-price-decay-2026
Open dataset (JSON): data.json [CC-BY 4.0, 32 versions, primary URL + archive URL per row]
Methodology PDF: available on request via media contact below
Full chart pack (PNG + SVG): available on request
Companion study: AI Tools Statistics 2026

Embed the headline chart

Free to embed under CC-BY 4.0 with link attribution to the study page:

<iframe src="https://nesyona.com/research/ai-token-price-decay-2026/embed/decay-curve.html"
  width="100%" height="400" style="border: 0" loading="lazy"
  title="AI API token price decay 2022 to 2026, Nesyona Research">
</iframe>

Media contact

Vincent, Nesyona Research

Email: [email protected]

Available for: written quotes (24-hour turnaround), data drill-down requests, podcast or video interviews, and on-record commentary on AI API pricing trends, vendor strategy, and developer-cost implications.

Response time: typically same-day during ET business hours.

About Nesyona

Nesyona is an independent AI tool economy research site at nesyona.com. We track real pricing, free-tier longevity, capability boundaries, and cost-per-outcome across the AI product stack. Nesyona is part of the DeepSynthesis Lattice, a constellation of niche research sites covering ecommerce, finance, education, health, and AI verticals.