Pricing2026-05-26

AI API Pricing Guide May 2026: Comparing Every Frontier Model

AI API pricing has fluctuated rapidly entering 2026.

At the start of 2025, the prevailing assumption was that "frontier models = expensive." However, as of May 2026, models achieving over 80% on the SWE-bench—a benchmark for software engineering capability—are now available for as little as $0.30 (input) / $1.20 (output).

Here is the comprehensive breakdown of current API pricing for major models.

Frontier Models (SWE-bench > 80%)

Model	Developer	Input /1M	Output /1M	SWE-bench	Context
Claude Mythos Preview	Anthropic	Undisclosed	Undisclosed	93.9%	1M
Claude Opus 4.7	Anthropic	$5.00	$25.00	87.6%	200K
Claude Opus 4.6	Anthropic	$5.00	$25.00	80.8%	1M
Gemini 3.1 Pro	Google	$2.50	$15.00	80.6%	1M
DeepSeek V4 Pro (Max)	DeepSeek	$1.74	$3.48	80.6%	1M
Kimi K2.6	Moonshot AI	$0.95	$4.00	80.2%	256K
MiniMax M2.5	MiniMax	$0.30	$1.20	80.2%	200K
GPT-5.2	OpenAI	$1.25	$10.00	80.0%	256K

High-Performance Models (SWE-bench 75-80%)

Model	Developer	Input /1M	Output /1M	SWE-bench	Context
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	79.6%	1M
DeepSeek V4 Flash (Max)	DeepSeek	$0.14	$0.28	79.0%	1M
Qwen3.6 Plus	Alibaba	$0.50	$3.00	78.8%	1M
MiMo-V2-Pro	Xiaomi	$0.50	$3.00	78.0%	1M
Mistral Medium 3.5	Mistral	$1.50	$7.50	77.6%	256K
GLM-5	Zhipu AI	$1.00	$3.20	77.8%	200K

Maximum Cost-Efficiency

To determine the best value, we calculate the "Score per Dollar" by dividing the SWE-bench score by the output price:

Model	Score	Output /1M	Score / Dollar	Note
DeepSeek V4 Flash (Max)	79.0%	$0.28	282.1	Unbeatable value
MiniMax M2.5	80.2%	$1.20	66.8	Cheapest Top-10 model
DeepSeek V4 Pro (Max)	80.6%	$3.48	23.2	Cheapest Frontier class
Kimi K2.6	80.2%	$4.00	20.1	Moonshot AI flagship
Qwen3.6 Plus	78.8%	$3.00	26.3	Alibaba flagship
MiMo-V2-Pro	78.0%	$3.00	26.0	Xiaomi flagship
GPT-5.2	80.0%	$10.00	8.0	OpenAI flagship
Gemini 3.1 Pro	80.6%	$15.00	5.4	Google flagship
Claude Sonnet 4.6	79.6%	$15.00	5.3	Anthropic flagship
Claude Opus 4.7	87.6%	$25.00	3.5	Ultra-high performance

DeepSeek V4 Flash (Max) is in a league of its own. With a score of 79.0% at an output price of just $0.28, its cost-performance is 80 times higher than that of Opus 4.7.

Monthly Cost Estimation

Estimated cost for 10 million monthly tokens (5M input + 5M output):

Model	Monthly Cost	Primary Use Case
DeepSeek V4 Flash (Max)	$2.10	Lightweight tasks
MiniMax M2.5	$7.50	Coding automation
DeepSeek V4 Pro (Max)	$26.10	Frontier-level workloads
Kimi K2.6	$24.75	Frontier-level workloads
GPT-5.2	$56.25	General OpenAI ecosystem
Gemini 3.1 Pro	$87.50	General Google ecosystem
Claude Sonnet 4.6	$90.00	General Anthropic ecosystem
Claude Opus 4.7	$150.00	Maximum performance

For 10 million tokens, the cost difference between the most affordable (DeepSeek V4 Flash) and the most expensive (Opus 4.7) is 71x.

The Critical Role of Context Caching

Several providers now offer cached input pricing, which is transformative for agentic workflows:

Model	Standard Input	Cached Input	Discount
Gemini 3.5 Flash	$1.50	$0.15	90%
Gemini 3.1 Pro	$2.50	$0.25 (est)	90%
DeepSeek V4 Pro	$1.74	$0.44	75%
MiniMax M2.7	$0.30	$0.06	80%

For agents that reread the same context repeatedly, caching can reduce costs by over 90%. Gemini 3.5 Flash's $0.15/1M cached price makes it the most cost-efficient choice for agent-heavy deployments.

Historical Price Trends

Comparing pricing from early 2025 to May 2026:

Model	Early 2025	Late 2025	May 2026	Change
Claude Opus	$15/$75	$5/$25	$5/$25	Input down 67%
GPT-4o	$5/$15	—	—	Migrated to GPT-5.2
GPT-5.2	—	—	$1.25/$10	New price tier
Gemini Pro	$7/$21	$2.50/$15	$2.50/$15	Input down 64%
DeepSeek V4	—	—	$1.74/$3.48	New entrant

There is a clear trend of plummeting input prices. Comparing early 2025 Claude Opus ($15/$75) to current Opus 4.7 ($5/$25), the input cost has dropped to one-third. Output prices have fallen more slowly, likely because the value per output token increases as generation quality improves.

Model Selection Guide

Use Case	Recommended Model	Reason
Coding (Best Quality)	Claude Opus 4.7	SWE-bench 87.6%
Coding (Best Value)	MiniMax M2.5	80.2% at $0.30/$1.20
Coding (Cheapest)	DeepSeek V4 Flash	79.0% at $0.14/$0.28
Long Context	Gemini 3.1 Pro	1M context window
AI Agents	Gemini 3.5 Flash	$0.15 cache, 4x faster
Reasoning/Math	GPT-5.5 Pro	Top FrontierMath score
Multilingual/Japanese	Claude Sonnet 4.6	High linguistic quality
Self-Hosting/Local	DeepSeek V4	Open-source availability
Strict Budget	MiniMax M2.5	Lowest cost among Top-10

Outlook for H2 2026

Prices will continue to drop. With DeepSeek V4 Flash hitting $0.14/$0.28, we have entered a zone where costs are almost negligible. Other providers will likely follow suit.

Caching will become the new competitive frontier. As agentic workloads scale, the difference in cached pricing will directly dictate the ROI of AI implementations.

The gap between "Cheapest" and "Best" will widen. The price difference between a 93.9% model like Mythos and a 79% model like DeepSeek V4 Flash is over 100x. Users must now be more precise in managing the trade-off between quality and cost.

Summary

As of May 2026, the AI API market is more diversified than ever. We see a spectrum ranging from the ultra-economical DeepSeek V4 Flash ($0.14/$0.28) to the premium, undisclosed pricing of Claude Mythos.

Critically, the performance gap between the cheapest and the best models is shrinking. The fact that 80%+ SWE-bench performance is now available for $0.30/$1.20 signals the true democratization of frontier-level AI.

The central question for developers has shifted from "Which model is the best?" to "Which model is the most optimal for my specific use case?"

Comments (0)

Share:X Hatena

Back to Blog