Moonshot AI · Proprietary

Kimi K2.6

Kimi K2.6 is a large-scale reasoning model developed by Moonshot AI. It is listed with approximately 10 trillion parameters and a 256K-token context window.

Parameters

10,000B (approx. 10 trillion)

Context Window

256K

License

https://huggingface.co/moonshotai/Kimi-K2-Base/raw/main/LICENSE

Release Date

2026-04-20

API Pricing

Input Price (per 1M tokens)

$0.95

Output Price (per 1M tokens)

$4.00

Billing Mode: standard

Strengths

  • Massive 10-trillion parameter scale
  • Achieves advanced reasoning capabilities
  • Long-context processing of 256K tokens

Weaknesses

  • Closed license
  • Heavy compute and memory load from the massive parameter count
  • Lack of detailed performance metrics

Use Cases

  • Complex logical reasoning tasks
  • Ultra-long document analysis
  • Processing advanced specialized knowledge

Deep Analysis

Arena Elo (Text Overall)

1462

#14 provisional on BenchLM; 1529 on Code Arena WebDev (#6 of 67)

SWE-Bench Pro

58.6%

Leads Claude Opus 4.6 (53.4%) and GPT-5.4 (57.7%)

SWE-Bench Verified

80.2%

Effectively tied with Claude (80.8%) and Gemini (80.6%)

GPQA-Diamond

90.5%

vs GPT-5.4: 92.8%, Gemini 3.1 Pro: 94.3%

API Price (Input/Output)

$0.95 / $4.00 per 1M tokens

Moonshot official: $0.60 / $2.50; ~5–25× cheaper than Claude Opus 4.6
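The per-token prices above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function name `request_cost_usd` and the workload of 200,000 input tokens and 4,000 output tokens are assumptions chosen for the example, not figures from this listing.

```python
def request_cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Return the cost in USD, given prices in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Prices from this listing, as (input, output) in USD per 1M tokens.
LISTED = (0.95, 4.00)
MOONSHOT_OFFICIAL = (0.60, 2.50)

# Hypothetical workload: one long-document request.
INPUT_TOKENS = 200_000
OUTPUT_TOKENS = 4_000

for label, (inp, out) in [("listed", LISTED), ("Moonshot official", MOONSHOT_OFFICIAL)]:
    cost = request_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, inp, out)
    print(f"{label}: ${cost:.4f}")
```

For this assumed workload the request costs about $0.21 at the listed price and $0.13 at the Moonshot official price.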

Context Window

256K tokens (262,144)

With automatic compression; supports 12-hour autonomous sessions

Strengths

  • Best-in-class agentic coding performance: leads SWE-Bench Pro (58.6%), HLE-Full with tools (54.0%), and DeepSearchQA (92.5 F1) among all models tested
  • Unmatched cost efficiency: 5–25× cheaper than proprietary frontier models with open-weight self-hosting under Modified MIT license
  • Native 300-agent swarm orchestration with 4,000 coordinated steps enables multi-day autonomous engineering workflows no competitor replicates

Weaknesses

  • Lags GPT-5.4 and Gemini by roughly 3–10 points on pure reasoning benchmarks (HLE-Full without tools: 34.7 vs 39.8/44.4; AIME: 96.4 vs 99.2)
  • Requires a minimum of 8×H100-80G GPUs for self-hosting (595 GB of weights), making local deployment impractical for smaller teams
  • Higher hallucination rate (39.26%) than GPT-5.4 on general knowledge benchmarks, though significantly improved from K2.5 (64.6%)
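The self-hosting requirement in the list above can be sanity-checked from the two numbers it gives: 595 GB of weights and 80 GB per GPU. This is a rough sketch that treats both figures as nominal and ignores KV cache, activations, and runtime overhead, which is an assumption; real deployments need more headroom. The helper names are hypothetical.

```python
import math

WEIGHTS_GB = 595      # weight size stated in this listing
GPU_MEMORY_GB = 80    # nominal memory of one H100-80G


def min_gpus(weights_gb, gpu_memory_gb):
    """Smallest GPU count whose combined memory holds the weights alone."""
    return math.ceil(weights_gb / gpu_memory_gb)


def headroom_gb(weights_gb, gpu_memory_gb, num_gpus):
    """Memory left for KV cache and activations once the weights are loaded."""
    return num_gpus * gpu_memory_gb - weights_gb


gpus = min_gpus(WEIGHTS_GB, GPU_MEMORY_GB)
print(gpus)                                          # 8
print(headroom_gb(WEIGHTS_GB, GPU_MEMORY_GB, gpus))  # 45
```

Eight GPUs provide 640 GB in total, so only about 45 GB remains beyond the weights, which is consistent with 8×H100-80G being described as a minimum.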

Competitor Comparison

Model              Arena        GPQA     Price (input/output)
Claude Opus 4.6    1548–1565    91.3%    $15/$75 per 1M
GPT-5.4 (xhigh)    N/A          92.8%    $2.50/$15 per 1M
Gemini 3.1 Pro     N/A          94.3%    ~$1.25/$5 per 1M
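The price gap against each competitor in the table can be worked out directly from the listed figures. The sketch below divides each competitor's price by the $0.95 / $4.00 price listed for Kimi K2.6. The names `price_ratios` and `COMPETITORS` are illustrative, and the Gemini 3.1 Pro prices are approximate, as marked in the table.

```python
# (input, output) prices in USD per 1M tokens, taken from this listing.
K26_LISTED = (0.95, 4.00)

COMPETITORS = {
    "Claude Opus 4.6": (15.00, 75.00),
    "GPT-5.4 (xhigh)": (2.50, 15.00),
    "Gemini 3.1 Pro": (1.25, 5.00),  # approximate in the source table
}


def price_ratios(competitor, baseline):
    """Return (input_ratio, output_ratio): competitor price / baseline price."""
    return (competitor[0] / baseline[0], competitor[1] / baseline[1])


for name, prices in COMPETITORS.items():
    in_ratio, out_ratio = price_ratios(prices, K26_LISTED)
    print(f"{name}: input {in_ratio:.2f}x, output {out_ratio:.2f}x")
```

At the listed price, Claude Opus 4.6 comes out at roughly 16× (input) and 19× (output), GPT-5.4 at roughly 2.6× and 3.75×, and Gemini 3.1 Pro at roughly 1.3× and 1.25×. Substituting the Moonshot official price of $0.60 / $2.50 for `K26_LISTED` widens each ratio.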

Analysis generated: 2026-05-23