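MMLU-Pro (Massive Multitask Language Understanding Pro): measures understanding across a broad range of knowledge domains.
LMArena Elo: the Elo rating from LMArena (formerly Chatbot Arena), an overall evaluation based on anonymous blind tests by users.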
General Performance
General AI performance, measured by MMLU-Pro scores and LMArena Elo ratings.
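For readers unfamiliar with Elo-style ratings, the following is a minimal sketch of the classic Elo formulas for pairwise comparisons. It is an illustration only: the function names and the `k` value are hypothetical, and this is not claimed to be the exact procedure LMArena uses to compute the ratings listed on this page.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the classic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float,
           k: float = 32.0) -> Tuple[float, float]:
    """Return both ratings after a single comparison.

    score_a is 1.0 if A won the vote, 0.0 if B won, and 0.5 for a tie.
    k is an assumed step size that controls how far one vote moves a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```

Under this model, two equally rated models each have an expected score of 0.5, and a 400-point gap corresponds to the stronger model being preferred about 10 times out of 11.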
698 models
| # | Model | Developer | MMLU-Pro | LMArena Elo | Open Source |
|---|---|---|---|---|---|
| 1 | OpenAI o1 | OpenAI | 91.0 | 1.0 | Closed |
| 2 | Gemini 3.0 Pro (Preview 11-2025) | Google DeepMind | 90.0 | 1.0 | Closed |
| 3 | Opus 4.5 | Anthropic | 90.0 | — | Closed |
| 4 | Qwen3.7-Max-Preview | Alibaba | 89.6 | — | Closed |
| 5 | Qwen 3.6 Plus Preview | Alibaba | 88.5 | 1.0 | Closed |
| 6 | Qwen3.6-Max-Preview | Alibaba | 88.5 | 1.0 | Closed |
| 7 | Claude Sonnet 4.5 | Anthropic | 88.0 | 1.0 | Closed |
| 8 | M2.1 | MiniMax | 88.0 | 1.0 | Closed |
| 9 | Opus 4.1 | Anthropic | 88.0 | 1.0 | Closed |
| 10 | Qwen3.5-397B-A17B | Alibaba | 87.8 | 1.0 | Closed |
| 11 | Hunyuan-T1 | Tencent AI Lab | 87.2 | 1.0 | Closed |
| 12 | DeepSeek-V4-Pro | DeepSeek | 87.1 | 1.0 | Closed |
| 13 | Grok 4 | xAI | 87.0 | 1.0 | Closed |
| 14 | DeepSeek-V4-Flash | DeepSeek | 86.2 | 1.0 | Closed |
| 15 | Qwen3.6-27B | Alibaba | 86.2 | — | Closed |
| 16 | Qwen3.5-27B | Alibaba | 86.1 | 1.0 | Closed |
| 17 | GPT-4.5 | OpenAI | 86.1 | 1.0 | Closed |
| 18 | Gemini 2.5-Pro | Google DeepMind | 86.0 | — | Closed |
| 19 | Qwen3-Max-Thinking | Alibaba | 85.7 | — | Closed |
| 20 | OpenAI o3 | OpenAI | 85.6 | 1.0 | Closed |
| 21 | Gemma 4 31B | Google DeepMind | 85.2 | 1.0 | Closed |
| 22 | Qwen3.6-35B-A3B | Alibaba | 85.2 | — | Closed |
| 23 | DeepSeek-V3.1 Terminus | DeepSeek | 85.0 | 1.0 | Closed |
| 24 | DeepSeek V3.2-Exp | DeepSeek | 85.0 | 1.0 | Closed |
| 25 | DeepSeek-R1-0528 | DeepSeek | 85.0 | 1.0 | Closed |
| 26 | Grok 4.1 Fast | xAI | 85.0 | — | Closed |
| 27 | DeepSeek-V3.1 | DeepSeek | 85.0 | 1.0 | Closed |
| 28 | Claude Opus 4 | Anthropic | 85.0 | 1.0 | Closed |
| 29 | GLM-4.5 | Zhipu AI | 84.6 | 1.0 | Closed |
| 30 | Claude Mythos Preview | Anthropic | — | — | Closed |