Model Comparison
Compare popular AI models by performance, price, and features
VS
GPQA Diamond
VS
ARC-AGI-2
VS
Agentic
VS
Price/Perf
VS
Open-Weight
Llama-3-Namazu-405B
Sakana AI
MMLU (5-shot): 88.6%
HumanEval (Coding): 89.0%
Input Price: $2.40/M tokens
GPT-5.1 Codex Max
OpenAI
SWE-bench Verified (xhigh): 77.9%
SWE-Lancer IC SWE: 79.9%
Terminal-Bench 2.0: 58.1%
VS
Coding
Grok 4.2 Beta
xAI
Chatbot Arena Elo: ~1493
IFBench (Instruction Following): 83%
Omniscience (Non-Hallucination): 78%
VS
Real-time
VS
Local Deploy