Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is a multimodal foundation model developed by Google DeepMind. It features a vast context window of 1 million tokens.
Parameters
Undisclosed
Context Window
1M
License
Proprietary
Release Date
2026-02-20
API Pricing
Input Price (per 1M tokens)
$3.60
Output Price (per 1M tokens)
$21.60
Billing Mode: standard
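As a rough illustration of how the listed per-1M-token rates turn into a per-request cost, the sketch below multiplies token counts by the input and output prices shown above. The function name and the example token counts are hypothetical. It assumes flat standard billing with no caching, batching, or context-length surcharges.

```python
from decimal import Decimal

# Listed standard rates from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = Decimal("3.6")
OUTPUT_USD_PER_MTOK = Decimal("21.6")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate one request's cost in USD at the listed rates (assumes flat billing)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MTOK * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / TOKENS_PER_MTOK * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: a 150,000-token document with a 4,000-token summary.
    print(estimate_cost_usd(150_000, 4_000))
```

`Decimal` is used here so that small per-request amounts do not pick up binary floating-point noise when summed over many requests.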
Strengths
- Advanced multimodal capabilities
- One-million-token long context
- Developed by Google DeepMind
Weaknesses
- Closed-source license
- Possible instability as a preview release
- Lack of detailed performance metrics
Use Cases
- Large-scale document analysis and summarization
- Complex multimodal processing
- Leveraging long-context data
Deep Analysis
Arena Elo
~1493
#3 overall (April 2026), trailing Claude Opus 4.6 (1504)
SWE-Bench Verified
80.6%
vs Claude Opus 4.6: 80.8% (0.2pp gap)
GPQA Diamond
94.3%
Current leader; GPT-5.4 scores 92.0%
ARC-AGI-2
77.1%
About 2.5x the Gemini 3 Pro score (31.1%)
Input Price
$2/1M
For prompts of ≤200k tokens; cheapest frontier model
Context Window
1M tokens
64K output, supports entire codebases
Strengths
- Leads reasoning and multimodal benchmarks at a fraction of competitor costs (64% cheaper than Claude Opus 4.6 with max reasoning).
- The 1M-token context window allows whole-codebase and long-document analysis in a single request.
- Improved agentic capability: a 45% gain on BrowseComp and a dedicated customtools endpoint for tool-heavy workflows.
Weaknesses
- Still trails Claude Opus 4.6 in pure coding accuracy (SWE-Bench Verified: 80.6% vs 80.8%).
- Preview status means no production SLA guarantees and possible API behavior changes before GA.
- Higher latency than some competitors, and more confident when wrong (calibration error of 51 vs GPT-5.4's 38 on Humanity's Last Exam).
Competitor Comparison
| Model | Arena Elo | SWE-Bench Verified | GPQA Diamond | Price (input/output) |
|---|---|---|---|---|
| Claude Opus 4.6 | ~1504 | 80.8% | 89.6% | $5/$25 per 1M tokens |
| GPT-5.4 | ~1484 | ~80% | 92.0% | $2.50/$15 per 1M tokens |
| DeepSeek R2 | ~1441 | 62.1% | 82.4% | $0.55/$2.19 per 1M tokens |
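To make the price column easier to compare, the following sketch parses the price cells from the table above and ranks the three listed competitors by the cost of a single workload. The helper names and the example workload are hypothetical. It assumes each cell is an input/output pair in USD per 1M tokens and that billing is flat.

```python
import re
from typing import NamedTuple


class Pricing(NamedTuple):
    input_usd: float   # USD per 1M input tokens
    output_usd: float  # USD per 1M output tokens


# Price cells copied verbatim from the comparison table above.
PRICE_CELLS = {
    "Claude Opus 4.6": "$5/$25 per 1M tokens",
    "GPT-5.4": "$2.50/$15 per 1M tokens",
    "DeepSeek R2": "$0.55/$2.19 per 1M tokens",
}

_PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?)/\$(\d+(?:\.\d+)?) per 1M tokens")


def parse_price(cell: str) -> Pricing:
    """Parse a '$in/$out per 1M tokens' cell into numeric rates."""
    match = _PRICE_RE.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized price cell: {cell!r}")
    return Pricing(float(match.group(1)), float(match.group(2)))


def workload_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one workload, assuming flat per-token billing."""
    total = input_tokens * pricing.input_usd + output_tokens * pricing.output_usd
    return total / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        model: workload_cost(parse_price(cell), input_tokens, output_tokens)
        for model, cell in PRICE_CELLS.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 100k output tokens.
    for model, cost in rank_by_cost(1_000_000, 100_000):
        print(f"{model}: ${cost:.2f}")
```

Because output tokens cost several times more than input tokens for every model in the table, the ranking can shift with the input-to-output ratio of the workload, so it is worth running with realistic token counts.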
Gemini 3.1 Pro Preview is Google DeepMind's latest flagship multimodal reasoning model, launched in February 2026 as an iterative upgrade to the Gemini 3 series. It represents a significant step forward in core reasoning capabilities while maintaining aggressive pricing, positioning Google as the leader in cost-efficient frontier AI. The model features a massive 1 million token context window, native multimodal understanding across text, images, audio, video, and code, and is specifically optimized for complex agentic workflows, advanced coding, and long-context analysis.
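A quick way to judge whether a codebase or document set fits in a single request is to budget tokens against the 1M-token window. The sketch below is only an approximation. It assumes roughly 4 characters per token (a common rule of thumb, not the model's actual tokenizer), treats the 64K output limit listed under Deep Analysis as 64,000 tokens, and assumes the output allowance must fit inside the same window. All names are hypothetical.

```python
from typing import Iterable

CONTEXT_WINDOW_TOKENS = 1_000_000  # context window stated in this profile
MAX_OUTPUT_TOKENS = 64_000         # "64K output", read as 64,000 (assumption)
CHARS_PER_TOKEN = 4                # rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def total_tokens(texts: Iterable[str]) -> int:
    """Sum the approximate token counts of several documents."""
    return sum(estimate_tokens(text) for text in texts)


def fits_in_context(
    texts: Iterable[str], reserve_for_output: int = MAX_OUTPUT_TOKENS
) -> bool:
    """True if the inputs plus a reserved output budget fit in the window."""
    return total_tokens(texts) + reserve_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample_files = ["def main():\n    pass\n" * 1000, "# README\n" * 500]
    print(total_tokens(sample_files), fits_in_context(sample_files))
```

For a real decision, the heuristic should be replaced with counts from the provider's own token-counting facility, since code and non-English text often tokenize less efficiently than English prose.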
The model achieves state-of-the-art or near-state-of-the-art performance across multiple benchmarks, leading on GPQA Diamond (94.3%), ARC-AGI-2 (77.1%), and Humanity's Last Exam (44.4% without tools). It ties GPT-5.4 on the Artificial Analysis Intelligence Index with a score of 57 while costing less than half to run the full evaluation suite. Key improvements over its predecessor include a 2.5x improvement on abstract reasoning (ARC-AGI-2), a 45% boost in search capabilities (BrowseComp), and a 20% improvement in terminal coding (Terminal-Bench 2.0).
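The benchmark comparisons in this profile mix two kinds of difference: multiplicative ratios (the ARC-AGI-2 "2.5x") and percentage-point gaps (the SWE-Bench "0.2pp"). The short sketch below, with hypothetical helper names, recomputes both from the scores quoted in this profile so the two are not confused.

```python
def fold_change(new_score: float, old_score: float) -> float:
    """Multiplicative ratio between two scores (e.g. 2.5 means 2.5x)."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return new_score / old_score


def point_gap(score_a: float, score_b: float) -> float:
    """Difference between two percentage scores, in percentage points."""
    return score_a - score_b


if __name__ == "__main__":
    # ARC-AGI-2: Gemini 3.1 Pro Preview (77.1%) vs Gemini 3 Pro (31.1%).
    print(f"ARC-AGI-2 ratio: {fold_change(77.1, 31.1):.2f}x")
    # SWE-Bench Verified: Claude Opus 4.6 (80.8%) vs Gemini 3.1 Pro Preview (80.6%).
    print(f"SWE-Bench gap: {point_gap(80.8, 80.6):.1f}pp")
    # GPQA Diamond: Gemini 3.1 Pro Preview (94.3%) vs GPT-5.4 (92.0%).
    print(f"GPQA Diamond gap: {point_gap(94.3, 92.0):.1f}pp")
```

The relative gains quoted for BrowseComp and Terminal-Bench 2.0 cannot be recomputed this way, because the underlying baseline scores are not given here.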
Currently in preview, Gemini 3.1 Pro is available across Google's ecosystem, including the Gemini API, Vertex AI, Google AI Studio, and consumer products such as the Gemini app and NotebookLM. Its pricing makes it the most cost-effective frontier model available: input costs $2 per million tokens for prompts under 200k tokens, well below Claude Opus 4.6 ($5 input/$25 output per million tokens), while performance is comparable or superior on most benchmarks. The model reflects Google's strategy of delivering frontier capabilities at accessible price points, though preview status means developers should validate performance before production deployment.
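The $2 input rate applies only to prompts within the 200k-token tier, so a cost estimate needs a tier check. The sketch below is a minimal illustration with hypothetical names. The rate for longer prompts is not stated in this profile, so the caller must supply it. The code also assumes that a prompt over the threshold is billed entirely at the long-context rate, and that the boundary is inclusive (≤200k), both of which should be confirmed against the official pricing page.

```python
SHORT_CONTEXT_LIMIT_TOKENS = 200_000    # tier boundary quoted in this profile
SHORT_CONTEXT_INPUT_USD_PER_MTOK = 2.0  # input rate quoted for the short tier


def input_cost_usd(prompt_tokens: int, long_context_usd_per_mtok: float) -> float:
    """Estimate input cost in USD for one prompt under two-tier pricing.

    long_context_usd_per_mtok is not given in this profile; the caller
    must look it up and pass it in.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens <= SHORT_CONTEXT_LIMIT_TOKENS:
        rate = SHORT_CONTEXT_INPUT_USD_PER_MTOK
    else:
        # Assumption: the whole prompt is billed at the long-context rate.
        rate = long_context_usd_per_mtok
    return prompt_tokens * rate / 1_000_000


if __name__ == "__main__":
    # 4.0 is a placeholder long-context rate, not a published price.
    print(input_cost_usd(200_000, long_context_usd_per_mtok=4.0))
    print(input_cost_usd(300_000, long_context_usd_per_mtok=4.0))
```

Under these assumptions, a prompt just over the threshold can cost noticeably more than one just under it, which is a reason to trim or chunk inputs that sit near 200k tokens.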
Analysis generated: 2026-05-23