GPT-5.4
GPT-5.4 is the latest multimodal reasoning model in the GPT-5 series developed by OpenAI. It is optimized for complex knowledge work, software engineering, and long-context analysis up to 1 million tokens.
Parameters
Undisclosed
Context
License
Proprietary
Release date
2026-03-05
API pricing
Input price (per 1M tokens)
$2.50
Output price (per 1M tokens)
$15
Billing mode: standard
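As a rough illustration of how the listed rates translate into per-request cost, here is a minimal sketch. It assumes simple linear billing at the standard rates shown above; the function name `estimate_cost_usd` and the token counts are hypothetical, and any discounts or tiered rates are ignored.

```python
# Standard-mode rates from the card above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate with no discounts or tiers."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative workload: a 200k-token document and a 4k-token answer.
    cost = estimate_cost_usd(200_000, 4_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Under these assumptions, output tokens cost six times as much as input tokens, so long generations dominate the bill well before long prompts do.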
Strengths
- Advanced software development capabilities
- Long-context analysis up to 1 million tokens
- Industry-leading on key evaluation benchmarks
Weaknesses
- Model weights are not public
- Closed-source license
- Incurs API usage costs
Use cases
- Advanced software engineering support
- Business tasks that require complex specialized knowledge
- Ultra-long context analysis
In-depth analysis
| Metric | Value | Note |
|---|---|---|
| Arena Elo | 1467 | #9 overall |
| SWE-Bench Verified | 80.8% | vs Claude Opus 4.7: 87.6% |
| SWE-Bench Pro | 57.7% | vs Claude Opus 4.7: 64.3% |
| GPQA Diamond | 94.4% | near parity with frontier |
| Input Price | $2.50/1M | vs Claude Opus 4.7: $5/1M |
| Context Window | 1.05M tokens | largest in GPT-5 family |
Strengths
- Best-in-class web research (BrowseComp 89.3%)
- Strong terminal-heavy coding (Terminal-Bench 75.1%)
- Competitive pricing at $2.50/$15 per 1M tokens
- 1M context window with strong retrieval
Weaknesses
- Trails Claude Opus 4.7 on agentic coding (SWE-bench Pro 57.7% vs 64.3%)
- Weaker multi-tool orchestration than Claude Opus 4.7 (MCP-Atlas 68.1% vs 77.3%)
- Computer use below Claude Opus 4.7 (OSWorld 75.0% vs 78.0%)
Competitor comparison
| Model | Arena Elo | SWE-Bench Verified | GPQA Diamond | Price (input/output per 1M tokens) |
|---|---|---|---|---|
| Claude Opus 4.7 | ~1500 | 87.6% | 94.2% | $5/$25 |
| Gemini 3.1 Pro | ~1485 | 84.2% | 94.3% | $2/$12 |
| GPT-5.4 | 1467 | 80.8% | 94.4% | $2.50/$15 |
GPT-5.4 is OpenAI's flagship model released on March 5, 2026, positioned as a versatile frontier model for professional work. Built on the GPT-5 architecture with explicit chain-of-thought reasoning, it offers a 1.05M token context window and competitive benchmark performance across coding, reasoning, and knowledge tasks.
The model's strongest category is knowledge work (#1 on BenchLM), making it particularly effective for research, analysis, and factual Q&A. It excels at web research with an 89.3% BrowseComp score and handles terminal-heavy coding well at 75.1% on Terminal-Bench 2.0. At $2.50/$15 per 1M tokens, it offers strong value compared to Claude Opus 4.7's $5/$25 pricing.
However, GPT-5.4 trails Claude Opus 4.7 in agentic coding scenarios (57.7% vs 64.3% on SWE-bench Pro) and multi-tool orchestration (68.1% vs 77.3% on MCP-Atlas). For teams that prioritize web research, terminal workflows, or cost efficiency over those two areas, GPT-5.4 remains the better choice.
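To make the trade-off concrete, the head-to-head scores quoted in this analysis can be reduced to percentage-point gaps. The following is only a sketch: the figures are copied from the text above, while the `SCORES` layout and the `gap_points` helper are hypothetical names introduced for illustration.

```python
# Scores (percent) as quoted in this analysis: (GPT-5.4, Claude Opus 4.7).
SCORES = {
    "SWE-bench Verified": (80.8, 87.6),
    "SWE-bench Pro": (57.7, 64.3),
    "MCP-Atlas": (68.1, 77.3),
    "OSWorld": (75.0, 78.0),
    "GPQA Diamond": (94.4, 94.2),
}


def gap_points(benchmark: str) -> float:
    """Hypothetical helper: GPT-5.4 minus Claude Opus 4.7, in percentage points."""
    gpt, claude = SCORES[benchmark]
    return round(gpt - claude, 1)


if __name__ == "__main__":
    # List benchmarks from the largest GPT-5.4 deficit to the smallest.
    for name in sorted(SCORES, key=gap_points):
        print(f"{name}: {gap_points(name):+.1f} pts")
```

A negative value means GPT-5.4 trails. On these figures the widest gap is in multi-tool orchestration (MCP-Atlas), while GPQA Diamond is effectively a tie.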
Analysis generated: 2026-05-24