OpenAI · Proprietary

GPT-5.4

GPT-5.4 is the latest multimodal reasoning model in the GPT-5 series developed by OpenAI. It is optimized for complex knowledge work, software engineering, and long-context analysis up to 1 million tokens.

Parameters

Undisclosed

Context

License

Proprietary

Release date

2026-03-05

API pricing

Input price (per 1M tokens)

$2.50

Output price (per 1M tokens)

$15

Billing mode: standard
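
The per-token arithmetic behind these list prices can be sketched as follows. This is a minimal illustration, assuming flat pricing of $2.50 per 1M input tokens and $15 per 1M output tokens under the standard billing mode; the function name `estimate_cost` and the example token counts are hypothetical, and any discounts or tiered rates that may apply in practice are not modeled.

```python
# Prices in USD per 1M tokens, taken from the listing above.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 15.00
TOKENS_PER_M = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at flat list prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = output_tokens / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: a 200,000-token document and a 4,000-token answer.
    print(f"${estimate_cost(200_000, 4_000):.2f}")
```

For the hypothetical request above, the input side contributes $0.50 and the output side $0.06, so the estimate comes to $0.56.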

Strengths

  • Advanced software development capabilities
  • Long-context analysis up to 1 million tokens
  • Industry-leading on key evaluation benchmarks

Weaknesses

  • Model weights are not public
  • Closed-source, proprietary license
  • Incurs API usage costs

Use cases

  • Advanced software engineering support
  • Business tasks that require complex, specialized knowledge
  • Ultra-long context analysis

In-depth analysis

Arena Elo

1467

#9 overall

SWE-bench Verified

80.8%

vs Claude Opus 4.7: 87.6%

SWE-bench Pro

57.7%

vs Claude Opus 4.7: 64.3%

GPQA Diamond

94.4%

near parity with frontier

Input Price

$2.50/1M

vs Claude Opus 4.7: $5/1M

Context Window

1.05M tokens

largest in GPT-5 family

Strengths

  • Best-in-class web research (BrowseComp 89.3%)
  • Strong terminal-heavy coding (Terminal-Bench 75.1%)
  • Competitive pricing at $2.50/$15 per 1M tokens
  • 1M context window with strong retrieval

Weaknesses

  • Trails Claude Opus 4.7 on agentic coding (SWE-bench Pro 57.7% vs 64.3%)
  • Weaker multi-tool orchestration than Claude (MCP-Atlas 68.1% vs 77.3%)
  • Computer use below Claude Opus 4.7 (OSWorld 75.0% vs 78.0%)

Competitor comparison

Model             Arena Elo   SWE-bench Verified   GPQA Diamond   Price (input/output per 1M tokens)
Claude Opus 4.7   ~1500       87.6%                94.2%          $5/$25
Gemini 3.1 Pro    ~1485       84.2%                94.3%          $2/$12
GPT-5.4           1467        80.8%                94.4%          $2.50/$15
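
To make the price column concrete, the sketch below ranks the three models by cost for a single workload. It assumes the list prices in the comparison above apply as flat per-token rates; the `workload_costs` helper and the workload size (1M input tokens, 100k output tokens) are hypothetical and chosen only for illustration.

```python
# (input, output) list prices in USD per 1M tokens, copied from the comparison above.
PRICES = {
    "Claude Opus 4.7": (5.00, 25.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
    "GPT-5.4": (2.50, 15.00),
}


def workload_costs(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, USD cost) pairs sorted from cheapest to most expensive."""
    costs = []
    for model, (in_price, out_price) in PRICES.items():
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        costs.append((model, cost))
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 100k output tokens.
    for model, cost in workload_costs(1_000_000, 100_000):
        print(f"{model}: ${cost:.2f}")
```

Under these assumptions the workload costs $3.20 on Gemini 3.1 Pro, $4.00 on GPT-5.4, and $7.50 on Claude Opus 4.7, so GPT-5.4 sits between the two on price.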

GPT-5.4 is OpenAI's flagship model released on March 5, 2026, positioned as a versatile frontier model for professional work. Built on the GPT-5 architecture with explicit chain-of-thought reasoning, it offers a 1.05M token context window and competitive benchmark performance across coding, reasoning, and knowledge tasks.

The model's strongest category is knowledge work (#1 on BenchLM), making it particularly effective for research, analysis, and factual Q&A. It excels at web research with an 89.3% BrowseComp score and handles terminal-heavy coding well at 75.1% on Terminal-Bench 2.0. At $2.50 input and $15 output per 1M tokens, it costs half as much as Claude Opus 4.7 ($5/$25) on input and 40% less on output.

However, GPT-5.4 trails Claude Opus 4.7 in agentic coding scenarios (57.7% vs 64.3% on SWE-bench Pro) and multi-tool orchestration (68.1% vs 77.3% on MCP-Atlas). For teams prioritizing web research, terminal workflows, or cost efficiency, GPT-5.4 remains the better choice.

Analysis generated: 2026-05-24