
OpenAI Unveils GPT-5.6 Series: Sol Model Introduces Revolutionary Ultra Mode for Sub-Agent Collaboration

OpenAI has initiated a limited preview of its next-generation model series, GPT-5.6. The suite comprises three distinct models: the flagship Sol, the balanced Terra, and the high-speed, cost-effective Luna. A key highlight is the industry-first "ultra mode," designed to enable sophisticated sub-agent coordination.

GPT-5.6 Series Overview

The GPT-5.6 series offers three models tailored for different needs:

| Model | Role | Input Price | Output Price | Key Feature |
| --- | --- | --- | --- | --- |
| Sol | Flagship | $5/1M tokens | $30/1M tokens | Highest performance, max reasoning mode |
| Terra | Balanced | $2.50/1M tokens | $15/1M tokens | GPT-5.5 level performance, 50% cost reduction |
| Luna | High-speed, low-cost | $1/1M tokens | $6/1M tokens | Most affordable, feature-rich option |

Terra maintains performance parity with GPT-5.5 while cutting costs by half. Luna represents OpenAI's most affordable offering to date.
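
To make the price gap concrete, here is a minimal Python sketch that estimates the cost of one workload on each model. The per-million-token prices come from the table above; the dictionary keys ("sol", "terra", "luna"), the `estimate_cost` helper, and the example workload of 2M input and 500K output tokens are illustrative assumptions, not official model IDs or usage figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices as listed in the article. The keys are illustrative labels,
# not confirmed API model identifiers.
PRICES = {
    "sol": Pricing(5.00, 30.00),
    "terra": Pricing(2.50, 15.00),
    "luna": Pricing(1.00, 6.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token usage."""
    p = PRICES[model]
    total = input_tokens * p.input_per_million + output_tokens * p.output_per_million
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

For this assumed workload the estimate is $25.00 on Sol, $12.50 on Terra, and $5.00 on Luna, so Terra costs half as much as Sol and Luna one fifth.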

The Innovative "Ultra Mode"

The headline feature of GPT-5.6 is the new ultra mode. Instead of relying on a single agent, it uses sub-agents to execute parts of a complex task in parallel.

This enables:

  • Faster completion of large-scale refactoring jobs
  • Simultaneous analysis across multiple codebases
  • Enhanced productivity during long-running agent sessions
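
OpenAI has not published how ultra mode is implemented, so the following is only a generic sketch of the fan-out pattern that parallel sub-agents imply, written with Python's standard `asyncio` library. The `run_subagent` stub, the `orchestrate` function, and the `max_parallel` limit are hypothetical names chosen for illustration; a real sub-agent would call a model instead of sleeping.

```python
import asyncio


async def run_subagent(task: str) -> str:
    # Hypothetical stand-in for a sub-agent. A real implementation
    # would send the task to a model and return its result.
    await asyncio.sleep(0.01)
    return f"done: {task}"


async def orchestrate(tasks: list[str], max_parallel: int = 3) -> list[str]:
    """Run all tasks concurrently, with at most max_parallel in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(task: str) -> str:
        async with semaphore:
            return await run_subagent(task)

    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*(bounded(t) for t in tasks))


if __name__ == "__main__":
    jobs = ["refactor module A", "refactor module B", "update tests"]
    for line in asyncio.run(orchestrate(jobs)):
        print(line)
```

The semaphore caps how many sub-agents run at once, which is one simple way to bound cost and rate-limit pressure when a large job is split into many pieces.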

Additionally, a "max" reasoning mode has been introduced, allowing the model extended time for deeper, more thorough thinking.

Benchmark Results

Coding Prowess

GPT-5.6 Sol set a new state of the art (SOTA) on Terminal-Bench 2.1, a benchmark that tests planning, iteration, and tool coordination in command-line workflows.

Cybersecurity

The most dramatic advances are in the cybersecurity domain:

  • ExploitBench: Achieved performance comparable to Mythos Preview while using approximately 1/3 of the output tokens.
  • ExploitGym 3: All models (Sol, Terra, Luna) showed significant capability improvements correlated with increased reasoning intensity.
  • Evaluations on Chromium and Firefox demonstrated the ability to identify bugs and exploit primitives.

OpenAI notes, "GPT-5.6 Sol is excellent at helping people find and fix vulnerabilities, but it is not yet reliable for executing end-to-end attacks."

Biology

On GeneBench v1, GPT-5.6 achieves stronger results than GPT-5.5 while using fewer tokens, showing excellent performance in long-horizon analysis for genomics and quantitative biology.

Enhanced Security Measures

GPT-5.6 Sol features OpenAI's most robust security stack to date:

  1. Model-level: Trained to refuse requests for harmful cyber assistance.
  2. Real-time classifiers: Detect cyber/bio exploitation during generation.
  3. Account-level: Pattern analysis across multiple conversations.
  4. Staged access control: Limited distribution to trusted partners.

A particularly notable safety feature is pointerization detection. A large reasoning model reviews the conversation context; if a potential violation is detected, generation is temporarily halted.
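
The review-and-halt behavior described above can be illustrated with a toy Python sketch. It is not OpenAI's implementation: `generate_with_review`, the chunk list, and the keyword-based reviewer are all hypothetical stand-ins. In the described system the reviewer would be a large reasoning model, and the sketch does not model resuming after a halt.

```python
from typing import Callable, Iterable


def generate_with_review(
    chunks: Iterable[str],
    reviewer: Callable[[str], bool],
) -> tuple[str, bool]:
    """Emit chunks one by one, halting if the reviewer flags the context.

    Returns (text emitted so far, whether generation was halted).
    """
    context = ""
    for chunk in chunks:
        candidate = context + chunk
        # The reviewer sees the full context including the new chunk.
        if reviewer(candidate):
            return context, True  # halt before emitting the flagged chunk
        context = candidate
    return context, False


if __name__ == "__main__":
    # Toy reviewer: flags any context containing a blocked keyword.
    def flag(ctx: str) -> bool:
        return "exploit" in ctx

    print(generate_with_review(["scan ", "the ", "exploit ", "chain"], flag))
    print(generate_with_review(["patch ", "the ", "bug"], flag))
```

Checking the whole accumulated context rather than each chunk in isolation matches the article's description of a reviewer that looks at the conversation context, not just the latest output.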

Road Ahead

  • Now: Limited preview (trusted partners only)
  • Coming weeks: General availability planned
  • July 2026: High-speed inference at 750 tokens/sec on Cerebras to begin

OpenAI has stated that "cooperation with governments will not become a long-term default," emphasizing this is a temporary arrangement.

Competitive Landscape

The arrival of GPT-5.6 Sol intensifies the frontier model race:

  • Claude Opus 4.8: Maintains the top spot on the Intelligence Index (61.4).
  • GPT-5.6 Sol: Achieves SOTA in coding and cybersecurity.
  • Gemini 3.5 Pro: Google plans general availability (GA) this month.

Notably, Sol's ultra mode competes on coordination among sub-agents rather than on single-model performance alone, which could influence the strategies of other providers.

Conclusion

GPT-5.6 Sol is more than a performance leap; it introduces a new approach to agent coordination. Ultra mode's use of sub-agents could shape the future direction of AI agent development.

From a cost perspective, the Terra and Luna models are equally significant. Terra offers GPT-5.5-level performance at half the price, substantially lowering the barrier for serious agent development.

The series is still in preview, and its real-world performance at general availability will be closely watched.
