Qwen3.5-397B-A17B
Qwen3.5-397B-A17B is a multimodal foundation model developed by Alibaba. It has approximately 397B total parameters and supports a context window of up to 256K tokens.
Parameters
397.0B
Context Window
256K
License
Apache 2.0
Release Date
2026-02-16
API Pricing
Input Price (per 1M tokens)
$0.5
Output Price (per 1M tokens)
$
Billing Mode: standard
Strengths
- Huge parameter scale of 397B
- Long context understanding up to 256K
- Highly versatile multimodal support
Weaknesses
- Requires massive computational resources
- Potential for high inference costs
- Operational load due to model size
Use Cases
- Large-scale multimodal data analysis
- Processing ultra-long contexts
- Advanced general-purpose AI applications
Deep Analysis
Release Date
February 16, 2026
Total Parameters
397B
Largest in Qwen3.5 family
Active Parameters
17B per token
MoE with 512 experts, 11 active
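The total and active figures above imply how sparse the model is per token. The following is a minimal arithmetic sketch using only the numbers quoted in this listing; the constant and function names are illustrative and not part of any Qwen tooling.

```python
# Figures copied from the spec sheet above.
TOTAL_PARAMS_B = 397.0   # total parameters, in billions
ACTIVE_PARAMS_B = 17.0   # parameters used per token, in billions
TOTAL_EXPERTS = 512
ACTIVE_EXPERTS = 11


def fraction(part: float, whole: float) -> float:
    """Return part / whole as a plain ratio."""
    return part / whole


param_share = fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
expert_share = fraction(ACTIVE_EXPERTS, TOTAL_EXPERTS)

print(f"Active parameter share: {param_share:.1%}")
print(f"Active expert share:    {expert_share:.1%}")
```

The parameter share comes out roughly twice the expert share. That is plausibly because non-expert components (attention layers, embeddings) run for every token, though the listing does not break this down.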
Context Window
262,144 tokens (native), up to 1M via YaRN
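The native window of 262,144 tokens is 256 × 1,024, which is where the "256K" figure comes from. YaRN extends a context window by a scaling factor. The sketch below computes the factor that would be needed to reach roughly 1M tokens. It is an illustration only: the listing does not say whether "1M" means 1,000,000 or 1,048,576 tokens, and `yarn_factor` is a hypothetical helper, not a library function.

```python
NATIVE_CONTEXT = 262_144  # tokens, from the spec sheet (256 * 1024)


def yarn_factor(target_context: int, native_context: int = NATIVE_CONTEXT) -> float:
    """Scaling factor needed to stretch the native window to target_context.

    Returns 1.0 when the target already fits in the native window.
    """
    if target_context <= native_context:
        return 1.0
    return target_context / native_context


print(yarn_factor(1_048_576))            # 4.0 if "1M" means 1024 * 1024
print(round(yarn_factor(1_000_000), 2))  # 3.81 if "1M" means 10**6
```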
Architecture
Hybrid MoE: Gated DeltaNet + Gated Attention
Modalities
Text, Image, Video
Natively multimodal
Languages
201
License
Apache 2.0
API Price (Input)
$0.40/1M tokens
API Price (Output)
$2.40/1M tokens
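To show how the per-million-token prices translate into a per-request cost, here is a small sketch. It assumes the $0.40 input and $2.40 output rates listed above and simple linear billing with no caching discounts or tiering; `request_cost` is an illustrative name, not a provider API.

```python
INPUT_PRICE_PER_M = 0.40   # USD per 1M input tokens (from the listing above)
OUTPUT_PRICE_PER_M = 2.40  # USD per 1M output tokens (from the listing above)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request, assuming linear per-token billing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a long-context request with 200K input tokens and 2K output tokens.
cost = request_cost(200_000, 2_000)
print(f"${cost:.4f}")  # $0.0848
```

In this example the input side dominates the bill despite the lower rate, which is typical of long-context workloads.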
Strengths
- Frontier-level open-weight model under Apache 2.0, with the best reasoning among open models at launch
- Natively multimodal: text, image, and video from the same weights, with no separate VL variant needed
- Exceptional agentic performance: Terminal-Bench 52.5, BrowseComp 69.0, NOVA-63 59.1
- 19x faster decoding at 256K tokens vs Qwen3-Max due to the hybrid DeltaNet architecture
- Strong benchmarks: MMLU-Pro 87.8, GPQA Diamond 88.4, SWE-bench 76.4, AIME 2025 91.3
Weaknesses
- Massive hardware requirement: ~220GB VRAM at Q4, ~780GB at unquantized BF16
- HLE score of 28.7% indicates gaps in absolute expert-level factuality
- Trails Gemini 3 Pro on competitive coding (LiveCodeBench 83.6 vs 90.7)
- Occasionally hallucinates tool outputs in autonomous agent scenarios
- Terminal-Bench score of 52.5% still leaves room for improvement on complex CLI tasks
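The VRAM figures in the list above can be sanity-checked with a weights-only estimate: parameter count times bits per parameter. The sketch below is a rough back-of-the-envelope calculation. It ignores KV cache, activations, and quantization metadata, so it will not match the quoted figures exactly; the listing does not state how those were derived.

```python
TOTAL_PARAMS = 397e9  # total parameters, from the spec sheet


def weight_memory_gb(bits_per_param: float, params: float = TOTAL_PARAMS) -> float:
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


bf16_gb = weight_memory_gb(16)  # 2 bytes per parameter
q4_gb = weight_memory_gb(4)     # 0.5 bytes per parameter

print(f"BF16 weights: ~{bf16_gb:.0f} GB")
print(f"4-bit weights: ~{q4_gb:.0f} GB")
```

Both estimates land in the same range as the ~780GB and ~220GB figures quoted above. Either way, the model calls for a multi-GPU node even when quantized.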
Competitor Comparison
| Model | Arena | SWE-bench | GPQA Diamond | API Price (input/output per 1M tokens) |
|---|---|---|---|---|
| GPT-5.2 | ~1500 | ~80 | 92.4 | Proprietary |
| Claude Opus | ~1490 | 80.9 | 87.0 | Proprietary |
| Gemini 3 Pro | ~1480 | 76.2 | 91.9 | Proprietary |
| Qwen3.5-397B-A17B | ~1450 | 76.4 | 88.4 | $0.40/$2.40 |
| DeepSeek V3.2 | ~1430 | 73.1 | 82.4 | Proprietary |
Analysis generated: 2026-05-24