Alibaba, Open Source

Qwen3.5-397B-A17B

Name: Qwen3.5-397B-A17B
Price: 0.5 USD per 1M input tokens
Author: Alibaba

Qwen3.5-397B-A17B is a multimodal foundation model developed by Alibaba. It has approximately 397B total parameters and supports a context window of up to 256K tokens.

Parameters: 397.0B
Context Window: 256K
License: Apache 2.0
Release Date: 2026-02-16

API Pricing

Input Price (per 1M tokens): $0.5
Output Price (per 1M tokens):
Billing Mode: standard

Strengths

・Huge parameter scale of 397B
・Long context understanding up to 256K
・Highly versatile multimodal support

Weaknesses

・Requires massive computational resources
・Potential for high inference costs
・Operational load due to model size

Use Cases

・Large-scale multimodal data analysis
・Processing ultra-long contexts
・Advanced general-purpose AI applications

Deep Analysis

Release Date: February 16, 2026
Total Parameters: 397B (largest in the Qwen3.5 family)
Active Parameters: 17B per token (MoE with 512 experts, 11 active)
Context Window: 262,144 tokens (native), up to 1M via YaRN
Architecture: Hybrid MoE (Gated DeltaNet + Gated Attention)
Modalities: Text, Image, Video (natively multimodal)
Languages: 201
License: Apache 2.0
API Price (Input): $0.40/1M tokens
API Price (Output): $2.40/1M tokens
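
The figures above imply how sparse the model is per token. The following is a rough sketch that uses only the numbers listed in this section; the constant and variable names are illustrative and not part of any Qwen API.

```python
# Figures copied from the spec list above; all names here are illustrative.
TOTAL_PARAMS_B = 397       # total parameters, in billions
ACTIVE_PARAMS_B = 17       # parameters active per token, in billions
NUM_EXPERTS = 512          # experts in the MoE layers
ACTIVE_EXPERTS = 11        # experts active per token
NATIVE_CONTEXT = 262_144   # native context window, in tokens

# Share of the model that participates in processing a single token.
active_param_share = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
active_expert_share = ACTIVE_EXPERTS / NUM_EXPERTS

print(f"Active parameters per token: {active_param_share:.1%}")
print(f"Active experts per token:    {active_expert_share:.1%}")
print(f"Native context window:       {NATIVE_CONTEXT // 1024}K tokens")
```

The parameter share comes out higher than the expert share, presumably because non-expert components such as attention layers run for every token regardless of routing.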

Strengths

・Frontier-level open-weight model under Apache 2.0, with the best reasoning among open models at launch
・Natively multimodal: text, image, and video from same weights with no separate VL variant needed
・Exceptional agentic performance: Terminal-Bench 52.5, BrowseComp 69.0, NOVA-63 59.1
・19x faster decoding at 256K tokens vs Qwen3-Max due to hybrid DeltaNet architecture
・Strong benchmarks: MMLU-Pro 87.8, GPQA Diamond 88.4, SWE-bench 76.4, AIME 2025 91.3

Weaknesses

・Massive hardware requirement: ~220GB VRAM at Q4, ~780GB at BF16 full precision
・HLE score of 28.7% indicates gaps in absolute expert-level factuality
・Trails Gemini 3 Pro on competitive coding (LiveCodeBench 83.6 vs 90.7)
・Occasionally hallucinates tool outputs in autonomous agent scenarios
・Terminal-Bench 52.5% still leaves room for improvement on complex CLI tasks
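
The hardware figures in the first bullet can be sanity-checked with a weights-only estimate. This is a minimal sketch, assuming 16 bits per parameter for BF16 and a flat 4 bits per parameter for Q4; the helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Weights-only memory in decimal GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and any per-block quantization metadata.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


bf16_gb = weight_memory_gb(397, 16)  # assumption: 2 bytes per parameter
q4_gb = weight_memory_gb(397, 4)     # assumption: 0.5 bytes per parameter

print(f"BF16 weights: ~{bf16_gb:.0f} GB")
print(f"Q4 weights:   ~{q4_gb:.0f} GB")
```

The estimate lands in the same range as the quoted ~780GB and ~220GB. Real deployments also need memory for the KV cache and activations, and quantized formats typically store extra scaling data, so treat these numbers as a ballpark only.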

Competitor Comparison

Model	Arena	SWE-bench	GPQA Diamond	Price (input/output per 1M tokens)
GPT-5.2	~1500	~80	92.4	Proprietary
Claude Opus	~1490	80.9	87.0	Proprietary
Gemini 3 Pro	~1480	76.2	91.9	Proprietary
Qwen3.5-397B-A17B	~1450	76.4	88.4	$0.40/$2.40
DeepSeek V3.2	~1430	73.1	82.4	Proprietary
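
To make the per-token pricing concrete, here is a small sketch that estimates the cost of one request at the $0.40/$2.40 rates shown in the table. The function name and the example token counts are assumptions for illustration. The listing earlier on this page shows $0.5 for input, so substitute whichever rate applies to your provider.

```python
# Rates from the comparison table, in USD per 1M tokens.
INPUT_USD_PER_M = 0.40
OUTPUT_USD_PER_M = 2.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request in USD at the rates above."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example: a prompt filling the native 262,144-token window, 2,000-token reply.
cost = estimate_cost_usd(262_144, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

At these rates the input side dominates for long-context requests, while short prompts with long generations are dominated by the output rate.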

Analysis generated: 2026-05-24