Qwen3.5-122B-A10B

Name: Qwen3.5-122B-A10B
Author: Alibaba

Qwen3.5-122B-A10B is a reasoning model developed by Alibaba. It is a mixture-of-experts (MoE) model with approximately 122B total parameters, of which about 10B are active per token, and a long context window of 262K tokens.

Parameters

122B total (10B active per token)

Context Window

262K

License

Apache 2.0

Release Date

2026-02-25

API Pricing

API pricing for this model is not yet available

Strengths

・122B total parameters in an MoE design, with about 10B active per token
・262K-token context window for very long inputs
・Advanced reasoning capabilities

Weaknesses

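・High memory cost for local use: roughly 70 GB VRAM at Q4 and 244 GB at BF16
・Only 10B active parameters per token, which may limit the most demanding reasoning tasks
・Slower community adoption than the 9B, 27B, and 35B-A3B models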

Use Cases

・Complex logical reasoning tasks
・Analysis of ultra-long documents
・Advanced knowledge-intensive tasks

Deep Analysis

Release Date

February 2026

Total Parameters

122B

MoE architecture

Active Parameters

10B per token

256 experts with selective activation

Context Window

262,144 tokens

Architecture

Hybrid MoE: Gated DeltaNet + Gated Attention

Modalities

Text, Image, Video

VRAM (Q4)

~70 GB

VRAM (BF16)

~244 GB

License

Apache 2.0

API Availability

Available via DashScope, SiliconFlow, DeepInfra

Strengths

・Strong quality-to-compute ratio: 10B active parameters deliver near-frontier performance
・Natively multimodal with text, image, and video support from the same weights
・Fits on multi-GPU consumer setups at Q4 (~70GB VRAM) — accessible for serious enthusiasts
・262K native context with hybrid DeltaNet architecture for fast long-context inference
・Apache 2.0 license enables commercial use and fine-tuning

Weaknesses

・Positioned awkwardly between the 397B flagship and the 35B-A3B speed model
・70GB VRAM at Q4 still requires a multi-GPU setup; 2x RTX 4090 (48GB total) is not enough, so expect at least three 24GB cards or an 80GB datacenter GPU
・No dedicated benchmark spotlight — overshadowed by the 397B and 9B in marketing
・Active parameters (10B) may be insufficient for the most demanding reasoning tasks
・Community adoption has been slower compared to the 9B, 27B, and 35B-A3B

Competitor Comparison

Model	Arena score	SWE-bench	GPQA Diamond	API price (input/output)
Qwen3.5-397B-A17B	~1450	76.4	88.4	$0.40/$2.40
Qwen3.5-27B	~1400	~68	85.5	Open-source
Llama 4 Scout	~1380	~65	~80	Open-source
Qwen3.5-122B-A10B	~1420	~72	~86	Open-source
DeepSeek V3	~1410	~70	~82	$0.14/$0.28
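The Price column can be turned into a per-request estimate. The sketch below assumes the two figures are USD per 1M input tokens and per 1M output tokens; the table does not state the unit, so treat that as an assumption. The workload size is hypothetical, and models listed as "Open-source" are left out because the table gives no price for them.

```python
# Prices copied from the Price column of the table.
# ASSUMPTION: USD per 1M tokens as (input, output); the table gives no unit.
PRICES = {
    "Qwen3.5-397B-A17B": (0.40, 2.40),
    "DeepSeek V3": (0.14, 0.28),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request under the per-1M-token assumption."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 200,000-token document and a 2,000-token answer.
    for name in PRICES:
        cost = request_cost_usd(name, 200_000, 2_000)
        print(f"{name}: ${cost:.4f} per request")
```

For long-document work the input price dominates, since input tokens outnumber output tokens by a wide margin.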

Overview

Qwen3.5-122B-A10B is the mid-tier MoE model in the Qwen3.5 family, with 122B total parameters and 10B active per token. It provides a strong balance between quality and inference efficiency, fitting on multi-GPU setups at ~70GB VRAM in Q4 quantization. Released under Apache 2.0 in February 2026, it is natively multimodal and benefits from the same hybrid DeltaNet architecture as its larger sibling.
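The ~244 GB (BF16) and ~70 GB (Q4) figures follow roughly from the parameter count. The sketch below is back-of-the-envelope arithmetic for the weights only; the 4.6 effective bits per weight used for the practical Q4 case is an assumed value (real 4-bit formats store scales and keep some tensors at higher precision), not a number from the source.

```python
def weight_memory_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal GB (1e9 bytes)."""
    total_bytes = total_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


TOTAL_PARAMS_B = 122  # total parameters in billions, from the spec sheet

bf16_gb = weight_memory_gb(TOTAL_PARAMS_B, 16)       # 2 bytes per weight
q4_ideal_gb = weight_memory_gb(TOTAL_PARAMS_B, 4)    # exactly 4 bits per weight
# ASSUMPTION: ~4.6 effective bits per weight for a practical 4-bit format.
q4_practical_gb = weight_memory_gb(TOTAL_PARAMS_B, 4.6)

if __name__ == "__main__":
    print(f"BF16 weights:         {bf16_gb:.0f} GB")
    print(f"Q4 weights (ideal):   {q4_ideal_gb:.0f} GB")
    print(f"Q4 weights (assumed): {q4_practical_gb:.0f} GB")
```

The estimate uses the total parameter count, not the 10B active parameters: every expert has to be resident in memory even though only a few run for each token. KV cache and activations come on top of these figures.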

Benchmarks & Performance

Benchmark data from the Qwen3.5 family shows the 122B-A10B positioned between the 397B flagship and the 27B dense model. It scores around 85 on MMLU-Pro and around 86 on GPQA Diamond, with strong multimodal performance from the shared vision encoder. It excels on instruction following and document understanding tasks. The MoE design, with 256 experts and about 10B active parameters per token, provides strong specialization capacity while keeping compute costs reasonable.
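To illustrate what selective activation over 256 experts means, here is a minimal sketch of generic top-k softmax routing in plain Python. It is not Qwen's actual router: the value of `TOP_K` is hypothetical (the source does not say how many experts fire per token), and the router logits are random stand-ins for what a learned gating layer would produce.

```python
import math
import random


def top_k_route(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Indices of the k largest logits; ties keep their original order.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


NUM_EXPERTS = 256  # from the text
TOP_K = 8          # HYPOTHETICAL: not stated in the source

rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = top_k_route(router_logits, TOP_K)

if __name__ == "__main__":
    for expert, weight in selected:
        print(f"expert {expert:3d} weight {weight:.3f}")
```

Only the selected experts run their feed-forward pass for a given token, which is why per-token compute tracks the active parameter count rather than the total.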

Detailed Comparison

The 122B-A10B occupies the middle ground in the Qwen3.5 lineup. It delivers roughly 90-95% of the 397B's quality at a fraction of the hardware cost. It outperforms the 27B dense model on most benchmarks, at the cost of considerably more VRAM at Q4 quantization (~70GB vs ~16GB for the 27B), although it activates fewer parameters per token (10B vs 27B). Compared to Llama 4 Scout and DeepSeek V3, it offers competitive quality with native multimodal support.

Community Feedback

The 122B-A10B has received less community attention than the headline 397B or the value-star 9B/35B-A3B models. It is valued by users with multi-GPU setups who want strong quality without the extreme VRAM demands of the 397B. SiliconFlow and DeepInfra offer hosted API access, making it accessible without local hardware.

Use Cases

Best for teams and researchers with multi-GPU servers (2-4 GPUs) who need strong general-purpose AI with multimodal capabilities. Good balance for production workloads where the 397B is too expensive to host but the 27B is insufficient. Suitable for document analysis, image understanding, coding assistance, and long-context processing. API access through DashScope and SiliconFlow makes it viable for users without local hardware.
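The "2-4 GPUs" guidance can be sanity-checked with a small calculation. This is a rough sketch, not a deployment guide: the 15% per-card headroom for KV cache, activations, and runtime overhead is an assumed value, and the card sizes are examples only.

```python
import math


def gpus_needed(model_gb: float, gpu_gb: float, headroom: float = 0.15) -> int:
    """Identical GPUs needed to hold `model_gb` of weights.

    `headroom` is the fraction of each card reserved for KV cache,
    activations, and runtime overhead (ASSUMED default: 15%).
    """
    usable_gb = gpu_gb * (1 - headroom)
    return math.ceil(model_gb / usable_gb)


if __name__ == "__main__":
    Q4_GB, BF16_GB = 70, 244  # approximate figures from the spec sheet
    for card_gb in (24, 48, 80):
        print(f"Q4 on {card_gb} GB cards:   {gpus_needed(Q4_GB, card_gb)}")
    print(f"BF16 on 80 GB cards: {gpus_needed(BF16_GB, 80)}")
```

Long-context workloads need more headroom than the default assumed here, because the KV cache grows with the number of tokens held in context.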

Latest News

Released in February 2026 as part of the Qwen3.5 family. Available on Hugging Face, SiliconFlow, DeepInfra, and NVIDIA NIM. Microsoft Azure AI Foundry support has also been added. Attention has since shifted to the Qwen3.6 release in April 2026.

Sources

Analysis generated: 2026-05-24