Qwen3.5-9B-Instruct
Qwen3.5-9B-Instruct is a foundation model developed by Alibaba. It features a chat-focused design and supports a long context window of 128K.
Parameters
9B
Context Window
128K
License
MIT
Release Date
2026-02-16
API Pricing
API pricing for this model is not yet available
Strengths
- Support for large context capacity
- Design specialized for dialogue
- Available under MIT license
Weaknesses
- Relatively large model size of approx. 18GB
- Dependence on computational resources
- Insufficient optimization for specific tasks
Use Cases
- Building advanced AI chatbots
- Analysis of long documents
- Open-source AI development
Deep Analysis
Release Date
March 2, 2026
Parameters
9B
Dense model — all parameters active
Architecture
Gated DeltaNet + Gated Attention (32 layers)
Context Window
262,144 tokens (native), ~1M via YaRN
Modalities
Text, Image, Video
VRAM (Q4)
~5-6 GB
VRAM (BF16)
~18 GB
Languages
201
License
Apache 2.0
GPQA Diamond
81.7
Beats Qwen3-30B (73.4) and Qwen3-80B (77.2)
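The context figures above (262,144 tokens native, ~1M via YaRN) imply a length-scaling factor of roughly 4. A minimal sketch of that arithmetic, assuming "~1M" means 2**20 tokens; the function name is hypothetical and this is not a configuration for any particular inference library:

```python
# Context lengths from the spec list above.
NATIVE_CONTEXT = 262_144       # tokens, native window
EXTENDED_CONTEXT = 1_048_576   # assumption: "~1M" read as 2**20 tokens


def yarn_scaling_factor(target_len: int, native_len: int) -> float:
    """Ratio of the extended context length to the native one.

    YaRN-style extension is parameterised by this ratio; the exact
    config key that carries it depends on the serving stack.
    """
    return target_len / native_len


factor = yarn_scaling_factor(EXTENDED_CONTEXT, NATIVE_CONTEXT)
print(f"scaling factor: {factor}")
```

If "~1M" is instead read as 1,000,000 tokens, the same calculation gives a factor of about 3.8.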
Strengths
- Beats Qwen3-30B (roughly 3x its size) on GPQA Diamond (81.7 vs 73.4), IFEval (91.5 vs 88.9), LongBench v2 (55.2 vs 44.8)
- Leads GPT-5-Nano on vision by double-digit margins: MMMU-Pro +13, MathVision +17, OmniDocBench +32
- Runs on modest hardware: ~5GB at Q4, fits on an RTX 3060 or M1 Mac
- Natively multimodal, with video support from the same weights; no separate VL variant
- Apache 2.0 license with thinking/non-thinking mode toggle
Weaknesses
- Coding benchmarks trail larger models: LiveCodeBench 65.6 vs GPT-OSS-120B's 82.7
- 9B parameters inherently limited for the most complex multi-step reasoning
- Vision encoder quality degrades on low-resolution or heavily compressed images
- Community reports occasional instability with Ollama integration
- Not yet available as a major cloud API (primarily self-hosted)
Competitor Comparison
| Model | Arena | SWE | GPQA | Availability |
|---|---|---|---|---|
| GPT-5-Nano | ~1350 | ~55 | ~78 | Proprietary |
| Qwen3-30B | ~1360 | ~58 | 73.4 | Open-source |
| Qwen3.5-9B | ~1370 | ~60 | 81.7 | Open-source |
| Gemma 3 12B | ~1350 | ~56 | ~75 | Open-source |
| Llama 3.3 8B | ~1340 | ~52 | ~70 | Open-source |
Qwen3.5-9B is the standout model in the Qwen3.5 Small Series: a 9B dense model that performs well above its size class. It beats the previous-generation Qwen3-30B (roughly 3x its size) on GPQA Diamond (81.7 vs 73.4), IFEval (91.5 vs 88.9), and LongBench v2 (55.2 vs 44.8), and leads GPT-5-Nano on vision tasks by double-digit margins. At ~5GB of VRAM at Q4, it fits on modest hardware such as an RTX 3060 or an M1 Mac.
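The VRAM figures on this page (~18 GB at BF16, ~5-6 GB at Q4) follow from simple arithmetic on the parameter count. A rough sketch, assuming weights dominate memory use and that a Q4 quantization averages about 4.5 bits per parameter once scale metadata is included; that bit width is an assumption, not a figure from the source, and the function name is hypothetical:

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    Ignores KV cache, activations, and framework overhead, which add
    to the real footprint and grow with context length.
    """
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 9e9  # 9B dense parameters, as listed on this page

bf16_gb = weight_memory_gb(PARAMS, 16)   # 2 bytes per parameter
q4_gb = weight_memory_gb(PARAMS, 4.5)    # assumed effective bits for Q4

print(f"BF16 weights: ~{bf16_gb:.1f} GB")
print(f"Q4 weights:   ~{q4_gb:.1f} GB")
```

The estimate covers weights only, so actual usage will be somewhat higher, especially with long contexts or image and video inputs.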
Analysis generated: 2026-05-24