Alibaba | Open Source

Qwen3.5-9B-Instruct

Qwen3.5-9B-Instruct is a chat-focused foundation model developed by Alibaba. It supports a long context window of 128K tokens.

Parameters

9.0B

Context Window

128K

License

MIT

Release Date

2026-02-16

API Pricing

API pricing for this model is not yet available

Strengths

  • Support for large context capacity
  • Design specialized for dialogue
  • Available under MIT license

Weaknesses

  • Relatively large model size of approx. 18GB
  • Dependence on computational resources
  • Insufficient optimization for specific tasks

Use Cases

  • Building advanced AI chatbots
  • Analysis of long documents
  • Open-source AI development

Deep Analysis

Release Date

March 2, 2026

Parameters

9B

Dense model — all parameters active

Architecture

Gated DeltaNet + Gated Attention (32 layers)

Context Window

262,144 tokens (native), ~1M via YaRN

Modalities

Text, Image, Video

VRAM (Q4)

~5-6 GB

VRAM (BF16)

~18 GB

Languages

201

License

Apache 2.0

GPQA Diamond

81.7

Beats Qwen3-30B (73.4) and Qwen3-80B (77.2)
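
The VRAM and context figures above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a measurement: it assumes 16 bits per weight for BF16 and 4 bits per weight for Q4, counts weights only, and treats the YaRN extension as a plain ratio of target to native window. The function names are hypothetical and used only for illustration.

```python
# Back-of-the-envelope checks on the spec-sheet figures.
# Bits-per-parameter values are generic assumptions, not measurements of this model.

PARAMS = 9e9                    # 9B dense parameters (from the listing)
NATIVE_CONTEXT = 262_144        # native context window in tokens (from the listing)
EXTENDED_CONTEXT = 1_000_000    # approximate YaRN-extended window (from the listing)


def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


def yarn_scaling_factor(target_tokens: int, native_tokens: int) -> float:
    """Ratio by which the native window is stretched to reach the target."""
    return target_tokens / native_tokens


bf16_gb = weight_memory_gb(PARAMS, 16)  # 16 bits per weight
q4_gb = weight_memory_gb(PARAMS, 4)     # 4 bits per weight, ignoring quantization metadata

print(f"BF16 weights: {bf16_gb:.1f} GB")
print(f"Q4 weights:   {q4_gb:.1f} GB (before KV cache and runtime overhead)")
print(f"YaRN factor:  {yarn_scaling_factor(EXTENDED_CONTEXT, NATIVE_CONTEXT):.2f}x")
```

Under these assumptions the weights come to 18.0 GB in BF16 and 4.5 GB at Q4. The listed ~5-6 GB for Q4 is plausibly the weight footprint plus quantization metadata, KV cache, and runtime buffers, though the listing does not break that down.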

Strengths

  • Beats Qwen3-30B (3x its size) on GPQA Diamond (81.7 vs 73.4), IFEval (91.5 vs 88.9), LongBench v2 (55.2 vs 44.8)
  • Dominates GPT-5-Nano on vision: MMMU-Pro +13, MathVision +17, OmniDocBench +32
  • Runs on nearly any modern GPU: ~5GB at Q4, fits on RTX 3060 or M1 Mac
  • Natively multimodal with video support from same weights — no separate VL variant
  • Apache 2.0 license with thinking/non-thinking mode toggle
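
The benchmark comparison against Qwen3-30B in the list above can be reduced to point margins. This is a small illustrative sketch using only the scores quoted in that list; the `margins` helper is a hypothetical name, not part of any benchmark tooling.

```python
# Scores copied from the Strengths list: (Qwen3.5-9B, Qwen3-30B)
scores = {
    "GPQA Diamond": (81.7, 73.4),
    "IFEval": (91.5, 88.9),
    "LongBench v2": (55.2, 44.8),
}


def margins(pairs):
    """Return the absolute point gap for each benchmark, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (new, old) in pairs.items()}


gaps = margins(scores)
for name, gap in gaps.items():
    print(f"{name}: +{gap} points")
```

On the quoted numbers, the largest gap is on LongBench v2 (10.4 points) and the smallest on IFEval (2.6 points).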

Weaknesses

  • Coding benchmarks trail larger models: LiveCodeBench 65.6 vs GPT-OSS-120B's 82.7
  • 9B parameters inherently limited for the most complex multi-step reasoning
  • Vision encoder quality degrades on low-resolution or heavily compressed images
  • Community reports occasional instability with Ollama integration
  • Not yet available as a major cloud API (primarily self-hosted)

Competitor Comparison

Model          Arena   SWE   GPQA   Price
GPT-5-Nano     ~1350   ~55   ~78    Proprietary
Qwen3-30B      ~1360   ~58   73.4   Open-source
Qwen3.5-9B     ~1370   ~60   81.7   Open-source
Gemma 3 12B    ~1350   ~56   ~75    Open-source
Llama 3.3 8B   ~1340   ~52   ~70    Open-source

Qwen3.5-9B is the standout model in the Qwen3.5 Small Series: a 9B dense model that performs well above its size class. It beats the previous-generation Qwen3-30B (roughly 3x its size) on knowledge, reasoning, and long-context benchmarks, and leads GPT-5-Nano on vision tasks by double-digit margins. At about 5GB of VRAM at Q4, it fits on consumer hardware such as an RTX 3060 or an M1 Mac.

Analysis generated: 2026-05-24