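What are this model's strengths?

Large context support, a conversation-focused design, and availability under the MIT license.

What are this model's weaknesses?

A relatively large model size of about 18GB, dependence on compute resources, and a lack of optimization for specific tasks.

What is it best suited for?

Building advanced AI chatbots, long-document analysis, and open-source AI development.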

Alibaba · Open source

Qwen3.5-9B-Instruct

Name: Qwen3.5-9B-Instruct
Author: Alibaba

Qwen3.5-9B-Instruct is a foundation model developed by Alibaba. It features a conversation-focused design and supports a long 128K context window.

Parameters

9.0B

Context

128K

License

MIT

Release date

2026-02-16

API pricing

API pricing information for this model has not been published yet.

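Strengths

・Large context support
・Conversation-focused design
・Released under the MIT license

Weaknesses

・Relatively large model size of about 18GB
・Dependence on compute resources
・Lack of optimization for specific tasks

Use cases

・Building advanced AI chatbots
・Long-document analysis
・Open-source AI development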

In-depth analysis

Release Date

March 2, 2026

Parameters

Dense model (all parameters active)

Architecture

Gated DeltaNet + Gated Attention (32 layers)

Context Window

262,144 tokens (native), ~1M via YaRN

Modalities

Text, Image, Video

VRAM (Q4)

~5-6 GB

VRAM (BF16)

~18 GB

Languages

201

License

Apache 2.0

GPQA Diamond

81.7

Beats Qwen3-30B (73.4) and Qwen3-80B (77.2)
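
The context-window entry above implies a scaling factor, which can be worked out directly. The sketch below assumes "~1M" means 1,048,576 tokens. The `rope_scaling` dictionary follows the key names used in Hugging Face Transformers configs and is shown as a hypothetical fragment, not as this model's confirmed configuration.

```python
# Native context length, taken from the spec list above.
NATIVE_CONTEXT = 262_144

# Assumption: "~1M" is read as 2**20 tokens; the source gives only an approximation.
TARGET_CONTEXT = 1_048_576

# YaRN extends the usable context by a multiplicative factor over the native length.
factor = TARGET_CONTEXT / NATIVE_CONTEXT

# Hypothetical config fragment using the Transformers-style rope_scaling keys.
# Check the model's own documentation before relying on these values.
rope_scaling = {
    "rope_type": "yarn",
    "factor": factor,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(f"YaRN factor: {factor}")
print(rope_scaling)
```

Under these assumptions the factor comes out to 4, so the extended window is four times the native one.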

Strengths

・Beats Qwen3-30B (3x its size) on GPQA Diamond (81.7 vs 73.4), IFEval (91.5 vs 88.9), LongBench v2 (55.2 vs 44.8)
・Dominates GPT-5-Nano on vision: MMMU-Pro +13, MathVision +17, OmniDocBench +32
・Runs on nearly any modern GPU: ~5GB at Q4, fits on RTX 3060 or M1 Mac
・Natively multimodal, with video support from the same weights; no separate VL variant
・Apache 2.0 license with thinking/non-thinking mode toggle

Weaknesses

・Coding benchmarks trail larger models: LiveCodeBench 65.6 vs GPT-OSS-120B's 82.7
・9B parameters inherently limited for the most complex multi-step reasoning
・Vision encoder quality degrades on low-resolution or heavily compressed images
・Community reports occasional instability with Ollama integration
・Not yet available as a major cloud API (primarily self-hosted)

Competitor comparison

Model	Arena score	SWE-bench	GPQA Diamond	Availability
GPT-5-Nano	~1350	~55	~78	Proprietary
Qwen3-30B	~1360	~58	73.4	Open-source
Qwen3.5-9B	~1370	~60	81.7	Open-source
Gemma 3 12B	~1350	~56	~75	Open-source
Llama 3.3 8B	~1340	~52	~70	Open-source

Overview

Qwen3.5-9B is the standout model in the Qwen3.5 Small Series: a 9B dense model that performs far above its size class. It beats the previous-generation Qwen3-30B (roughly 3x its size) on knowledge, reasoning, and long-context benchmarks, and leads GPT-5-Nano on vision tasks by double-digit margins. At ~5GB of VRAM at Q4, it runs on virtually any modern GPU, including the RTX 3060, as well as on an M1 Mac.
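
The VRAM figures quoted in this profile can be sanity-checked with a weights-only estimate: parameter count times bits per weight, divided by 8. This is a rough sketch, not a measurement. It assumes 1 GB = 1e9 bytes and treats Q4 as exactly 4 bits per weight, and the helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Return the memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 9e9  # 9B dense model, so every parameter is stored and active

# BF16 stores each weight in 16 bits.
bf16_gb = weight_memory_gb(PARAMS, 16)

# Assumption: Q4 is exactly 4 bits per weight. Real 4-bit formats also store
# per-block scale data, so actual files are somewhat larger.
q4_gb = weight_memory_gb(PARAMS, 4)

print(f"BF16 weights: {bf16_gb:.1f} GB")
print(f"Q4 weights:   {q4_gb:.1f} GB")
```

The BF16 estimate matches the ~18 GB figure. The Q4 estimate lands below the ~5-6 GB quoted here; a plausible reading is that the remainder covers quantization metadata, the KV cache, and activations, though the source does not break the number down.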

Benchmarks and performance

Exceptional for its size. Text: MMLU-Pro 82.5, GPQA Diamond 81.7, IFEval 91.5, SuperGPQA 58.2, C-Eval 88.2, LongBench v2 55.2, AA-LCR 63.0. Vision: MMMU 78.4, MMMU-Pro 70.1 (vs GPT-5-Nano's 57.2), MathVision 78.9 (vs 62.2), OmniDocBench1.5 87.7 (vs 55.9), VideoMME 84.5 with subtitles. It also beats Qwen3-80B on GPQA Diamond and IFEval despite being roughly 9x smaller.
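
The rounded vision margins cited elsewhere in this profile (+13, +17, +32) can be reproduced from the scores in the paragraph above. The snippet below is a small check that uses only those numbers; the variable names are illustrative.

```python
# Scores copied from the paragraph above, as (Qwen3.5-9B, GPT-5-Nano).
vision_scores = {
    "MMMU-Pro": (70.1, 57.2),
    "MathVision": (78.9, 62.2),
    "OmniDocBench1.5": (87.7, 55.9),
}

# Margin in benchmark points, rounded to one decimal to hide float noise.
margins = {
    name: round(qwen - other, 1)
    for name, (qwen, other) in vision_scores.items()
}

for name, delta in margins.items():
    print(f"{name}: +{delta} points (about +{round(delta)})")
```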

Detailed comparison

The headline comparison: Qwen3.5-9B beats GPT-OSS-120B on MMLU-Pro (82.5 vs 80.8), GPQA Diamond (81.7 vs 80.1), and MMMLU (81.2 vs 78.2) despite a roughly 13x size difference. However, GPT-OSS-120B wins on coding (LiveCodeBench 82.7 vs 65.6). Compared to the 27B, it trails by 4-7 points on benchmarks but needs about one third of the VRAM. Compared to Gemma 3 12B and Llama 3.3 8B, it is clearly superior on both knowledge and vision tasks.
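
The size ratios in this comparison follow from the nominal parameter counts. The sketch below assumes the sizes can be read straight off the model names, which are rounded, and uses "27B" as a label because the source does not give that model's full name.

```python
BASE_B = 9  # Qwen3.5-9B, in billions of parameters

# Assumption: nominal sizes in billions, taken from the model names as written.
nominal_sizes_b = {
    "GPT-OSS-120B": 120,
    "27B": 27,
}

# Ratio of each model's nominal size to the 9B model.
ratios = {name: size / BASE_B for name, size in nominal_sizes_b.items()}

for name, ratio in ratios.items():
    print(f"{name} is about {ratio:.1f}x the size of the 9B model")
```

If VRAM scales roughly linearly with parameter count at the same precision, the 3x ratio to the 27B is in line with the VRAM comparison made above.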

Community reception

Qwen3.5-9B is widely celebrated as one of the best small models available. The '9B beats 120B' narrative generated significant attention, and AI researcher Nathan Lambert called Qwen 3.5 models 'legitimately fantastic.' It is recommended as the best value model for local AI. Some users note the 4B is more popular for pure coding tasks. The vision capabilities are praised for practical tasks like document analysis and screenshot understanding.

Use cases

The sweet-spot model for local AI users. Excellent for agentic coding on RTX 3060 (6GB VRAM at Q4), knowledge Q&A, document understanding, image analysis, video comprehension, long-context processing, and general-purpose assistant tasks. For coding-heavy workflows, consider the 4B (more stable) or 35B-A3B (faster). For creative writing, the 27B produces better prose. The 9B is the best all-around choice for users with limited hardware.