Qwen3.5-122B-A10B
Qwen3.5-122B-A10B is a reasoning model developed by Alibaba. It has about 122 billion total parameters and a long context window of roughly 262K tokens.
Parameters
122B
Context
262K
License
Apache 2.0
Release date
2026-02-25
API pricing
API pricing information for this model is not currently public.
Strengths
- Large capacity with 122B total parameters
- Long-context understanding up to 262K tokens
- Advanced reasoning capability
Weaknesses
- High compute cost due to the large model size
Use cases
- Complex logical reasoning tasks
- Analysis of very long documents
- Advanced knowledge-intensive tasks
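For the long-document use case, a quick pre-check can tell whether a text is likely to fit in the 262,144-token window before it is sent. The sketch below is only a rough estimate: the ratio of 4 characters per token is an assumed heuristic (it varies by language and tokenizer, and is usually lower for Korean or Chinese text), and `reserve_tokens` is a hypothetical budget for the prompt and the answer.

```python
import math

CONTEXT_WINDOW = 262_144  # tokens, from the spec sheet (equals 2**18)
CHARS_PER_TOKEN = 4.0     # ASSUMPTION: rough heuristic, not a tokenizer measurement


def estimate_tokens(text: str) -> int:
    """Rough token estimate derived from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, reserve_tokens: int = 8_192) -> bool:
    """True if the estimated size leaves `reserve_tokens` free in the window."""
    return estimate_tokens(text) <= CONTEXT_WINDOW - reserve_tokens


def split_to_fit(text: str, reserve_tokens: int = 8_192) -> list:
    """Split text into consecutive pieces that each fit the estimated budget."""
    max_chars = int((CONTEXT_WINDOW - reserve_tokens) * CHARS_PER_TOKEN)
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


if __name__ == "__main__":
    document = "x" * 2_000_000  # stand-in for a very long document
    print(fits_in_context(document))
    print(len(split_to_fit(document)))
```

For real workloads the estimate should be replaced by a count from the model's own tokenizer, and the split points should follow document structure (sections, paragraphs) rather than raw character offsets.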
In-depth analysis
Release Date
February 2026
Total Parameters
122B (MoE architecture)
Active Parameters
10B per token (256 experts with selective activation)
Context Window
262,144 tokens
Architecture
Hybrid MoE: Gated DeltaNet + Gated Attention
Modalities
Text, Image, Video
VRAM (Q4)
~70 GB
VRAM (BF16)
~244 GB
License
Apache 2.0
API Availability
DashScope, SiliconFlow, DeepInfra
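The two VRAM figures can be sanity-checked from the parameter count alone. The sketch below counts weight memory only, in decimal GB; it ignores the KV cache, activations, and runtime overhead, which is an assumption about how the quoted numbers were derived. At 16 bits per weight, 122B parameters come to 244 GB, matching the BF16 figure. At exactly 4 bits they come to 61 GB, so the quoted ~70 GB implies roughly 4.6 bits per weight on average, which is plausible if the quantized file stores scale metadata or keeps some tensors at higher precision.

```python
TOTAL_PARAMS = 122e9  # total parameters, from the spec list


def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def implied_bits_per_param(memory_gb: float, params: float) -> float:
    """Average bits per parameter implied by a quoted memory figure."""
    return memory_gb * 1e9 * 8 / params


if __name__ == "__main__":
    print(f"BF16 weights:       {weight_memory_gb(TOTAL_PARAMS, 16):.0f} GB")
    print(f"Ideal 4-bit:        {weight_memory_gb(TOTAL_PARAMS, 4):.0f} GB")
    print(f"Bits implied by 70: {implied_bits_per_param(70, TOTAL_PARAMS):.2f}")
```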
Strengths
- Strong quality-to-compute ratio: 10B active parameters deliver near-frontier performance
- Natively multimodal with text, image, and video support from the same weights
- Fits on multi-GPU consumer setups at Q4 (~70 GB VRAM), making it accessible for serious enthusiasts
- 262K native context with hybrid DeltaNet architecture for fast long-context inference
- Apache 2.0 license enables commercial use and fine-tuning
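The quality-to-compute claim rests on how few parameters are active per token. As a rough illustration, the sketch below computes the active fraction and compares per-token forward compute against a hypothetical dense model of the same total size. The "2 FLOPs per parameter per token" figure is a common rule of thumb and an assumption here; it ignores attention cost, expert routing overhead, and memory bandwidth, so it shows the order of magnitude rather than measured throughput.

```python
TOTAL_PARAMS = 122e9   # total parameters, from the spec list
ACTIVE_PARAMS = 10e9   # parameters active per token, from the spec list


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active / total


def forward_flops_per_token(params: float) -> float:
    """ASSUMPTION: rule-of-thumb estimate of about 2 FLOPs per parameter."""
    return 2.0 * params


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    moe = forward_flops_per_token(ACTIVE_PARAMS)
    dense = forward_flops_per_token(TOTAL_PARAMS)  # hypothetical dense 122B model
    print(f"Active fraction: {frac:.1%}")
    print(f"Compute ratio (dense / MoE): {dense / moe:.1f}x")
```

Under these assumptions about 8% of the weights are used per token, while all 122B weights still have to be held in memory, which is why the VRAM requirement stays high even though per-token compute is low.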
Weaknesses
- Positioned awkwardly between the 397B flagship and the 35B-A3B speed model
- 70 GB VRAM at Q4 still requires a multi-GPU setup (two 24 GB RTX 4090s total only 48 GB, so three or more such cards, or larger-memory GPUs, are needed)
- No dedicated benchmark spotlight; overshadowed by the 397B and 9B in marketing
- Active parameters (10B) may be insufficient for the most demanding reasoning tasks
- Community adoption has been slower compared to the 9B, 27B, and 35B-A3B
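The multi-GPU requirement can be checked with simple division. The sketch below estimates how many identical cards are needed to hold a given memory footprint. The 24 GB figure is the RTX 4090's memory size; `usable_fraction` is a hypothetical knob for headroom left to the KV cache and runtime, not a measured value, and the calculation assumes the model can be split evenly across cards.

```python
import math


def gpus_needed(model_gb: float, gpu_gb: float, usable_fraction: float = 1.0) -> int:
    """Minimum number of identical GPUs whose usable memory covers model_gb."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(model_gb / (gpu_gb * usable_fraction))


if __name__ == "__main__":
    # ~70 GB at Q4 on 24 GB cards, with and without 10% headroom
    print(gpus_needed(70, 24))
    print(gpus_needed(70, 24, usable_fraction=0.9))
    # ~244 GB at BF16 on hypothetical 80 GB cards
    print(gpus_needed(244, 80))
```

With no headroom the Q4 footprint needs three 24 GB cards, and reserving 10% per card pushes that to four, so the practical requirement depends heavily on context length and runtime overhead.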
Competitor comparison
| Model | Arena | SWE | GPQA | Price |
|---|---|---|---|---|
| Qwen3.5-397B-A17B | ~1450 | 76.4 | 88.4 | $0.40/$2.40 |
| Qwen3.5-27B | ~1400 | ~68 | 85.5 | Open-source |
| Llama 4 Scout | ~1380 | ~65 | ~80 | Open-source |
| Qwen3.5-122B-A10B | ~1420 | ~72 | ~86 | Open-source |
| DeepSeek V3 | ~1410 | ~70 | ~82 | $0.14/$0.28 |
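To compare the priced rows in the table on a concrete workload, the sketch below turns a price string into a per-request cost. It assumes the Price column means USD per one million input tokens and per one million output tokens, in that order; the table does not state its units, so treat that reading as an assumption. Rows marked "Open-source" have no listed API price and return `None`.

```python
from typing import Optional, Tuple


def parse_price(price: str) -> Optional[Tuple[float, float]]:
    """Parse '$in/$out' into (input_rate, output_rate); None if not a price."""
    if "/" not in price:
        return None
    input_rate, output_rate = price.replace("$", "").split("/")
    return float(input_rate), float(output_rate)


def request_cost(price: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Cost in USD, ASSUMING the rates are per one million tokens."""
    rates = parse_price(price)
    if rates is None:
        return None
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 100K-token document with a 2K-token answer
    for name, price in [
        ("Qwen3.5-397B-A17B", "$0.40/$2.40"),
        ("DeepSeek V3", "$0.14/$0.28"),
        ("Qwen3.5-122B-A10B", "Open-source"),
    ]:
        print(name, request_cost(price, 100_000, 2_000))
```

Self-hosting costs for the open-weight rows depend on hardware and utilization and are not captured by this calculation.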
Qwen3.5-122B-A10B is the mid-tier MoE model in the Qwen3.5 family, with 122B total parameters and 10B active per token. It provides a strong balance between quality and inference efficiency, fitting on multi-GPU setups at ~70GB VRAM in Q4 quantization. Released under Apache 2.0 in February 2026, it is natively multimodal and benefits from the same hybrid DeltaNet architecture as its larger sibling.
Sources
Analysis generated: 2026-05-24