Zhipu AI · Open Source
GLM-4.6V 106B-A12B
GLM-4.6V 106B-A12B is a multimodal foundation model developed by Zhipu AI. It has a 128K-token context window and is released under the MIT license.
Parameters
108B total (12B active, MoE)
Context Window
128K
License
MIT
Release Date
2025-12-08
API Pricing
$0.30/$0.90 per 1M tokens
Strengths
- Strong multimodal processing
- Wide context window of 128K tokens
- Permissive MIT license
Weaknesses
- Heavy memory footprint from the large parameter count
- High computational resource requirements
- Inference cost that grows with model scale
Use Cases
- Advanced image and text analysis
- Long-document understanding
- Multimodal AI development
Deep Analysis
Parameters
108B total / 12B active MoE
Context Window
128K tokens
AA Intelligence Index
23
Pricing
$0.30/$0.90 per 1M tokens
License
MIT
Native Function Call
Yes (first GLM VLM)
Release Date
December 2025
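To put the 108B total / 12B active split in perspective, the sketch below estimates weight storage at a few common precisions and the share of parameters active per token. The bytes-per-parameter values are the usual sizes for those formats and are an assumption, not something the source states. The estimate covers weights only and ignores KV cache and activations.

```python
TOTAL_PARAMS = 108e9   # total parameters, from the spec above
ACTIVE_PARAMS = 12e9   # parameters active per token (MoE), from the spec above

# Assumed storage size per parameter for each format; not from the source.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def active_fraction(total: float, active: float) -> float:
    """Share of parameters used for each token in the MoE forward pass."""
    return active / total


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, size):.0f} GB of weights")
    print(f"active per token: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
```

All 108B parameters must be resident for inference even though only about one ninth of them are used per token, which is why the memory requirement tracks the total count rather than the active count.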
Strengths
- First GLM VLM with native multimodal function calling
- 128K context, enough for 150+ pages or 1 hour of video
- MIT license
- Competitive pricing ($0.30/$0.90 per 1M tokens)
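As a rough illustration of the listed pricing, the following sketch computes the cost of a single request. It assumes the first figure ($0.30) is the input price and the second ($0.90) is the output price, the usual ordering for such listings. The function name and the example token counts are hypothetical.

```python
INPUT_PRICE_PER_M = 0.30   # USD per 1M input tokens (assumed to be the first listed figure)
OUTPUT_PRICE_PER_M = 0.90  # USD per 1M output tokens (assumed to be the second listed figure)


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed per-1M-token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: a long document in, a short summary out.
    print(f"${request_cost_usd(100_000, 2_000):.4f}")
```

Under these assumptions, filling most of the 128K window with input costs only a few cents per request, with output tokens priced at three times the input rate.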
Weaknesses
- Slow output (36.7 tokens/s)
- Verbose output (90M tokens used for evaluation)
- Moderate AA Intelligence Index score (23)
- Superseded by GLM-5V Turbo
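The output speed above translates into wall-clock time as follows. This is a minimal sketch that assumes a constant 36.7 tokens/s and ignores time to first token, neither of which the source specifies. The function name and example lengths are hypothetical.

```python
OUTPUT_TOKENS_PER_SECOND = 36.7  # output speed listed above


def generation_seconds(output_tokens: int,
                       tokens_per_second: float = OUTPUT_TOKENS_PER_SECOND) -> float:
    """Approximate time to stream a response of the given length.

    Assumes a constant decode rate and excludes time to first token.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (500, 1_000, 4_000):  # hypothetical response lengths
        print(f"{n} tokens: ~{generation_seconds(n):.1f} s")
```

Because the model is also noted as verbose, longer responses compound the slow decode rate, so capping output length is one way to keep latency predictable.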
Competitor Comparison
| Model | Arena score | Pricing (per 1M tokens) |
|---|---|---|
| Gemini 3 Pro | ~1449 | not listed |
| Qwen3-VL-235B | N/A | not listed |
| GLM-4.5V | not listed | $0.60/$1.80 |
GLM-4.6V is a vision-language model (VLM) with 108B total and 12B active parameters, a 128K-token context window, and native multimodal function calling. It is released under the MIT license.
Analysis generated: 2026-05-24