Zhipu AI
Open Source
GLM-Image
GLM-Image is a large vision model developed by Zhipu AI. It has approximately 160B parameters and is an open multimodal model released under the MIT license.
Parameters
160.0B
Context Window
4K
License
MIT
Release Date
2026-01-14
API Pricing
API pricing for this model is not yet available
Strengths
- Large 160B parameter scale
- Openness via MIT license
- Advanced visual understanding capabilities
Weaknesses
- Limited 4K context length
- Large 35.8GB file size
- High computational resource requirements
Use Cases
- Advanced image analysis and understanding
- Extraction and processing of visual information
- Multimodal AI development
Deep Analysis
Architecture
Autoregressive (9B) + Diffusion (7B)
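As a rough illustration of what the two component sizes above imply for weight storage, the sketch below multiplies parameter count by bytes per parameter. The 16-bit default (2 bytes per parameter) is an assumption, since this page does not state the released precision, and `weight_memory_gb` is a hypothetical helper name. The estimate covers weights only, not activations or any auxiliary components.

```python
# Component sizes as listed under Architecture on this page.
AR_PARAMS = 9e9          # autoregressive part
DIFFUSION_PARAMS = 7e9   # diffusion part


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights (an assumption, not stated
    in the source); use 4 for 32-bit weights.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    total = AR_PARAMS + DIFFUSION_PARAMS
    print(f"16-bit weights: {weight_memory_gb(total):.1f} GB")
    print(f"32-bit weights: {weight_memory_gb(total, 4):.1f} GB")
```

Under these assumptions the two components together need about 32 GB at 16-bit precision and about 64 GB at 32-bit precision.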
| Benchmark | Score | Rank |
|---|---|---|
| CVTG-2K Word Accuracy | 0.9116 | #1 open-source |
| LongText-Bench EN | 0.9524 | #1 open-source |
| LongText-Bench CN | 0.9788 | #1 open-source |
Price
$0.015 per image
License
Apache 2.0
Release Date
January 9, 2026
Strengths
- Open-source SOTA text rendering (#1 on CVTG-2K and LongText-Bench)
- Hybrid architecture combines semantics + detail
- Excels at knowledge-intensive generation
- Very affordable ($0.015/image)
Weaknesses
- General quality matches but doesn't surpass mainstream models
- Max 2048px resolution
- Smaller community (912 GitHub stars)
Competitor Comparison
| Model | Price |
|---|---|
| DALL-E 3 | $0.04-$0.08/image |
| Midjourney v6 | Subscription |
| Stable Diffusion 3 | Free (self-host) |
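To put the per-image prices above in concrete terms, here is a minimal sketch that computes the cost of a batch at each listed rate. The prices come from this page; the function names and the batch size of 1,000 are illustrative assumptions. Midjourney v6 and Stable Diffusion 3 are left out because the table gives no per-image figure for them.

```python
# Per-image prices in USD, as quoted on this page.
GLM_IMAGE_PRICE = 0.015
DALLE3_PRICE_RANGE = (0.04, 0.08)  # low and high end of the listed range


def batch_cost(price_per_image: float, num_images: int) -> float:
    """Return the total cost in USD of generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Round to cents to avoid floating-point noise in the result.
    return round(price_per_image * num_images, 2)


def compare(num_images: int) -> dict:
    """Compare batch cost for GLM-Image and both ends of the DALL-E 3 range."""
    low, high = DALLE3_PRICE_RANGE
    return {
        "glm_image": batch_cost(GLM_IMAGE_PRICE, num_images),
        "dalle3_low": batch_cost(low, num_images),
        "dalle3_high": batch_cost(high, num_images),
    }


if __name__ == "__main__":
    print(compare(1000))
```

For 1,000 images this gives $15 for GLM-Image against $40 to $80 for DALL-E 3 at the listed rates.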
GLM-Image is a hybrid autoregressive + diffusion image generator. It ranks #1 among open-source models in text rendering accuracy, costs $0.015 per image, and is released under Apache 2.0.
Analysis generated: 2026-05-24