Zhipu AI · Open Source
GLM-OCR
GLM-OCR is a vision-language model for optical character recognition (OCR) developed by Zhipu AI. With approximately 0.9 billion parameters, it is an open multimodal model released under the Apache 2.0 license.
Parameters
0.9B
Context Window
8K
License
Apache 2.0
Release Date
2026-02-03
API Pricing
API pricing for this model is not yet available
Strengths
- Design specialized for visual text recognition
- Compact 0.9B-parameter scale
- Available under an open license
Weaknesses
- Limited 8K context length
- Versatility outside OCR is unknown because of its specialization
- Requires a consistent memory budget at inference time
Use Cases
- Advanced in-image text recognition
- Digitizing visual information
- Automating document analysis
Deep Analysis
Parameters
0.9B (900M)
OmniDocBench v1.5
94.62
#1 overall
OCRBench
94.0
Throughput
1.86 pages/sec (PDF)
Pricing
~$0.03/1M tokens
Languages
8 languages
Release Date
2026-02-03
Strengths
- State-of-the-art OmniDocBench v1.5 score (94.62) with only 0.9B parameters
- Roughly 1/10 the cost of traditional OCR
- Multi-Token Prediction gives a ~50% throughput boost
- Supports 8 languages and is edge-deployable
Weaknesses
- Highly specialized for OCR
- No support for video or interactive tasks
- Key information extraction (KIE) trails Gemini-3-Pro
Competitor Comparison
| Model | Price |
|---|---|
| PaddleOCR-VL-1.5 | N/A |
| MinerU2.5 | N/A |
| DeepSeek-OCR | N/A |
GLM-OCR is a 0.9B-parameter model that achieves state-of-the-art results on OmniDocBench v1.5 (94.62), surpassing 235B-parameter models. It is priced at roughly $0.03 per 1M tokens and supports 8 languages.
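The throughput and pricing figures quoted above can be turned into a rough batch estimate. The sketch below is illustrative arithmetic only: it assumes the 1.86 pages/sec and ~$0.03/1M tokens figures hold for your workload, and the `tokens_per_page` value is a hypothetical placeholder, not a number from this page.

```python
# Rough batch estimate from the figures listed on this page.
PAGES_PER_SEC = 1.86           # PDF throughput quoted above
USD_PER_MILLION_TOKENS = 0.03  # approximate price quoted above


def estimate_batch(pages: int, tokens_per_page: int) -> tuple[float, float]:
    """Return (seconds, usd) for processing `pages` pages.

    `tokens_per_page` is an assumed average (hypothetical); measure it
    on your own documents before relying on the cost figure.
    """
    seconds = pages / PAGES_PER_SEC
    usd = pages * tokens_per_page * USD_PER_MILLION_TOKENS / 1_000_000
    return seconds, usd


if __name__ == "__main__":
    secs, usd = estimate_batch(pages=10_000, tokens_per_page=1_000)
    print(f"~{secs / 60:.1f} minutes, ~${usd:.2f}")
```

Actual throughput depends on hardware, page complexity, and serving setup, so treat the result as an order-of-magnitude guide.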
Analysis generated: 2026-05-24