Zhipu AI | Open Source

GLM-OCR

GLM-OCR is a vision-language model for optical character recognition (OCR) and document parsing, developed by Zhipu AI. With approximately 0.9 billion parameters, it is an open multimodal model released under the Apache 2.0 license.

Parameters

0.9B

Context Window

8K

License

Apache 2.0

Release Date

2026-02-03

API Pricing

Approximately $0.03 per 1M tokens

Strengths

  • Purpose-built for visual text recognition
  • Compact 0.9-billion-parameter scale
  • Available under an open license (Apache 2.0)

Weaknesses

  • Context window limited to 8K tokens
  • General-purpose versatility is unproven because of its OCR specialization
  • Requires consistent memory consumption

Use Cases

  • Advanced in-image text recognition
  • Digitizing visual information
  • Automating document analysis

Deep Analysis

Parameters

0.9B (900M)

OmniDocBench v1.5

94.62

#1 overall

OCRBench

94.0

Throughput

1.86 pages/sec (PDF)

Pricing

~$0.03/1M tokens

Languages

8 languages

Release Date

February 2026

Strengths

  • State-of-the-art OmniDocBench v1.5 score (94.62) with only 0.9B parameters
  • Roughly 1/10 the cost of traditional OCR
  • Multi-Token Prediction gives a ~50% throughput boost
  • Supports 8 languages and is edge-deployable

Weaknesses

  • Highly specialized for OCR and document parsing
  • No support for video or interactive tasks
  • Key information extraction (KIE) trails Gemini-3-Pro

Competitor Comparison

Model               Price
PaddleOCR-VL-1.5    N/A
MinerU2.5           N/A
DeepSeek-OCR        N/A

GLM-OCR is a 0.9B-parameter model that achieves state-of-the-art results on OmniDocBench v1.5 (94.62), surpassing models as large as 235B parameters. It costs approximately $0.03 per 1M tokens and supports 8 languages.
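To put the throughput and price figures quoted on this page into practical terms, here is a rough back-of-the-envelope sketch. The function name `estimate_batch` and the `tokens_per_page` value are illustrative assumptions, not figures from this page; real token counts per page depend on document density and how the API meters tokens.

```python
# Rough estimate built only from the figures quoted on this page.
PAGES_PER_SECOND = 1.86        # PDF throughput quoted above
USD_PER_MILLION_TOKENS = 0.03  # approximate price quoted above


def estimate_batch(pages: int, tokens_per_page: int) -> tuple[float, float]:
    """Return (seconds, usd) for processing `pages` pages.

    tokens_per_page is an assumed average supplied by the caller,
    not a number published for GLM-OCR.
    """
    seconds = pages / PAGES_PER_SECOND
    usd = pages * tokens_per_page / 1_000_000 * USD_PER_MILLION_TOKENS
    return seconds, usd


if __name__ == "__main__":
    # Hypothetical workload: 10,000 pages at an assumed 1,500 tokens each.
    seconds, usd = estimate_batch(pages=10_000, tokens_per_page=1_500)
    print(f"{seconds / 60:.1f} min, ${usd:.2f}")
```

Under these assumptions, a 10,000-page batch works out to about an hour and a half of processing for well under one dollar; adjust `tokens_per_page` to match your own documents.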


Analysis generated: 2026-05-24