Zhipu AI · Open Source

GLM-Image

GLM-Image is an image generation model developed by Zhipu AI. It pairs a 9B autoregressive component with a 7B diffusion component, for approximately 16B parameters in total, and is an open model released under the MIT license.

Parameters

16.0B

Context Window

4K

License

MIT

Release Date

2026-01-14

API Pricing

API pricing for this model is not yet available

Strengths

  • 16B parameter scale (9B autoregressive + 7B diffusion)
  • Openness via MIT license
  • Advanced visual understanding capabilities

Weaknesses

  • Limited 4K context length
  • Large 35.8GB file size
  • High computational resource requirements

Use Cases

  • Advanced image analysis and understanding
  • Extraction and processing of visual information
  • Multimodal AI development

Deep Analysis

Architecture

Autoregressive (9B) + Diffusion (7B)
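
The two component sizes add up to roughly 16B parameters. As a rough sanity check, the sketch below estimates the weight footprint under the assumption of 2 bytes per parameter (16-bit weights; the storage precision is not stated on this page). The estimate lands in the same range as the 35.8GB file size listed under Weaknesses; any difference would come from parts of the checkpoint not counted here.

```python
# Rough footprint estimate from the component sizes listed above.
AR_PARAMS = 9e9          # autoregressive component (from this page)
DIFFUSION_PARAMS = 7e9   # diffusion component (from this page)
BYTES_PER_PARAM = 2      # assumption: 16-bit weights (bf16/fp16)


def weight_footprint_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the approximate weight size in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


total_params = AR_PARAMS + DIFFUSION_PARAMS
print(f"total parameters: {total_params / 1e9:.0f}B")
print(f"estimated weights: {weight_footprint_gb(total_params):.0f} GB")
```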

CVTG-2K Word Accuracy

0.9116

#1 open-source
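
For readers unfamiliar with the metric, word accuracy is commonly understood as the fraction of target words that appear correctly in the text read back from a generated image. The sketch below illustrates only that general idea. The exact CVTG-2K protocol (OCR engine, normalization, matching rules) is not described on this page, and `word_accuracy` is a hypothetical helper, not part of any benchmark toolkit.

```python
from collections import Counter


def word_accuracy(target_words: list[str], recognized_words: list[str]) -> float:
    """Fraction of target words found in the recognized text (case-insensitive).

    Each recognized word can match at most one target word.
    """
    if not target_words:
        return 0.0
    available = Counter(w.lower() for w in recognized_words)
    matched = 0
    for word in target_words:
        key = word.lower()
        if available[key] > 0:
            available[key] -= 1
            matched += 1
    return matched / len(target_words)


# Hypothetical example: the read-back text misspells one of four words.
print(word_accuracy(["grand", "opening", "this", "friday"],
                    ["Grand", "Opening", "this", "fridav"]))  # 0.75
```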

LongText-Bench EN

0.9524

#1 open-source

LongText-Bench CN

0.9788

#1 open-source

Price

$0.015 per image

License

Apache 2.0

Release Date

January 9, 2026

Strengths

  • Open-source SOTA text rendering (#1 CVTG-2K, LongText-Bench)
  • Hybrid architecture combines semantics + detail
  • Excels at knowledge-intensive generation
  • Very affordable ($0.015/image)

Weaknesses

  • General image quality is on par with mainstream models but does not surpass them
  • Max 2048px resolution
  • Smaller community (912 GitHub stars)

Competitor Comparison

Model                Price
DALL-E 3             $0.04-$0.08/image
Midjourney v6        Subscription
Stable Diffusion 3   Free (self-host)
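
To put the per-image prices in perspective, the sketch below computes the cost of a 1,000-image batch using only the figures listed on this page. Midjourney v6 (subscription) and Stable Diffusion 3 (self-hosted) are left out because no per-image price is given for them; the batch size and the `batch_cost` helper are illustrative assumptions.

```python
# Per-image prices taken from this page (USD), as (low, high) ranges.
PRICES_PER_IMAGE = {
    "GLM-Image": (0.015, 0.015),
    "DALL-E 3": (0.04, 0.08),  # low and high end of the listed range
}


def batch_cost(model: str, num_images: int) -> tuple[float, float]:
    """Return the (low, high) cost in USD for generating num_images images."""
    low, high = PRICES_PER_IMAGE[model]
    return round(low * num_images, 2), round(high * num_images, 2)


for model in PRICES_PER_IMAGE:
    low, high = batch_cost(model, 1000)
    print(f"{model}: ${low:.2f} to ${high:.2f} per 1,000 images")
```

At the listed prices, 1,000 images cost $15 with GLM-Image versus $40 to $80 with DALL-E 3.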

GLM-Image is a hybrid autoregressive + diffusion image generator. It ranks #1 among open-source models in text rendering accuracy, costs $0.015 per image, and is released under Apache 2.0.

Analysis generated: 2026-05-24