Back to Models
Zhipu AIOpen Source

GLM-4.6V 106B-A12B

GLM-4.6V 106B-A12B is a multimodal foundation model developed by Zhipu AI. It is equipped with a 128K context window and is released under the MIT license.

Parameters

1080.0B

Context Window

128K

License

MIT

Release Date

2025-12-08

API Pricing

API pricing for this model is not yet available

Strengths

  • Powerful multimodal processing abilities
  • Wide context window of 128K tokens
  • Openness due to MIT license

Weaknesses

  • Load from vast parameters
  • High necessity for computational resources
  • Inference cost associated with model scale

Use Cases

  • Advanced image and text analysis
  • Understanding of long documents
  • Multimodal AI development

Deep Analysis

Parameters

108B total / 12B active MoE

Context Window

128K tokens

AA Intelligence Index

23

Pricing

$0.30/$0.90 per 1M tokens

License

MIT

Native Function Call

Yes (first GLM VLM)

Release Date

December 2025

Strengths

  • First GLM VLM with native multimodal function calling
  • 128K context for 150+ pages or 1hr video
  • MIT license
  • Competitive pricing ($0.30/$0.90)

Weaknesses

  • Slow output (36.7 tok/s)
  • Verbose output (90M tokens for eval)
  • AA Index 23 moderate
  • Superseded by GLM-5V Turbo

Competitor Comparison

ModelArena
Gemini 3 Pro~1449
Qwen3-VL-235BN/A
GLM-4.5V$0.60/$1.80

GLM-4.6V is a multimodal VLM with 108B/12B active params, 128K context, native multimodal function calling. MIT license.

Sources

Analysis generated: 2026-05-24