Zhipu AI · Proprietary

GLM-ASR-2512

GLM-ASR-2512 is a large speech recognition model developed by Zhipu AI. It is offered as a closed-source model.

Parameters

Undisclosed

Context Window

License

Proprietary

Release Date

2025-12-10

API Pricing

API pricing for this model is not yet available

Strengths

  • Cutting-edge audio processing capabilities
  • Advanced design by Zhipu AI
  • Latest model architecture

Weaknesses

  • Non-open-source license
  • Opaque internal structure
  • Potential usage restrictions

Use Cases

  • Advanced speech recognition tasks
  • Analysis and processing of audio data
  • Development of next-generation audio AI

Deep Analysis

Model Type

Automatic Speech Recognition (ASR)

Parameters

1.5B (Nano variant; undisclosed for the API model)

CER

0.0717, i.e. 7.17% (described as industry-leading)

Languages

17 languages with word error rate (WER) ≤ 20%

Audio Duration Limit

≤ 30 seconds

File Size Limit

≤ 25 MB

Strengths

  • Industry-leading CER of 0.0717
  • Exceptional dialect support including Cantonese
  • Robust to low-volume speech (whispered or quiet speech)
  • Outperforms OpenAI Whisper V3 on multiple benchmarks
  • Custom dictionary support for specialized terminology

Weaknesses

  • 30-second audio duration limit per request
  • 25 MB file size limit
  • Primarily optimized for Chinese/English markets
  • Closed-source API model (the Nano variant is open-source)
  • Audio longer than 30 seconds must be split across multiple requests
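
The duration and size limits above mean longer recordings have to be checked and split on the client before upload. The following is a minimal sketch of that planning step, not Zhipu AI's client code: the function names `fits_limits` and `plan_chunks` are hypothetical, and reading "25 MB" as 25,000,000 bytes is an assumption, since the limit's exact byte count is not stated here.

```python
import math

MAX_SECONDS = 30.0       # per-request duration limit listed above
MAX_BYTES = 25_000_000   # assumption: "25 MB" read as decimal megabytes


def fits_limits(duration_s: float, size_bytes: int) -> bool:
    """Return True if a single clip satisfies both per-request limits."""
    return duration_s <= MAX_SECONDS and size_bytes <= MAX_BYTES


def plan_chunks(total_s: float, max_s: float = MAX_SECONDS) -> list[tuple[float, float]]:
    """Split a recording of total_s seconds into equal (start, end) windows.

    The smallest number of windows that keeps each one at or under max_s
    is used, and the windows are made equal in length so that no tiny
    trailing chunk is left over.
    """
    if total_s <= 0:
        return []
    count = math.ceil(total_s / max_s)
    step = total_s / count
    return [(i * step, min((i + 1) * step, total_s)) for i in range(count)]
```

For example, `plan_chunks(95.0)` yields four windows of 23.75 seconds each. Fixed boundaries can cut a word in half, so a real pipeline would likely move each boundary to a nearby pause; that refinement is not shown.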

Competitor Comparison

Model | Arena | SWE | GPQA | Price
OpenAI Whisper V3 Large | N/A | N/A | N/A | $0.006/min
Google Cloud Speech-to-Text V2 | N/A | N/A | N/A | $0.016/min
Azure Speech to Text | N/A | N/A | N/A | $1/hour
AssemblyAI Universal-2 | N/A | N/A | N/A | $0.015/min
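
The competitor prices are quoted in mixed units (per minute and per hour), which makes them hard to compare at a glance. The sketch below normalizes them to a per-minute rate and estimates the cost of a given amount of audio. The figures are copied from the table; the names `PRICES`, `per_minute`, and `cost` are illustrative, and real bills may differ because of rounding rules, tiers, or minimum charges that are not covered here.

```python
# List prices copied from the comparison table: (amount in USD, billing unit).
PRICES = {
    "OpenAI Whisper V3 Large": (0.006, "min"),
    "Google Cloud Speech-to-Text V2": (0.016, "min"),
    "Azure Speech to Text": (1.0, "hour"),
    "AssemblyAI Universal-2": (0.015, "min"),
}


def per_minute(amount: float, unit: str) -> float:
    """Convert a listed price to USD per minute of audio."""
    if unit == "hour":
        return amount / 60
    if unit == "min":
        return amount
    raise ValueError(f"unknown billing unit: {unit}")


def cost(model: str, audio_minutes: float) -> float:
    """Estimate the list-price cost of transcribing audio_minutes of audio."""
    amount, unit = PRICES[model]
    return per_minute(amount, unit) * audio_minutes


if __name__ == "__main__":
    # Print the estimated cost of ten hours of audio, cheapest first.
    for name in sorted(PRICES, key=lambda m: cost(m, 600)):
        print(f"{name}: ${cost(name, 600):.2f}")
```

On these list prices, Azure's $1/hour works out to roughly $0.017 per minute, slightly above Google's rate.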

GLM-ASR-2512 is Zhipu AI's next-generation speech recognition model. It reports a character error rate (CER) of 0.0717, a level described as internationally leading. It is strongest at Chinese, English, and Cantonese recognition and remains robust in noisy environments and with low-volume speech. The API version supports real-time transcription for meetings, customer service, and document input.
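
To make the CER figure concrete, here is a minimal sketch of how a character error rate is conventionally computed: the character-level edit distance between the reference transcript and the model output, divided by the reference length. It assumes this standard definition; the benchmark, text normalization, and test set behind the 0.0717 figure are not stated here, so this is not a reproduction of that number.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference character
                curr[j - 1] + 1,         # insertion of a hypothesis character
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

For instance, `cer("你好世界", "你好视界")` is 0.25: one substituted character out of four. A CER of 0.0717 corresponds to about seven character errors per hundred reference characters.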

Analysis generated: 2026-05-24