Zhipu AI · Open Source

GLM-ASR-Nano-2512

Name: GLM-ASR-Nano-2512
Author: Zhipu AI

GLM-ASR-Nano-2512 is an automatic speech recognition (ASR) model developed by Zhipu AI. It has approximately 1.5B parameters and is released under the Apache 2.0 license.

Parameters

1.5B

Context Window

License

Apache 2.0

Release Date

2025-12-10

API Pricing

API pricing for this model is not yet available.

Strengths

・Compact parameter scale of 1.5B
・Open usage under Apache 2.0 license
・Efficient model file size

Weaknesses

・Details of specialized functions are unknown
・Lack of specific operational cost metrics
・No information on multilingual support range

Use Cases

・Building advanced speech recognition systems
・Transcribing audio data to text
・Open-source audio AI development

Deep Analysis

Model Type

Automatic Speech Recognition (ASR)

Parameters

1.5B

Average Error Rate

4.10 (lowest among comparable models)

Languages

17 (WER ≤ 20%)

License

Apache 2.0

GitHub Stars

806
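The "17 (WER ≤ 20%)" figure above reads as a count of languages whose word error rate stays at or below a 20% cutoff. A minimal sketch of that counting rule follows, assuming per-language WER values are available as percentages. The function name and the language keys are hypothetical, and the numbers are placeholders for illustration, not measured results for this model.

```python
def languages_within_threshold(wer_by_language: dict[str, float],
                               max_wer: float = 20.0) -> list[str]:
    """Return the languages whose WER (in percent) is at or below max_wer."""
    return sorted(
        lang for lang, wer in wer_by_language.items() if wer <= max_wer
    )


# Placeholder values for illustration only; NOT benchmark results.
example_wer = {"lang_a": 4.2, "lang_b": 20.0, "lang_c": 27.5}

supported = languages_within_threshold(example_wer)
print(len(supported), supported)
```

With a real evaluation table in place of `example_wer`, the length of the returned list would correspond to the "Languages" figure under the same cutoff.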

Strengths

・Open-source with Apache 2.0 license
・Compact 1.5B parameter model suitable for edge deployment
・Outperforms Whisper V3 on Chinese benchmarks
・Exceptional Cantonese and dialect recognition
・Low-volume speech robustness for quiet environments

Weaknesses

・1.5B parameters still require significant compute for edge devices
・Primarily optimized for Chinese language family
・English performance may lag behind specialized English models
・Requires transformers 5.0.0 from source for best results
・Model weight format changed after December 27, 2025
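Because the listing notes a dependency on transformers 5.0.0 built from source, a quick environment check can help before loading the model. The sketch below uses only the Python standard library. The helper names are hypothetical, and the comparison looks only at the leading numeric part of the version string, so a source build that reports something like `5.0.0.dev0` is treated as satisfying the minimum (an assumption made here for convenience, not a rule stated by the source).

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple[int, ...]:
    """Leading numeric components of a version, e.g. '5.0.0.dev0' -> (5, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = "5.0.0") -> bool:
    """Compare numeric release components, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def transformers_ok(minimum: str = "5.0.0") -> bool:
    """True if an installed transformers package meets the minimum version."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False  # package is not installed at all
    return meets_minimum(installed, minimum)


if __name__ == "__main__":
    print("transformers >= 5.0.0:", transformers_ok())
```

This check does not detect the weight format change dated December 27, 2025; it only reports the installed library version.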

Competitor Comparison

Model	Arena	SWE	GPQA	Price
OpenAI Whisper V3 Large	N/A	N/A	N/A	Open source
Whisper V3 Small	N/A	N/A	N/A	Open source
Moonshine ASR	N/A	N/A	N/A	Open source
NVIDIA Canary 1B	N/A	N/A	N/A	Open source

Overview

GLM-ASR-Nano-2512 is Zhipu AI's open-source speech recognition model with 1.5B parameters, achieving the lowest average error rate (4.10) among comparable open-source models. Released under Apache 2.0, it excels at Chinese, English, and Cantonese recognition with unique low-volume speech robustness. Available on Hugging Face and ModelScope.

Benchmarks & Performance

Lowest average error rate of 4.10 among comparable open-source models. Significant advantages on Chinese benchmarks (Wenet Meeting, Aishell-1). Outperforms OpenAI Whisper V3 on multiple benchmarks while maintaining compact 1.5B size. Designed for real-world complexity with noise and overlapping speech.
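For readers unfamiliar with how ASR error rates are typically derived, the sketch below computes word error rate (WER) and character error rate (CER) from an edit distance and then averages several scores. It is a generic illustration: the source does not state whether the 4.10 figure is WER, CER, or a mix, nor how it is averaged across benchmarks, so the unweighted mean used here is an assumption. All function names are hypothetical.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def error_rate(reference: str, hypothesis: str, unit: str = "word") -> float:
    """WER (unit='word') or CER (unit='char'), returned as a percentage."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    else:
        # Character level: spaces are dropped, which suits languages like Chinese.
        ref = list(reference.replace(" ", ""))
        hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def average_error_rate(rates: list[float]) -> float:
    """Unweighted mean over per-benchmark error rates (an assumed convention)."""
    return sum(rates) / len(rates)


wer = error_rate("the cat sat on the mat", "the cat sit on mat")
cer = error_rate("你好世界", "你好时界", unit="char")
print(f"WER: {wer:.2f}%  CER: {cer:.2f}%")
```

Chinese benchmarks are commonly scored at the character level, while English benchmarks are usually scored at the word level, which is why both units are shown.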

Detailed Comparison

Directly competes with Whisper V3 Large at a similar scale (roughly 1.5B parameters each). Better on Chinese and dialect benchmarks, comparable on English. At 1.5B parameters it is larger than NVIDIA Canary 1B rather than more compact. Key advantages are dialect support and low-volume speech handling. The trade-off is a possible performance gap on English relative to English-specialized models.

Community Feedback

806 GitHub stars with active development. Community appreciates open-source availability and Apache 2.0 license. Used in AutoGLM and Zhipu AI Input Method products. Developers note the model weight format change as a migration concern.

Use Cases

Ideal for on-device speech recognition, Chinese language applications, dialect-aware transcription, quiet environment recording, and edge AI deployments. The open-source nature enables customization and fine-tuning for specific domains. Best for Chinese-focused applications where compact model size matters.

Latest News

Released December 2025. Model weights updated December 27, 2025. Integrated into Hugging Face transformers library. Last GitHub activity March 6, 2026.

Analysis generated: 2026-05-24