Zhipu AI · Open source

GLM-ASR-Nano-2512

Name: GLM-ASR-Nano-2512
Author: Zhipu AI

GLM-ASR-Nano-2512 is a speech recognition model developed by Zhipu AI. It has about 1.5B parameters and is released under the Apache 2.0 license.

Parameters

1.5B

Context

License

Apache 2.0

Release date

2025-12-10

API pricing

API pricing for this model has not been published.

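Strengths

・1.5B-parameter scale
・Openly usable under the Apache 2.0 license
・Efficient model file size

Weaknesses

・Details of specialized features not confirmed
・No concrete operating-cost figures
・Per-language coverage not itemized (only a count of 17 languages at WER ≤ 20% is given)

Use cases

・Building advanced speech recognition systems
・Transcribing audio data to text
・Open-source audio AI development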

In-depth analysis

Model Type

Automatic Speech Recognition (ASR)

Parameters

1.5B

Average Error Rate

4.10 (lowest among comparable models)

Languages

17 (WER ≤ 20%)

License

Apache 2.0

GitHub Stars

806

Strengths

・Open-source with Apache 2.0 license
・Compact 1.5B parameter model suitable for edge deployment
・Outperforms Whisper V3 on Chinese benchmarks
・Exceptional Cantonese and dialect recognition
・Low-volume speech robustness for quiet environments

Weaknesses

・1.5B parameters still require significant compute for edge devices
・Primarily optimized for Chinese language family
・English performance may lag behind specialized English models
・Requires transformers 5.0.0 from source for best results
・Model weight format changed after December 27, 2025
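
The transformers 5.0.0 requirement noted above can be checked before loading the model. The following is a minimal sketch, not an official check: the helper names `numeric_release`, `meets_minimum`, and `installed_transformers_ok` are hypothetical, and the comparison looks only at the leading numeric components of the version string, so a development build such as `5.0.0.dev0` is treated as satisfying the minimum.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def numeric_release(version_string):
    """Return the leading numeric components, e.g. '5.0.0.dev0' -> (5, 0, 0)."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="5.0.0"):
    """Compare numeric release tuples; pre-release suffixes are ignored (assumption)."""
    have, need = numeric_release(installed), numeric_release(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))  # pad so '5.0' compares equal to '5.0.0'
    need += (0,) * (width - len(need))
    return have >= need


def installed_transformers_ok(minimum="5.0.0"):
    """True if the transformers package is installed at or above the minimum."""
    try:
        return meets_minimum(version("transformers"), minimum)
    except PackageNotFoundError:
        return False
```

Calling `installed_transformers_ok()` at startup gives an early, readable failure instead of an opaque loading error when an older release is installed.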

Competitor comparison

Model	Arena	SWE	GPQA	Price
OpenAI Whisper V3 Large	N/A	N/A	N/A	Open source
Whisper V3 Small	N/A	N/A	N/A	Open source
Moonshine ASR	N/A	N/A	N/A	Open source
NVIDIA Canary 1B	N/A	N/A	N/A	Open source

Overview

GLM-ASR-Nano-2512 is Zhipu AI's open-source speech recognition model with 1.5B parameters, achieving the lowest average error rate (4.10) among comparable open-source models. Released under Apache 2.0, it excels at Chinese, English, and Cantonese recognition with unique low-volume speech robustness. Available on Hugging Face and ModelScope.

Benchmarks and performance

Lowest average error rate of 4.10 among comparable open-source models. Significant advantages on Chinese benchmarks (Wenet Meeting, Aishell-1). Outperforms OpenAI Whisper V3 on multiple benchmarks while maintaining compact 1.5B size. Designed for real-world complexity with noise and overlapping speech.
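
The page reports error rates without saying how they are scored. As a rough illustration, the sketch below uses the standard definition: edit distance between reference and hypothesis divided by reference length, counted over words (WER) or over characters (CER, commonly used for Chinese). It assumes whitespace tokenization and no text normalization. The function names are hypothetical, and the benchmarks' actual scoring scripts may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra token in hypothesis)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Return WER (unit='word') or CER (unit='char') as a fraction of reference length."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    elif unit == "char":
        ref = list(reference.replace(" ", ""))
        hyp = list(hypothesis.replace(" ", ""))
    else:
        raise ValueError("unit must be 'word' or 'char'")
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One substitution (sat -> sit) and one deletion (the) over six reference words.
wer = error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.2%}")
```

A figure such as 4.10 would then presumably be the mean of per-benchmark rates expressed in percent, though the page does not confirm the averaging method.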

Detailed comparison

Competes directly with Whisper V3 Large at the same scale (1.5B vs 1.5B parameters). It does better on Chinese and dialect benchmarks and is comparable on English. At 1.5B parameters it is somewhat larger than NVIDIA Canary 1B. Its key advantages are dialect support and low-volume speech handling. The trade-off is a possible gap on English relative to specialized English models.

Community reception

806 GitHub stars with active development. Community appreciates open-source availability and Apache 2.0 license. Used in AutoGLM and Zhipu AI Input Method products. Developers note the model weight format change as a migration concern.

Use cases

Ideal for on-device speech recognition, Chinese language applications, dialect-aware transcription, quiet environment recording, and edge AI deployments. The open-source nature enables customization and fine-tuning for specific domains. Best for Chinese-focused applications where compact model size matters.
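
Whether 1.5B parameters fit a given edge device depends largely on the numeric precision of the weights. The following back-of-the-envelope sketch multiplies parameter count by bytes per parameter. The precision table and the function name `weight_size_gb` are illustrative assumptions. The page does not state the published file size, which precisions are available, or the runtime memory needed for activations and audio buffers.

```python
# Bytes needed to store one parameter at common precisions (illustrative assumption;
# the source does not say which formats the released weights use).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_size_gb(num_params, dtype):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 1.5e9  # parameter count stated on this page

for dtype in ("float32", "float16", "int8"):
    print(f"{dtype}: about {weight_size_gb(NUM_PARAMS, dtype):.1f} GB of weights")
```

These figures are a lower bound on memory use, which is consistent with the page listing the 1.5B size as both a strength and a weakness for edge deployment.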