ELYZA-Thinking-1.0-Qwen-32B
Japan's first reasoning-specialized model, developed by ELYZA. It adopts a Chain-of-Thought approach similar to OpenAI's o1/o3 series and is specialized for complex reasoning tasks.
Parameters
32B
Context
32K
License
Apache 2.0
Release Date
2026-01-15
Japanese Language Capability
Top tier: developed by a Japanese company or specialized for Japanese, with the highest rated Japanese understanding and generation capability.
API Pricing
Input price (per 1M tokens)
¥80
Output price (per 1M tokens)
¥320
Billing mode: standard
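The listed rates can be turned into a rough per-request estimate. The sketch below is a minimal illustration, assuming the model's reasoning trace is billed as output tokens (this page does not say so) and using a hypothetical function name:

```python
# Rates taken from the pricing listed above (yen per 1M tokens).
INPUT_YEN_PER_MILLION = 80
OUTPUT_YEN_PER_MILLION = 320


def estimate_cost_yen(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in yen.

    Assumption: reasoning tokens are counted as output tokens.
    """
    total = (input_tokens * INPUT_YEN_PER_MILLION
             + output_tokens * OUTPUT_YEN_PER_MILLION)
    return total / 1_000_000


# A 2,000-token prompt that yields 8,000 tokens of reasoning plus answer.
print(f"{estimate_cost_yen(2_000, 8_000):.2f} yen")  # prints "2.72 yen"
```

Because output is priced at four times the input rate, long reasoning traces are likely to dominate the bill for this kind of model.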
Strengths
- Japan's first reasoning-specialized model
- Strong in math and logical reasoning
- Shows its reasoning process in Japanese
- Stable performance, built on Qwen2.5-32B-Instruct
Weaknesses
- Responses are slow because the model reasons at length before answering
- Context length is 32K
- Not well suited to general-purpose text generation
- Commercial use requires confirming the license terms (Apache 2.0)
Use Cases
- Mathematical reasoning tasks
- Problem-solving that requires logical thinking
- Complex analysis in Japanese
- Use in education
In-Depth Analysis
| Metric | Value | Note |
|---|---|---|
| Parameters | 32B | lightweight open-weight model |
| Context Window | 128K tokens | 131072 tokens |
| MATH-500 (English) | 80.8% | vs o1-mini: 80.0% |
| MATH-500 (Japanese) | 78.6% | vs o1-mini: 77.2% |
| JMMLU_small | 73.1% | Japanese knowledge benchmark |
| License | Apache 2.0 | commercial use allowed |
Strengths
- Strong mathematical reasoning in both Japanese and English, surpassing o1-mini on MATH-500 in both languages.
- Lightweight (32B parameters) yet competitive with much larger reasoning models.
- Open-weight, with the permissive Apache 2.0 license allowing commercial use.
Weaknesses
- Coding performance (JHumanEval) regressed compared to its base model and lags behind competitors.
- Reasoning-focused training slightly reduced performance on some general Japanese language tasks.
- Requires substantial VRAM (~66GB) for full precision inference, limiting accessibility.
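The VRAM figure can be sanity-checked with simple arithmetic. The sketch below assumes the ~66GB figure refers to 16-bit weights (2 bytes per parameter), counts weights only, and uses the nominal 32B parameter count; the function name is hypothetical and the lower-precision rows are illustrative, not measured:

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes.

    Ignores the KV cache, activations, and runtime overhead.
    """
    bytes_per_param = bits_per_param / 8
    # (params_billion * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billion * bytes_per_param


for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(32, bits):.0f} GB")
```

The nominal count gives about 64 GB at 16 bits, slightly below the quoted ~66GB. The gap plausibly comes from the exact parameter count and runtime overhead, neither of which is stated on this page.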
Competitor Comparison
| Model | Arena | SWE | GPQA | Price |
|---|---|---|---|---|
| OpenAI o1-mini | N/A | N/A | N/A | API-only (premium) |
| DeepSeek-R1-Distill-Qwen-32B | N/A | N/A | N/A | Open-source |
| QwQ-32B | N/A | N/A | N/A | Open-source |
ELYZA-Thinking-1.0-Qwen-32B is Japan's first specialized reasoning model, developed by ELYZA. It uses a Chain-of-Thought (CoT) approach, similar to OpenAI's o1 series, to tackle complex logical and mathematical problems. The model is built on Alibaba's Qwen2.5-32B-Instruct and was fine-tuned on approximately 150,000 high-quality synthetic training samples generated with a Monte Carlo Tree Search (MCTS)-based algorithm that explores candidate reasoning paths. This process enables the 32-billion-parameter model to achieve performance comparable to OpenAI's o1-mini on key reasoning benchmarks, while remaining open-weight under the permissive Apache 2.0 license.
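This page does not describe ELYZA's search algorithm in detail. As general background only, textbook MCTS decides which branch to explore next with the UCB1 rule, which balances a branch's average reward against how rarely it has been visited. The sketch below shows that selection step alone; the dictionary fields, function names, and numbers are hypothetical and are not ELYZA's implementation:

```python
import math


def ucb1(total_value: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Textbook UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited branches are always tried first
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children: list[dict], c: float = math.sqrt(2)) -> dict:
    """Pick the child (candidate next reasoning step) with the best score."""
    parent_visits = sum(child["visits"] for child in children)
    return max(
        children,
        key=lambda ch: ucb1(ch["value"], ch["visits"], parent_visits, c),
    )


# Two hypothetical candidate steps: "a" looks good and is well explored,
# "b" looks weaker but has only been tried once.
steps = [
    {"name": "a", "value": 9.5, "visits": 10},
    {"name": "b", "value": 0.5, "visits": 1},
]
print(select_child(steps, c=0.0)["name"])  # pure exploitation picks "a"
print(select_child(steps)["name"])         # exploration bonus picks "b"
```

In a full search this selection step would be followed by expansion, evaluation of the resulting reasoning path, and backpropagation of the reward, none of which are shown here.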
A key innovation is the dual-model approach: alongside the primary reasoning model, ELYZA released "Shortcut Models" (32B and 7B variants) trained on the same problem sets but without the lengthy reasoning process. The Shortcut model achieves performance comparable to GPT-4o on general tasks, demonstrating how complex reasoning capabilities developed during training can be distilled into faster, direct-response models. This work highlights a growing trend in AI development where heavy computational costs are shifted from inference to the development phase to create powerful yet efficient models.
While the model excels in mathematical and logical reasoning, it shows a trade-off: its coding capabilities regressed slightly compared to its base model, which suggests the specialized training data may have lacked sufficient coding tasks. Nevertheless, it represents a significant milestone for Japanese-language AI, providing a powerful, commercially viable open-weight reasoning model for specialized reasoning tasks.
Sources
- ELYZA-Thinking-1.0: MCTS を用いた推論パス探索と模倣学習による Reasoning Model の開発
- ELYZA、論理的思考能力を強化した「Reasoning Model」を開発、商用利用可能な形で公開しました|株式会社ELYZA 公式ブログ
- ELYZA、論理的思考能力を強化した「Reasoning Model」を開発、商用利用可能な形式で公開 | 株式会社ELYZAのプレスリリース
- Run ELYZA-Thinking-1.0-Qwen-32B API | Serverless Inference | 32K Context | Flat-Rate Pricing - Featherless.ai
- ELYZA Thinking 1.0 Qwen 32B by elyza — VRAM 65.8GB, 128K context | LLM Explorer
Analysis generated: 2026-05-23