ERNIE-4.5-VL-28B-A3B-Thinking Multimodal Reasoning Model
ERNIE-4.5-VL-28B-A3B-Thinking is a multimodal reasoning model developed by Baidu. It is a Mixture-of-Experts (MoE) vision-language model with approximately 28B total parameters, of which about 3B are active per token, and a 131K context window.
Parameters
28B total (3B active)
Context Window
131K
License
Apache 2.0
Release Date
2025-11-11
API Pricing
API pricing for this model is not yet available
Strengths
- Strong multimodal reasoning capability
- 28B total parameters, with only about 3B active per token
- Long 131K context support
Weaknesses
- Memory requirements follow the full 28B parameter count, not the 3B active parameters
- Responses may be slower because the model is specialized for reasoning
- May require optimization for specific tasks
Use Cases
- Logical analysis of complex visual information
- Multimodal analysis of long documents
- Advanced visual problem-solving that requires reasoning
Deep Analysis
Architecture
MoE (28B total, 3B active)
Multimodal reasoning VLM
License
Apache 2.0
Open-source
Modalities
Text + Image
Vision-language model
Release Date
2025
Part of ERNIE 4.5 family
Framework
PaddlePaddle + Transformers
Strengths
- Apache 2.0 open-source license
- Lightweight 3B active parameters
- Multimodal reasoning with thinking capability
- Part of the comprehensive ERNIE 4.5 family
- Available on Hugging Face and AI Studio
Weaknesses
- Small active parameter count may limit performance on complex tasks
- Ecosystem centered on Chinese platforms (PaddlePaddle, AI Studio)
- Limited benchmark data available
Competitor Comparison
| Model | Arena | SWE | GPQA | Price |
|---|---|---|---|---|
| ERNIE 4.5 (larger) | - | - | - | Higher |
| Qwen-VL | - | - | - | Comparable |
| InternVL2 | - | - | - | Comparable |
ERNIE-4.5-VL-28B-A3B-Thinking is Baidu's open-source multimodal reasoning model with 28B total and 3B active parameters. Part of the ERNIE 4.5 family, it combines vision and language understanding with thinking/reasoning capabilities under the Apache 2.0 license.
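To make the "28B total, 3B active" figures concrete, the following is a rough back-of-the-envelope sketch of what they imply for the active fraction and for weight storage. It assumes the nominal parameter counts quoted above, assumes all expert weights are kept resident in memory (the usual arrangement for MoE models), and ignores activations, the KV cache, and runtime overhead. The helper name `weight_memory_gib` and the chosen precisions are illustrative, not taken from Baidu's documentation.

```python
# Rough sizing sketch based on the nominal figures in this card.
# Assumption: 28B total / 3B active are approximate, rounded counts.

TOTAL_PARAMS_B = 28.0   # total parameters, in billions (from the card)
ACTIVE_PARAMS_B = 3.0   # parameters active per token, in billions (from the card)


def weight_memory_gib(params_billion: float, bytes_per_param: float) -> float:
    """Estimate weight storage in GiB for a given parameter count and precision."""
    return params_billion * 1e9 * bytes_per_param / 2**30


# Share of the weights that participate in computing any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Weight storage for the full model at two common precisions (hypothetical choices).
bf16_gib = weight_memory_gib(TOTAL_PARAMS_B, 2.0)   # 16-bit weights
int4_gib = weight_memory_gib(TOTAL_PARAMS_B, 0.5)   # 4-bit quantized weights

print(f"Active fraction per token: {active_fraction:.1%}")
print(f"Weights at 16-bit: ~{bf16_gib:.0f} GiB")
print(f"Weights at 4-bit:  ~{int4_gib:.0f} GiB")
```

Under these assumptions, per-token compute scales with the 3B active parameters while memory scales with all 28B, which is why the model can be both "lightweight" to run per token and still demanding to host.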
Analysis generated: 2026-05-24