MistralAI · Open Source

Voxtral-Mini-3B-2507

Voxtral-Mini-3B-2507 is a speech-specialized foundation model developed by Mistral AI. It has roughly 3B parameters and supports a maximum context length of 32K tokens.

Parameters

3B

Context Window

32K

License

Apache 2.0

Release Date

2025-07-15

API Pricing

API pricing for this model is not yet available

Strengths

  • Specialized design for audio processing
  • Wide 32K context length
  • Open-source Apache 2.0 license

Weaknesses

  • High computational needs vs. smaller models
  • Performance gap with text-only models
  • Significant memory consumption from model size
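
As a rough sense of scale for the memory point above, weight memory can be estimated as parameter count times bytes per parameter. The sketch below assumes the 3B figure in the model name and common storage widths per precision. It covers weights only and ignores activations, the KV cache, and any audio-encoder overhead, so real usage will be higher.

```python
# Bytes needed to store one parameter at common precisions (assumed widths).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the weights-only footprint in GiB for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # 3e9 is an assumption taken from the "3B" in the model name.
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(3e9, dtype):.1f} GiB")
```

Halving the bytes per parameter halves the estimate, which is why quantized variants are the usual route to smaller devices.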

Use Cases

  • Advanced audio data analysis
  • Contextual understanding of long audio clips
  • Voice application development on an open-source model

Deep Analysis

Architecture

Multimodal Audio Chat (3B)

Based on the Ministral 3B backbone

Context Window

32K tokens

Up to 30 min transcription
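
Taken together, the two figures above imply an upper bound on how many tokens each second of audio can cost. A minimal sketch of that arithmetic, assuming "32K" means 32,768 tokens; `reserved_tokens` is a hypothetical parameter for the share of the window kept back for the prompt and the generated transcript:

```python
CONTEXT_TOKENS = 32 * 1024   # assumption: "32K" read as 32,768 tokens
MAX_AUDIO_SECONDS = 30 * 60  # "up to 30 min" from the spec above


def max_tokens_per_audio_second(reserved_tokens: int = 0) -> float:
    """Upper bound on tokens per second of audio that still fits the window."""
    usable = CONTEXT_TOKENS - reserved_tokens
    if usable <= 0:
        raise ValueError("reserved_tokens must be smaller than the context window")
    return usable / MAX_AUDIO_SECONDS


if __name__ == "__main__":
    print(f"{max_tokens_per_audio_second():.1f} tokens per second at most")
```

This is only a ceiling derived from the page's own numbers; the model's actual audio token rate is not stated here.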

Release Date

July 15, 2025

License

Apache 2.0

Modalities

Audio + Text

Speech understanding and transcription

Languages

8+ languages

EN, FR, DE, ES, IT, PT, NL, HI

Strengths

  • Open-source speech understanding model
  • Apache 2.0 license
  • Multilingual with automatic language detection
  • Function calling from voice input
  • Lightweight 3B for edge deployment
  • Cost-effective transcription

Weaknesses

  • 32K context limits long audio processing
  • Smaller model may miss nuances
  • No image/video modality

Competitor Comparison

Model              Arena  SWE  GPQA  Price
Voxtral Small 24B  -      -    -     Higher
OpenAI Whisper     -      -    -     Comparable
GPT-4o Audio       -      -    -     Higher

Voxtral Mini 3B is Mistral's lightweight open-source speech understanding model. Released in July 2025 under Apache 2.0, it offers transcription, Q&A, summarization, and function calling from voice input. Mistral positions it at less than half the price of comparable APIs.

Analysis generated: 2026-05-24