MistralAI · Open Source

Voxtral-Small-24B-2507

Voxtral-Small-24B-2507 is a speech-specialized foundation model developed by MistralAI. It has approximately 24B parameters and supports a 32K-token context window.

Parameters

24B

Context Window

32K

License

Apache 2.0

Release Date

2025-07-15

API Pricing

API pricing for this model is not yet available

Strengths

  • Large parameter count
  • Specialization in audio processing
  • Open-source Apache 2.0 license

Weaknesses

  • Very large model file size
  • Requires significant computational resources
  • Medium-scale context length (32K tokens)
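
To make the file-size and compute points concrete, here is a rough sketch of the memory needed just to hold the weights at common precisions. It assumes the 24B parameter count from the model name and counts raw weights only. Activations, KV cache, and the audio encoder's runtime overhead are not included, so real requirements will be higher. The function names and the precision table are illustrative, not part of any Mistral tooling.

```python
# Back-of-the-envelope weight footprint; an estimate, not a measured figure.

PARAMS = 24e9  # assumption: the "24B" in the model name

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the raw weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def footprint_table(num_params: float = PARAMS) -> dict:
    """Map each precision name to its approximate weight footprint in GB."""
    return {
        name: weight_footprint_gb(num_params, nbytes)
        for name, nbytes in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for name, gb in footprint_table().items():
        print(f"{name}: ~{gb:.0f} GB")
```

Under these assumptions the weights alone come to roughly 48 GB in bf16, which is why quantization or multi-GPU serving is usually needed on consumer hardware.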

Use Cases

  • Advanced speech recognition
  • Audio data analysis
  • Building voice-based AI systems

Deep Analysis

Architecture

Multimodal Audio Chat (24B)

Based on Mistral Small 24B backbone

Context Window

32K tokens

Up to 40 min of audio for understanding tasks

Release Date

July 15, 2025

License

Apache 2.0

Modalities

Audio + Text

Speech understanding and transcription

Languages

8+ languages

Multilingual with auto-detection

Strengths

  • Production-scale speech understanding
  • Apache 2.0 open-source
  • 40 min audio understanding capability
  • Function calling from voice
  • Native multilingual support
  • Retains text understanding of Mistral Small 3.1

Weaknesses

  • Larger model requires more compute
  • Context window limited to 32K tokens
  • No vision modality

Competitor Comparison

Model                 Arena   SWE   GPQA   Price
Voxtral Mini 3B       -       -     -      Lower
GPT-4o Audio          -       -     -      Higher
Google Gemini Audio   -       -     -      Comparable

Voxtral Small 24B is Mistral's production-scale open-source speech understanding model. Released July 2025 under Apache 2.0, it handles up to 40 minutes of audio for understanding tasks with built-in Q&A, summarization, and function calling from voice.
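
The 32K-token window and the 40-minute audio limit can be related with simple arithmetic. The sketch below derives the upper bound on audio tokens per second that those two figures imply, then estimates how much audio still fits once some of the window is reserved for the text prompt and the answer. It assumes "32K" means 32,000 tokens and that audio consumes tokens at a constant rate. The model's actual audio token rate is not stated here, and the function names are hypothetical.

```python
# Rough context-budget arithmetic; a sketch under stated assumptions.

CONTEXT_TOKENS = 32_000    # assumption: "32K" read as 32,000 tokens
MAX_AUDIO_MINUTES = 40.0   # from the stated 40-minute understanding limit


def implied_tokens_per_second(context_tokens: int, audio_minutes: float) -> float:
    """Upper bound on the audio token rate if the audio alone filled the window."""
    return context_tokens / (audio_minutes * 60)


def max_audio_minutes(context_tokens: int,
                      tokens_per_second: float,
                      reserved_text_tokens: int = 0) -> float:
    """Minutes of audio that fit after reserving tokens for prompt and reply."""
    usable = max(context_tokens - reserved_text_tokens, 0)
    return usable / tokens_per_second / 60


if __name__ == "__main__":
    rate = implied_tokens_per_second(CONTEXT_TOKENS, MAX_AUDIO_MINUTES)
    print(f"implied rate: at most {rate:.1f} tokens/s")
    print(f"with 2,000 tokens reserved: "
          f"{max_audio_minutes(CONTEXT_TOKENS, rate, 2_000):.1f} min")
```

The practical point is that long recordings and long text exchanges compete for the same window, so reserving room for the reply shortens the audio that can be processed in one request.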

Analysis generated: 2026-05-24