Google DeepMind · Proprietary

Gemini Embedding 2

Gemini Embedding 2 is an embedding model developed by Google DeepMind. It accepts inputs of up to 8,192 tokens (8K) and maps text, images, video, audio, and PDFs to vectors in a single embedding space.

Parameters

Undisclosed

Context Window

8K

License

Proprietary

Release Date

2026-03-10

API Pricing

API pricing for this model is not yet available

Strengths

  • Developed by Google DeepMind
  • 8K (8,192-token) context length
  • Compact vector representations, with output size adjustable from 128 to 3072 dimensions

Weaknesses

  • Proprietary license; weights are not open source
  • Undisclosed model specifications, including parameter count
  • Internal architecture is not available for external inspection

Use Cases

  • Building semantic search
  • Determining document similarity
  • Constructing vector DB for RAG
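
All three use cases above reduce to comparing embedding vectors. The following is a minimal sketch of that comparison using cosine similarity, not Google's implementation. The three-dimensional vectors and the document IDs are made up for illustration; real vectors would come from the embedding API and have 128 to 3072 dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings standing in for real model output.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query_vec = [0.9, 0.1, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
print([doc_id for doc_id, _ in ranking])
```

A vector database for RAG performs the same ranking, typically with an approximate nearest-neighbor index instead of the exhaustive scan shown here.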

Deep Analysis

Model Type

Multimodal Embedding

Input Token Limit

8,192

Output Dimensions

128-3072 (recommended: 768, 1536, 3072)

Supported Modalities

Text, Image, Video, Audio, PDF

Latest Update

April 2026

Languages

100+
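
Text longer than the 8,192-token input limit has to be split before embedding. The sketch below shows one common approach, fixed-size windows with optional overlap. It assumes the caller has already tokenized the text; the tokenizer itself, the `chunk_tokens` name, and the overlap value are illustrative assumptions, and how non-text inputs count toward the limit is not covered by this listing.

```python
def chunk_tokens(tokens, max_tokens=8192, overlap=0):
    """Split a token list into windows of at most max_tokens items.

    Consecutive windows share `overlap` tokens so that content near a
    boundary appears in both chunks.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the input
    return chunks

# Hypothetical input: 20,000 placeholder tokens.
tokens = list(range(20000))
plain_chunks = chunk_tokens(tokens)
overlapping_chunks = chunk_tokens(tokens, overlap=192)
print([len(c) for c in plain_chunks])
```

Each chunk is then embedded separately, and the resulting vectors are stored with a reference back to the source document.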

Strengths

  • First natively multimodal embedding model from Google
  • State-of-the-art on MTEB Multilingual (69.9) and MTEB Code (84.0)
  • Flexible output dimensions via Matryoshka Representation Learning
  • Supports 100+ languages for cross-lingual tasks
  • Single unified embedding space for all modalities
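
With Matryoshka Representation Learning, the leading dimensions of a vector carry the most information, so a shorter embedding can be taken as a prefix of the full one. The sketch below illustrates that idea on a made-up vector: it keeps the first `dim` values and rescales to unit length. Whether the API returns already-normalized vectors at reduced sizes is not stated here, so the renormalization step is an assumption, and `truncate_embedding` is a hypothetical helper.

```python
import math

MIN_DIM, MAX_DIM = 128, 3072          # range listed under Output Dimensions
RECOMMENDED_DIMS = (768, 1536, 3072)  # recommended sizes from the same field

def truncate_embedding(vec, dim):
    """Keep the first `dim` values of `vec` and rescale to unit L2 norm."""
    if not MIN_DIM <= dim <= MAX_DIM:
        raise ValueError(f"dim must be between {MIN_DIM} and {MAX_DIM}")
    if dim > len(vec):
        raise ValueError("dim exceeds the length of the input vector")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]

# Hypothetical full-size vector standing in for real model output.
full = [float(i % 7) - 3.0 for i in range(3072)]
small = truncate_embedding(full, 768)
print(len(small))
```

Smaller vectors cut storage and search cost at some loss of retrieval quality; the size of that trade-off should be measured on the target data.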

Weaknesses

  • Higher cost than text-only embedding models
  • Requires Google API access (no open-source weights)
  • Processing large video or audio inputs may add latency
  • Complex multimodal pipelines still need careful orchestration
  • Limited fine-tuning options for domain-specific use cases

Competitor Comparison

Model | Arena | SWE | GPQA | Price
Amazon Nova 2 Multimodal | N/A | N/A | N/A | Custom pricing
Voyage Multimodal 3.5 | N/A | N/A | N/A | $0.12/1M tokens
OpenAI text-embedding-3-large | N/A | N/A | N/A | $0.13/1M tokens
Cohere Embed v4 | N/A | N/A | N/A | $0.10/1M tokens
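
The per-token prices above translate into a one-off indexing cost as tokens / 1,000,000 × price. The sketch below applies that arithmetic to the three competitors with listed prices. The 50-million-token corpus is a hypothetical figure, and the calculation covers text tokens only; billing for image, video, or audio input may differ by provider.

```python
# USD per 1M tokens, copied from the competitor comparison.
PRICE_PER_MILLION_TOKENS = {
    "Voyage Multimodal 3.5": 0.12,
    "OpenAI text-embedding-3-large": 0.13,
    "Cohere Embed v4": 0.10,
}

def embedding_cost_usd(total_tokens, price_per_million):
    """Cost in USD of embedding `total_tokens` at a per-million-token price."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million

corpus_tokens = 50_000_000  # hypothetical corpus size

costs = {
    name: round(embedding_cost_usd(corpus_tokens, price), 2)
    for name, price in PRICE_PER_MILLION_TOKENS.items()
}
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}")
```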

Gemini Embedding 2 is Google's first natively multimodal embedding model, mapping text, images, video, audio, and PDFs into a single unified embedding space. Released in March 2026 and now generally available, it achieves state-of-the-art performance on cross-modal retrieval benchmarks while supporting 100+ languages.

Analysis generated: 2026-05-24