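What are this model's strengths?

・30-billion-parameter scale
・Available under an open license
・Advanced visual information processing

What are this model's weaknesses?

・Specialized for specific tasks
・Requires compute resources to run
・Lacks general-purpose conversational ability

What is it best suited for?

・Text recognition from images
・Visual data analysis
・Document digitization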

DeepSeek-OCR

Name: DeepSeek-OCR
Author: DeepSeek

DeepSeek-OCR is a large vision model developed by DeepSeek-AI. It has approximately 30.0B parameters and is released under the MIT license.

Parameters: 30.0B
Context length: not specified
License: MIT
Release date: 2025-10-20

API pricing

API pricing for this model has not been published.

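Strengths

・30-billion-parameter scale
・Available under an open license
・Advanced visual information processing

Weaknesses

・Specialized for specific tasks
・Requires compute resources to run
・Lacks general-purpose conversational ability

Use cases

・Text recognition from images
・Visual data analysis
・Document digitization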

In-depth analysis

OCR Precision: 97% at <10x compression
Vision Tokens: 64-1853 per page
Production Speed: 200k+ pages/day (single A100)
Languages: ~100
License: MIT
Release Date: October 20, 2025

Strengths

・About 97% OCR precision at under 10x optical compression
・200k+ pages/day on a single GPU
・Support for roughly 100 languages
・Deep parsing of charts and formulas

Weaknesses

・Not a general-purpose VLM
・Accuracy degrades at 20x compression
・No SFT stage, so it is not a chatbot

Competitor comparison

Models compared: GOT-OCR2.0, MinerU2.0

Overview

DeepSeek-OCR pioneers optical compression: it reaches about 97% OCR precision at compression ratios under 10x, processes 200k+ pages per day on a single A100, and supports roughly 100 languages.
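
To make the compression figures concrete, here is a minimal sketch of the arithmetic. It assumes the compression ratio means the number of text tokens represented per vision token; the function names are hypothetical and are not part of any DeepSeek API. The 10x ceiling and the 64 and 1853 token budgets are the figures quoted on this page.

```python
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Text tokens represented per vision token (assumed definition)."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


def max_text_tokens(vision_tokens: int, max_ratio: float = 10.0) -> int:
    """Approximate text-token ceiling that keeps the ratio within max_ratio."""
    return int(vision_tokens * max_ratio)


if __name__ == "__main__":
    # Smallest and largest per-page vision-token budgets quoted on this page.
    for budget in (64, 1853):
        print(f"{budget} vision tokens -> up to ~{max_text_tokens(budget)} text tokens")
```

Under these assumptions, a 64-token page budget covers roughly 640 text tokens within the 10x range, and a 1853-token budget covers roughly 18,530.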

Benchmarks & performance

State-of-the-art results on OmniDocBench while using the fewest vision tokens. Described as 60x more efficient than MinerU2.0.

Detailed comparison

Its distinguishing value is extreme token efficiency for large-scale processing.

Community reception

The community has been highly impressed by the compression ratios and views optical compression as a novel research direction.

Use cases

Online OCR for LLMs and batch PDF processing to produce pretraining data.
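
For batch processing, the quoted throughput can be turned into a rough capacity estimate. The sketch below assumes the 200k pages/day per A100 figure holds for your documents, which may not be the case. The corpus size, deadline, and function names are hypothetical and used only for illustration.

```python
import math

# Throughput quoted on this page for a single A100 (a lower bound: "200k+").
PAGES_PER_DAY_PER_GPU = 200_000
SECONDS_PER_DAY = 86_400


def pages_per_second(pages_per_day: int = PAGES_PER_DAY_PER_GPU) -> float:
    """Average per-GPU throughput in pages per second."""
    return pages_per_day / SECONDS_PER_DAY


def gpu_days(total_pages: int, pages_per_day: int = PAGES_PER_DAY_PER_GPU) -> float:
    """GPU-days needed to process total_pages at the given throughput."""
    return total_pages / pages_per_day


def gpus_needed(total_pages: int, deadline_days: int,
                pages_per_day: int = PAGES_PER_DAY_PER_GPU) -> int:
    """Smallest whole number of GPUs that finishes within deadline_days."""
    return math.ceil(total_pages / (pages_per_day * deadline_days))


if __name__ == "__main__":
    corpus = 10_000_000  # hypothetical corpus size in pages
    print(f"{pages_per_second():.2f} pages/s per GPU")
    print(f"{gpu_days(corpus):.0f} GPU-days for {corpus:,} pages")
    print(f"{gpus_needed(corpus, 7)} GPUs to finish in 7 days")
```

Because the quoted figure is a lower bound, these estimates are on the conservative side; real throughput depends on page complexity and the chosen vision-token budget.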