DeepSeek · Open source

DeepSeek V4 Flash

DeepSeek V4 Flash is a reasoning model developed by DeepSeek-AI. It has about 284B total parameters and a long 1M-token context window.

Parameters

284B

Context

1M

License

MIT

Release date

2026-04-24

API pricing

Input price (per 1M tokens)

$0.14

Output price (per 1M tokens)

$0.28

Billing mode: standard

Strengths

  • Large 284B-parameter model
  • Long-context understanding across 1M tokens
  • High degree of freedom under the MIT license

Weaknesses

  • High computational load at inference time
  • Higher operating costs due to the large model size
  • High memory consumption in operation

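Use cases

  • Advanced analysis of large document collections
  • Tasks that require complex logical reasoning
  • Development work that takes advantage of the long context window

In-depth analysis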

Total Parameters

284B

13B activated per token

Context Window

1M tokens

8x increase from V3.2's 128K

Input Price (cache miss)

$0.14/1M tokens

Cache hit: $0.028/1M

Output Price

$0.28/1M tokens

SWE-Bench Verified

79%

vs DeepSeek V4 Pro: 80.6%

GPQA Diamond

88.1%

From official evaluations

Strengths

  • Extremely cost-effective with strong coding/agentic performance relative to price
  • 1M-token context window with efficient hybrid attention architecture (CSA + HCA)
  • Open-weight under MIT license with multiple inference providers available

Weaknesses

  • Weaker on complex agentic tasks and factual knowledge compared to DeepSeek V4 Pro
  • High hallucination rate (96%) when uncertain, as reported by Artificial Analysis
  • Preview model: behavior and pricing may change before final release

Competitor comparison

Model              Arena  SWE-Bench Verified  GPQA   Price (input/output per 1M tokens)
DeepSeek V4 Pro    1460   80.6%               90.1%  $0.435/$0.87
Kimi K2.6          1454   80.2%               90.5%  Not publicly listed
Claude Sonnet 4.6  1459   79.6%               89.9%  Not publicly listed
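
To make the price gap between Flash and Pro concrete, here is a minimal Python sketch that applies the list prices quoted on this page to one workload. The workload size (50M input tokens, 5M output tokens) and the names `Pricing`, `FLASH`, and `PRO` are illustrative assumptions, and the sketch ignores cache-hit discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List prices in USD per 1M tokens, as quoted on this page."""
    input_per_m: float
    output_per_m: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD cost of a workload at these list prices."""
        total = input_tokens * self.input_per_m + output_tokens * self.output_per_m
        return total / 1_000_000


FLASH = Pricing(input_per_m=0.14, output_per_m=0.28)
PRO = Pricing(input_per_m=0.435, output_per_m=0.87)

# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 5_000_000

flash_cost = FLASH.cost(INPUT_TOKENS, OUTPUT_TOKENS)
pro_cost = PRO.cost(INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Flash: ${flash_cost:.2f}")
print(f"Pro:   ${pro_cost:.2f}")
print(f"Pro / Flash: {pro_cost / flash_cost:.2f}x")
```

Because Pro's input and output prices are both about 3.1 times Flash's, the ratio stays the same whatever the input/output mix.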

DeepSeek V4 Flash is a cost-efficient, long-context model released as part of the V4 preview on April 24, 2026. It is a 284B-parameter Mixture-of-Experts model that activates only 13B parameters per token and reaches a 1M-token context window through hybrid attention mechanisms (Compressed Sparse Attention, CSA, and Heavily Compressed Attention, HCA). At 1M context, this architecture needs only 10% of the FLOPs and 7% of the KV cache of DeepSeek V3.2, which makes million-token inference practical.
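
The ratios in this paragraph can be checked with a few lines of arithmetic. The Python sketch below uses only figures quoted on this page, with two assumptions: "1M" and "128K" are read as binary sizes (1,048,576 and 131,072 tokens), and the 100 GB baseline KV cache is a made-up number used only to show how the 7% ratio scales, not a measured value for DeepSeek V3.2.

```python
# Figures quoted on this page.
TOTAL_PARAMS_B = 284.0    # total parameters, in billions
ACTIVE_PARAMS_B = 13.0    # parameters activated per token, in billions
FLOPS_RATIO_VS_V32 = 0.10     # FLOPs relative to V3.2 at 1M context
KV_CACHE_RATIO_VS_V32 = 0.07  # KV cache relative to V3.2 at 1M context

# Share of the model's weights that take part in each token's forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Assumption: "1M" and "128K" are binary sizes, which gives exactly 8x.
context_ratio = (1024 * 1024) / (128 * 1024)


def scaled(baseline: float, ratio: float) -> float:
    """Scale a baseline resource figure by a quoted relative ratio."""
    return baseline * ratio


# Hypothetical baseline, for illustration only.
HYPOTHETICAL_V32_KV_CACHE_GB = 100.0
flash_kv_cache_gb = scaled(HYPOTHETICAL_V32_KV_CACHE_GB, KV_CACHE_RATIO_VS_V32)

print(f"Active parameters per token: {active_fraction:.1%}")
print(f"Context window growth: {context_ratio:.0f}x")
print(f"KV cache at the hypothetical baseline: {flash_kv_cache_gb:.1f} GB")
```

Roughly 4.6% of the weights are active for each token, which is why per-token compute is far lower than the total parameter count suggests, although all 284B parameters still have to be held in memory.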

Positioned as the efficiency-focused sibling to the larger DeepSeek V4 Pro (1.6T parameters), Flash is designed for high-volume, cost-sensitive applications like coding agents, long-document analysis, and tool-calling pipelines. While it trails Pro on complex reasoning and factual knowledge tasks, it delivers remarkably strong performance on coding benchmarks (SWE-Bench: 79%, LiveCodeBench: 91.6%) and multi-turn agentic workflows at a fraction of the cost.

The model is available under an MIT license via multiple providers (Novita, Fireworks AI, DeepInfra, Featherless AI) with API pricing starting at $0.14/1M input tokens and $0.28/1M output tokens. This makes it one of the most affordable frontier-tier models available, particularly attractive for teams looking to scale AI workloads without incurring the costs associated with premium models from OpenAI or Anthropic.
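
For budgeting, the cache-hit input price listed earlier on this page ($0.028/1M) matters as much as the headline rates. The following Python sketch estimates the cost of one request under an assumed cache hit rate. The function name `estimate_cost`, the `cache_hit_rate` parameter, and the example request sizes are illustrative assumptions, and actual billing by a given provider may differ.

```python
# Prices in USD per 1M tokens, as quoted on this page.
INPUT_CACHE_MISS_PER_M = 0.14
INPUT_CACHE_HIT_PER_M = 0.028
OUTPUT_PER_M = 0.28


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_hit_rate: float = 0.0) -> float:
    """Estimate the USD cost of a request.

    cache_hit_rate is the assumed fraction of input tokens served from
    the provider's prompt cache (0.0 means every input token is a miss).
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_rate
    miss_tokens = input_tokens - hit_tokens
    total = (hit_tokens * INPUT_CACHE_HIT_PER_M
             + miss_tokens * INPUT_CACHE_MISS_PER_M
             + output_tokens * OUTPUT_PER_M)
    return total / 1_000_000


# Hypothetical long-context request: 800K input tokens, 4K output tokens.
cold = estimate_cost(800_000, 4_000)
warm = estimate_cost(800_000, 4_000, cache_hit_rate=0.75)

print(f"No cache hits:  ${cold:.4f}")
print(f"75% cache hits: ${warm:.4f}")
```

In this example the warm request costs well under half of the cold one, so long-context workloads that reuse the same prefix (for example, a fixed document followed by different questions) benefit most from caching.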

Analysis generated: 2026-05-23