Sakana AI · Conditionally open

Llama-3-Namazu-405B

A large-parameter version of the Namazu model developed by Sakana AI. It is based on Llama 3.1 405B and has undergone post-training optimized for Japanese cultural and social contexts.

Parameters

405B

Context

128K

License

Llama 3 License

Release date

2026-03-15

Japanese language capability

🇯🇵 Native JP

Assigned to models developed by a Japanese company or specialized for Japanese. This is the highest tier for Japanese understanding and generation.

API pricing

API pricing information for this model has not been published at this time.

Strengths

  • High performance with 405B parameters
  • Optimized for Japanese cultural context
  • Built on the strong Llama 3 base model
  • High quality in long-text generation

Weaknesses

  • Requires high-spec GPUs to run
  • Conditional license terms (Llama 3 License), not fully open
  • High inference cost
  • No API availability

Use cases

  • High-quality Japanese content generation
  • Complex Japanese language reasoning tasks
  • Research and development use
  • Processing large-scale Japanese data

In-depth analysis

MMLU (5-shot)

88.6%

Near parity with GPT-4o (~88.7%)

HumanEval (Coding)

89.0%

Competitive, behind Claude 3.5 Sonnet (92.0%)

Input Price

$2.40/M tokens

Via Amazon Standard provider (same rate the comparison table lists for base Llama 3.1 405B)

Output Price

$2.40/M tokens

Via Amazon Standard provider (same rate the comparison table lists for base Llama 3.1 405B)

Context Window

128K tokens

Extended from 8K in the Llama 3.1 upgrade

Base Model Focus

Japanese Cultural Context

Post-training for neutrality and factual accuracy

Strengths

  • Frontier-level performance competitive with closed-source models on core benchmarks.
  • Optimized via post-training to reduce political/cultural bias and refusal rates for Japanese-context queries.
  • Open-weight model with extensive self-hosting and fine-tuning capabilities for data sovereignty.
  • Cost-effective API access available through multiple providers at significant discounts vs. GPT-4o.

Weaknesses

  • Massive computational requirements for self-hosting (200+ GB VRAM at INT4, requires multi-GPU setup).
  • Text-only input (no native vision or audio), unlike multimodal competitors like GPT-4o.
  • The pre-training knowledge cutoff (December 2023) means answers may lag behind current events unless the model is paired with retrieval (RAG).
  • Namazu variant's specialized optimizations may have reduced general applicability outside targeted contexts.
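The "200+ GB VRAM at INT4" figure above can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below is a rough estimate only. It counts weights alone (no KV cache, activations, or framework overhead), and the 80 GB per-GPU capacity is an assumed example value, not something stated on this page.

```python
import math

N_PARAMS = 405e9  # 405B parameters, as listed for this model


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


def min_gpus(weights_gb: float, gpu_memory_gb: float) -> int:
    """Lower bound on GPU count needed just to hold the weights."""
    return math.ceil(weights_gb / gpu_memory_gb)


if __name__ == "__main__":
    ASSUMED_GPU_GB = 80  # assumption: one 80 GB accelerator per device
    for name, bits in [("FP16/BF16", 16), ("INT8", 8), ("INT4", 4)]:
        gb = weight_memory_gb(N_PARAMS, bits)
        print(f"{name}: ~{gb:.1f} GB weights, "
              f"at least {min_gpus(gb, ASSUMED_GPU_GB)} GPUs of {ASSUMED_GPU_GB} GB")
```

At 4 bits per parameter this comes to about 202.5 GB for the weights, which is consistent with the "200+ GB" figure and explains why a multi-GPU setup is required even under aggressive quantization. Real deployments need additional headroom beyond this lower bound.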

Competitor comparison

Model                      | Arena | SWE | GPQA  | Price (input/output per M tokens)
Meta Llama 3.1 405B (Base) | N/A   | N/A | 51.1% | $2.40/$2.40
GPT-4o                     | N/A   | N/A | 53.6% | $2.50/$10.00
Claude 3.5 Sonnet          | N/A   | N/A | 59.4% | $3.00/$15.00
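To make the price column concrete, the following minimal sketch estimates the cost of a workload under each listed price. It assumes the two figures in each cell are input and output prices in USD per million tokens (as the Input Price and Output Price entries suggest for the $2.40 rate). The 10M-input / 2M-output workload is a hypothetical example, not a measured usage pattern.

```python
# (input price, output price) in USD per million tokens, copied from the table.
PRICES = {
    "Meta Llama 3.1 405B (Base)": (2.40, 2.40),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
}


def workload_cost(input_tokens: int, output_tokens: int,
                  price_in: float, price_out: float) -> float:
    """Total cost in USD for a given number of input and output tokens."""
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    input_tokens, output_tokens = 10_000_000, 2_000_000
    for model, (p_in, p_out) in PRICES.items():
        cost = workload_cost(input_tokens, output_tokens, p_in, p_out)
        print(f"{model}: ${cost:.2f}")
```

Because output tokens are priced well above input tokens for the closed models in the table, the gap widens for generation-heavy workloads and narrows for input-heavy ones.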

Llama-3-Namazu-405B is a specialized, large-parameter variant developed by Sakana AI and built on Meta's Llama 3.1 405B architecture. It is a focused post-training effort to optimize the model for Japanese cultural and social contexts. Its main contribution is not raw benchmark superiority, since its core capabilities come from the base model, but fine-tuning that corrects inherited biases and reduces refusal rates on politically sensitive topics, issues that are prevalent in models trained primarily on Western-centric data. According to Sakana's internal benchmarks, the Namazu approach sharply lowered refusal rates for queries about sensitive historical and political themes (from ~72% to nearly 0% for its DeepSeek-based variant) while maintaining factual accuracy and neutrality.

The model is positioned within the open-weight ecosystem as an option for developers and organizations in Japan that need balanced, multi-perspective responses without automatic self-censorship on culturally specific topics. It inherits the frontier-level performance of the base 405B model (competitive with GPT-4o on general knowledge and math benchmarks) and offers the usual advantages of open weights: self-hosting for data privacy, domain-specific fine-tuning, and cost-efficient deployment at scale. However, its heavy computational demands limit practical deployment to well-resourced organizations or to access via API providers.

Analysis generated: 2026-05-23