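What are this model's strengths?

A lightweight 0.9B-parameter design, an open license, and efficient multimodal processing.

What are this model's weaknesses?

A relatively small model size, a narrow focus on specific applications, and limited general reasoning ability.

What is it best suited for?

Multimodal data analysis, OCR integration, and visual processing on edge devices.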


Baidu · Open source

PaddleOCR-VL-1.5

Name: PaddleOCR-VL-1.5
Author: Baidu

PaddleOCR-VL-1.5 is a large multimodal model developed by Baidu. It has roughly 0.9B parameters and is released under the Apache 2.0 license.

Parameters

0.9B

Context

License

Apache 2.0

Release date

2026-01-29

API pricing

API pricing for this model has not been published.

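Strengths

・Lightweight 0.9B-parameter design
・Open license
・Efficient multimodal processing

Weaknesses

・Relatively small model size
・Narrow focus on specific applications
・Limited general reasoning ability

Use cases

・Multimodal data analysis
・OCR integration
・Visual processing on edge devices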
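
Whether a model fits on an edge device depends largely on how much memory its weights need. The sketch below is a back-of-the-envelope estimate only: it assumes weight memory is roughly the parameter count times the bytes per parameter, and it ignores activations, caches, and runtime overhead. The 0.9B figure comes from this page; the list of precisions is an illustrative assumption, not a statement about which formats the model ships in.

```python
# Rough weight-memory estimate for a ~0.9B-parameter model.
# Assumption: memory ~= parameter count x bytes per parameter.
# Activations, caches, and framework overhead are NOT included.
PARAMS = 0.9e9  # approximate parameter count quoted on this page

# Standard storage widths; illustrative, not a list of released formats.
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1}


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for dtype, width in BYTES_PER_PARAM.items():
        print(f"{dtype:>9}: {weight_memory_gb(PARAMS, width):.1f} GB")
```

Under these assumptions the weights take about 3.6 GB in fp32, 1.8 GB in 16-bit precision, and 0.9 GB in int8, which is why a model of this size is a plausible candidate for constrained hardware.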

In-depth analysis

Architecture

VLM (0.9B)

Ultra-compact document parsing model

Accuracy

94.5% on OmniDocBench v1.5

SOTA for document parsing

License

Open-source (Apache 2.0)

Release Date

January 2026

Specialization

OCR + Document Parsing

Multi-task VLM

Key Features

Seal recognition, text spotting

New capabilities in v1.5

Strengths

・94.5% SOTA accuracy on OmniDocBench v1.5
・Ultra-compact 0.9B parameters
・Robust against real-world distortions (scanning, skew, warping)
・Seal recognition and text spotting
・Multilingual including Tibetan and Bengali
・Cross-page table merging
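
To make "cross-page table merging" concrete, here is a minimal sketch of the idea as a post-processing step. It is not the model's method, which this page does not describe. The `TableFragment` and `MergedTable` types and the merging rule (next page plus identical header) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableFragment:
    """Hypothetical container for one table detected on one page."""
    page: int
    header: List[str]
    rows: List[List[str]]


@dataclass
class MergedTable:
    """Hypothetical result: one logical table spanning one or more pages."""
    first_page: int
    last_page: int
    header: List[str]
    rows: List[List[str]]


def merge_cross_page(fragments: List[TableFragment]) -> List[MergedTable]:
    """Join fragments that continue a table from the previous page.

    Illustrative heuristic (an assumption): a fragment continues the most
    recent table if it sits on the very next page and repeats the same header.
    """
    merged: List[MergedTable] = []
    for frag in sorted(fragments, key=lambda f: f.page):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and frag.page == prev.last_page + 1
            and frag.header == prev.header
        ):
            # Continuation: append the rows and drop the repeated header.
            prev.rows.extend(frag.rows)
            prev.last_page = frag.page
        else:
            # New logical table; copy the lists so inputs are not mutated.
            merged.append(
                MergedTable(frag.page, frag.page, list(frag.header), list(frag.rows))
            )
    return merged
```

With fragments on pages 1, 2, and 4 that share a header, the first two are joined into one table and the third starts a new one, because page 3 breaks the run. A production system would need extra signals, such as column geometry or tables that continue without a repeated header, which this sketch deliberately leaves out.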

Weaknesses

・Specialized for document parsing only
・Not a general-purpose language model
・Limited to OCR-related tasks

Competitor comparison

Model	Arena	SWE	GPQA	Price
PaddleOCR-VL (v1)	-	-	-	Free
Surya OCR	-	-	-	Free
Google Document AI	-	-	-	Paid

Overview

PaddleOCR-VL-1.5 is Baidu's ultra-compact 0.9B VLM for document parsing, achieving a state-of-the-art 94.5% accuracy on OmniDocBench v1.5. It handles real-world distortions, seal recognition, text spotting, and multilingual OCR, including Tibetan and Bengali.

Benchmarks and performance

The model is reported as state of the art on OmniDocBench v1.5 and Real5-OmniDocBench, outperforming mainstream open-source and proprietary models in the scanning, skew, warping, screen-photography, and illumination scenarios.
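
This page does not say how the benchmark scores are computed. As general background, normalized edit distance is one widely used way to score OCR text against a reference transcription. The sketch below illustrates that metric only; it is not the benchmark's scoring code, and the function names are chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute (or match)
                )
            )
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, ref: str) -> float:
    """Edit distance divided by the longer length; 0.0 means identical."""
    longest = max(len(pred), len(ref))
    return 0.0 if longest == 0 else levenshtein(pred, ref) / longest


if __name__ == "__main__":
    # One misread character ("O" for "0") in a 12-character string.
    print(normalized_edit_distance("Invoice 1O24", "Invoice 1024"))
```

Lower is better for this metric, so a benchmark that reports a percentage where higher is better has to convert or combine it with other measures in some way that is not documented on this page.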

Detailed comparison

Significantly more compact than alternatives while achieving SOTA accuracy. Best-in-class for real-world document parsing robustness.

Community reception

Part of the PaddleOCR ecosystem. Available on Hugging Face and Baidu AI Studio.

Use cases

Document digitization, OCR pipelines, invoice/receipt processing, multilingual document understanding, and seal/stamp recognition.