What are the strengths of this model?

・Large 150B parameter scale
・Advanced visual understanding capability
・Open Apache 2.0 license

What are the weaknesses of this model?

・High computational cost due to scale
・Resource consumption during inference
・Hardware requirements for operation

What are the best use cases?

・Advanced image analysis
・Understanding and processing visual information
・Open-source visual AI development

StepFunAI

Open Source

NextStep-1.1

Name: NextStep-1.1
Author: StepFunAI

NextStep-1.1 is a vision-language model (VLM) developed by StepFunAI. It is a foundation model with approximately 150B parameters, released under the Apache 2.0 license.

Parameters

150.0B

Context Window

License

Apache 2.0

Release Date

2025-12-24

API Pricing

API pricing for this model is not yet available

Strengths

・Large 150B parameter scale
・Advanced visual understanding capability
・Open Apache 2.0 license

Weaknesses

・High computational cost due to scale
・Resource consumption during inference
・Hardware requirements for operation
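
To make the hardware requirement concrete, here is a rough back-of-the-envelope sketch of the memory needed just to hold the weights. It takes the 150B parameter count listed on this page at face value and assumes standard bytes-per-parameter figures for each precision. Activations, caches, and framework overhead are not included, so real requirements would be higher. The function name `weight_memory_gb` is illustrative only.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 150e9  # parameter count as listed on this page (assumption)
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(params, dtype):,.0f} GB")
```

Under these assumptions the weights alone come to about 300 GB in bf16, which is more than a single commodity accelerator holds and would typically call for multi-device sharding or quantization.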

Use Cases

・Advanced image analysis
・Understanding and processing visual information
・Open-source visual AI development

Deep Analysis

Architecture

Autoregressive Image Generation

Continuous token generation

Training Steps

500K total

200K→500K from NextStep-1

Resolutions

256px + 512px

Multi-resolution training

Release Date

2025

ICLR 2026 Oral

Post-Training

NextStep-GRPO

Stabilized reinforcement learning

License

Open-source

Strengths

・ICLR 2026 Oral presentation
・SOTA autoregressive image generation with continuous tokens
・Addresses instability issues of original NextStep-1
・Significant improvement in image quality and text rendering
・Open-source research project

Weaknesses

・Research model, not production-ready
・Limited to image generation
・Requires significant compute for training

Competitor Comparison

Model	Arena	SWE	GPQA	Price
NextStep-1	-	-	-	Free (open)
DALL-E 3	-	-	-	Paid
Stable Diffusion 3	-	-	-	Free (open)

Overview

NextStep-1.1 is StepFun's improved autoregressive image generation model, presented as an ICLR 2026 Oral paper. It addresses the visualization failures seen in NextStep-1 through extended training (500K steps in total) and a stabilized post-training strategy, NextStep-GRPO.
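
This page does not describe how NextStep-GRPO works internally. As background only, the sketch below shows the group-relative advantage step that generic GRPO-style methods are built around: several samples are drawn for the same prompt, each is scored, and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` value, and the use of population standard deviation are assumptions for illustration, not details of NextStep-GRPO.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples drawn for the same prompt.

    Samples that score above the group mean get a positive advantage,
    samples below it get a negative one. eps avoids division by zero
    when every sample in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical reward scores for four images generated from one prompt.
    print(group_relative_advantages([0.2, 0.5, 0.5, 0.9]))
```

Because the baseline comes from the group itself, this step needs no separate value network. How NextStep-GRPO stabilizes training beyond this generic recipe is not stated here.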

Benchmarks & Performance

The model is reported to improve substantially over NextStep-1 in image quality, text rendering, and training stability. It is described as state of the art (SOTA) among autoregressive models that generate continuous tokens.

Detailed Comparison

NextStep-1.1 represents the state of the art in autoregressive image generation research. Its autoregressive, continuous-token approach sets it apart from diffusion-based models.
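
To illustrate what "autoregressive generation with continuous tokens" means in general, the toy sketch below produces a sequence one token at a time, where each token is a real-valued vector conditioned on the tokens before it, rather than an index into a discrete codebook. Everything here is a hypothetical stand-in: the sampler is a simple Gaussian around the running mean of the prefix, not NextStep-1.1's actual network or sampling head, which this page does not describe.

```python
import random


def next_continuous_token(prefix, dim, rng, noise_scale=0.1):
    """Toy stand-in for a model head: sample a real-valued vector given the prefix."""
    if prefix:
        # Condition on the prefix: centre the sample on the mean of earlier tokens.
        centre = [sum(tok[i] for tok in prefix) / len(prefix) for i in range(dim)]
    else:
        centre = [0.0] * dim
    return [c + rng.gauss(0.0, noise_scale) for c in centre]


def generate(num_tokens, dim, seed=0):
    """Autoregressive loop: each new token depends on all tokens generated so far."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(num_tokens):
        tokens.append(next_continuous_token(tokens, dim, rng))
    return tokens


if __name__ == "__main__":
    for tok in generate(num_tokens=4, dim=3):
        print([round(x, 3) for x in tok])
```

A diffusion-style model would instead refine an entire image representation over several denoising passes. The loop above emits the representation piece by piece in a fixed order, which is the structural difference the comparison refers to.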

Community Feedback

The paper was accepted as an ICLR 2026 Oral, and the project has an active research community. A pretrain checkpoint is available on HuggingFace.

Use Cases

Research into autoregressive image generation, high-quality image synthesis, and text-to-image generation with strong text rendering.

Latest News

NextStep-1.1 was presented at ICLR 2026 as an Oral. NextStep-GRPO addresses key stability issues in RL-based post-training for autoregressive generation.

Sources

Analysis generated: 2026-05-24