
DeepSeek launches V4 preview with 1.6 trillion-parameter flagship and aggressive pricing

The Chinese AI lab says its largest model approaches frontier performance while costing a fraction of closed-source rivals

by Defused News Writer

DeepSeek, the Chinese artificial intelligence laboratory whose V3 and R1 models upended the industry a year ago, has launched preview versions of its fourth-generation model family, introducing V4 Pro and V4 Flash with one million-token context windows and open-source weights.

V4 Pro is a mixture-of-experts (MoE) model totalling 1.6 trillion parameters, of which 49 billion are active per forward pass. That makes it the largest open-weight model available: more than double the size of its predecessor, V3.2, at 671 billion parameters, and larger than Moonshot AI's Kimi K2.6 at 1.1 trillion and MiniMax's M1 at 456 billion.

V4 Flash follows the same architecture at a smaller scale, with 284 billion total parameters and 13 billion active, designed for faster and cheaper inference.

The MoE approach activates only a subset of parameters for each token, keeping computational costs down despite the models' headline scale.
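As a rough illustration of how that works (a generic sketch, not DeepSeek's published architecture), a top-k mixture-of-experts layer routes each token to a handful of expert networks and leaves the rest of the weights untouched:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustration only)."""

    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.k = k

    def forward(self, x):                       # x: (n_tokens, d_model)
        scores = self.router(x)                 # (n_tokens, n_experts)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        gates = F.softmax(topk_scores, dim=-1)  # mixing weights for the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e   # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += gates[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
output = TinyMoE()(tokens)   # only 2 of the 8 expert networks run for each token
```

At 1.6 trillion total parameters with 49 billion active, only around 3 per cent of V4 Pro's weights participate in any single forward pass, which is why the compute cost grows far more slowly than the headline parameter count.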

DeepSeek said its V4-Pro-Max variant outperforms all open-source peers across reasoning benchmarks and exceeds OpenAI's GPT-5.2 and Google's Gemini 3.0 Pro on some tasks, while coding competition performance is "comparable to GPT-5.4."

Both V4 models are described as more efficient and performant than V3.2, with the company saying they have nearly "closed the gap" with current frontier models.

However, the lab acknowledged that the models lag behind GPT-5.4 and Gemini 3.1 Pro on knowledge benchmarks, noting a "developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months."

Both V4 Flash and V4 Pro support text only, unlike many closed-source competitors that offer multimodal capabilities across audio, video and images.

DeepSeek has deployed V4 Pro internally as the company's coding agent of choice, and employee feedback indicates performance surpassing Claude Sonnet 4.5, with output quality approaching Claude Opus 4.6 in non-thinking mode.

The models introduce a novel attention mechanism that compresses along the token dimension, combined with DeepSeek Sparse Attention, to achieve long-context performance while reducing computational and memory requirements.
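DeepSeek has not detailed the mechanism in the preview announcement, but the general idea of compressing along the token dimension can be sketched: pool the keys and values over blocks of tokens so each query attends to a shorter compressed sequence, shrinking the attention matrix. The snippet below is a simplified illustration of that principle, not the V4 implementation, and the block size is an arbitrary choice:

```python
import torch
import torch.nn.functional as F

def compressed_attention(q, k, v, block=4):
    """Toy attention that pools K/V along the token dimension (illustrative).

    q, k, v: (batch, seq_len, d). Pooling K/V into blocks of `block` tokens
    shrinks the attention matrix from seq_len x seq_len to seq_len x (seq_len / block).
    """
    b, n, d = k.shape
    pad = (-n) % block
    if pad:                                     # pad so seq_len divides evenly into blocks
        k = F.pad(k, (0, 0, 0, pad))
        v = F.pad(v, (0, 0, 0, pad))
    k_c = k.view(b, -1, block, d).mean(dim=2)   # (batch, n/block, d)
    v_c = v.view(b, -1, block, d).mean(dim=2)
    attn = torch.softmax(q @ k_c.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v_c

q = k = v = torch.randn(1, 1024, 64)
out = compressed_attention(q, k, v)             # each query attends to 256 compressed slots
```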

One million tokens is now the standard context length across all DeepSeek services.

Pricing undercuts every frontier model currently available.

V4 Flash costs $0.14 per million input tokens and $0.28 per million output tokens, while V4 Pro costs $0.145 per million input tokens and $3.48 per million output tokens, substantially below GPT-5.4, Gemini 3.1 Pro and Claude Opus 4.7.
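For a sense of scale, the listed rates translate into per-request costs like this (the workload sizes are hypothetical, chosen only for illustration):

```python
# Listed preview prices in USD per million tokens, from the announcement.
PRICES = {
    "V4 Flash": {"input": 0.14,  "output": 0.28},
    "V4 Pro":   {"input": 0.145, "output": 3.48},
}

def request_cost(model, input_tokens, output_tokens):
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical workload: 200k input tokens and 20k output tokens per request.
for model in PRICES:
    print(model, f"${request_cost(model, 200_000, 20_000):.4f} per request")
# V4 Flash $0.0336 per request
# V4 Pro   $0.0986 per request
```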

The launch comes a day after the United States accused China of stealing American AI laboratories' intellectual property on an industrial scale using thousands of proxy accounts, and follows accusations from Anthropic and OpenAI that DeepSeek has been "distilling" their models.

Both models are available now at chat.deepseek.com and via the API, with open weights published on Hugging Face.

The legacy model names deepseek-chat and deepseek-reasoner will be retired on 24 July, after which they will no longer be accessible.
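For developers calling the API by those legacy names, the change should amount to swapping the model string before the cutoff. DeepSeek's existing API follows OpenAI-compatible client conventions, so a minimal sketch looks like the following; the V4 identifier used here is a placeholder, since the announcement does not specify the new model names:

```python
from openai import OpenAI

# DeepSeek's existing API is accessed through OpenAI-compatible clients.
client = OpenAI(api_key="YOUR_DEEPSEEK_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    # "deepseek-chat" is retired on 24 July; replace it with whichever V4
    # identifier your account exposes (the name below is hypothetical).
    model="deepseek-v4-flash-preview",
    messages=[{"role": "user", "content": "Summarise this changelog."}],
)
print(response.choices[0].message.content)
```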

The recap

  • DeepSeek released two preview models, V4 Flash and V4 Pro.
  • V4 Pro totals 1.6 trillion parameters, 49 billion active.
  • Lab said development trails frontier models by 3 to 6 months.
