China's DeepSeek to Make 75% Discount on Flagship V4-Pro AI Model Permanent

Zhang Yushuo

DATE: 5 hours ago

SOURCE: Yicai

(Yicai) May 25 -- Chinese artificial intelligence startup DeepSeek said it will permanently cut the application programming interface price of its flagship V4-Pro model to a quarter of the original, putting it at the top of the global bang-for-buck ranking.

The V4-Pro's API price will stay at 2.5 Chinese cents (0.36 US cents) per million cache-hit input tokens under a discount that had been set to expire at the end of this month, Hangzhou-based DeepSeek announced on May 22. The price was set at CNY3 (44 US cents) per million cache-miss input tokens and CNY6 per million output tokens.
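To give a sense of what these rates mean for a bill, the arithmetic can be sketched in a few lines of Python. This is only an illustration based on the three prices quoted above. The function name, the dictionary keys, and the sample token counts are hypothetical and do not come from DeepSeek's API or documentation.

```python
from decimal import Decimal

# Prices in CNY per one million tokens, as quoted in the article.
# 2.5 Chinese cents equals CNY0.025.
PRICE_PER_MILLION_CNY = {
    "cache_hit_input": Decimal("0.025"),
    "cache_miss_input": Decimal("3"),
    "output": Decimal("6"),
}
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_cny(cache_hit_input=0, cache_miss_input=0, output=0):
    """Estimate a bill in CNY from token counts (hypothetical helper)."""
    usage = {
        "cache_hit_input": cache_hit_input,
        "cache_miss_input": cache_miss_input,
        "output": output,
    }
    return sum(
        PRICE_PER_MILLION_CNY[kind] * Decimal(count) / TOKENS_PER_MILLION
        for kind, count in usage.items()
    )


# Made-up workload: 2 million cache-hit input tokens,
# 500,000 cache-miss input tokens, and 100,000 output tokens.
cost = estimate_cost_cny(2_000_000, 500_000, 100_000)
print(f"Estimated cost: CNY{cost:.2f}")
```

For this made-up workload the three components are CNY0.05, CNY1.50, and CNY0.60, for a total of CNY2.15. Most of the bill comes from cache-miss input and output tokens, not from cache hits.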

DeepSeek released its next-generation flagship V4 model on April 24, with broad upgrades in inference performance, long-context processing, and agentic capabilities. The V4-Pro has achieved the best performance among open-source models on agentic coding benchmarks, and its output quality approaches that of Claude Opus 4.6 in non-thinking mode based on internal evaluations, according to the company.

DeepSeek's move runs counter to an industrywide trend of price increases, with Amazon, Microsoft, and some major Chinese cloud providers raising API prices by as much as 463 percent due to growing compute costs. High-bandwidth memory prices have risen more than sixfold over the past six months, while a sharp rise in inference-side token consumption driven by AI agents has pushed operating costs beyond what cloud providers can absorb through subsidies alone.

DeepSeek's price cut did not come from subsidized discounting but from structural cost reductions achieved through a fundamental architectural redesign, Securities Daily reported, citing analysts. Its proprietary sparse attention mechanism and mixture-of-experts architecture allow the V4 series to handle million-token long-context tasks at just 27 percent of the compute cost of its predecessor, with key-value cache memory usage reduced to 10 percent, they pointed out.
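The two percentages cited by the analysts can be read as reduction factors. The short sketch below does only that conversion. The helper name is hypothetical, and the calculation assumes both figures are measured against the same predecessor baseline, which the report implies but does not spell out.

```python
def reduction_factor(remaining_percent):
    """How many times smaller a cost is when it falls to
    remaining_percent of its baseline (hypothetical helper)."""
    return 100 / remaining_percent


# Figures cited in the article for million-token long-context tasks.
COMPUTE_REMAINING_PERCENT = 27   # compute cost relative to the predecessor
KV_CACHE_REMAINING_PERCENT = 10  # key-value cache memory relative to the baseline

compute_factor = reduction_factor(COMPUTE_REMAINING_PERCENT)
kv_cache_factor = reduction_factor(KV_CACHE_REMAINING_PERCENT)

print(f"Compute cost: about {compute_factor:.1f} times lower")
print(f"KV cache memory: about {kv_cache_factor:.0f} times lower")
```

On these figures, compute cost per long-context task is roughly 3.7 times lower and key-value cache memory about 10 times lower.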

DeepSeek has also deeply optimized its models for domestic AI chips, including Huawei Technologies' Ascend series, significantly lowering hardware procurement costs, the analysts said, adding that engineering improvements on the inference side have also allowed fixed costs to be spread across a larger usage base.

In addition, the price cut is a deliberate move by DeepSeek to lock in its position in the ecosystem. By dramatically lowering the barrier to API access, the company is drawing developers and enterprise users to build on its platform, aiming to establish a self-reinforcing cycle of lower prices, growing usage, a flourishing application ecosystem, and further cost reduction.

Editor: Martin Kadiev

Keywords: DeepSeek, API pricing, large language model, HBM, GPU, AI inference, MoE, Ascend, token cost, China AI
