(Yicai) May 29 -- As DeepSeek drives artificial intelligence models toward practical applications, demand for AI computing from mainstream users in China is gradually shifting to inference, creating a new market opportunity for domestic chips, according to industry insiders.
Chinese data centers have until now mainly used Nvidia’s semiconductor chips because of the high performance requirements of AI model training, with only a few using domestically developed alternatives, mostly as backup, Zhou Zhengang, vice president of International Data Corporation China, told Yicai.
But following the release of the open-source, high-performance DeepSeek-R1 model, more users have begun deploying large language models in real-world scenarios, driving demand for AI inference computing power, Zhou noted, adding that this shift creates an opportunity for less powerful and more affordable homegrown chips.
DeepSeek-R1 was released by its Chinese developer in January, an event described by Silicon Valley venture capitalist Marc Andreessen as an “AI Sputnik moment,” as the AI delivered performance on par with leading LLMs but at much lower cost. Hangzhou-based startup DeepSeek released an upgrade on open-source community Hugging Face earlier today.
AI inference requires less computing power than AI training, making chips less powerful than Nvidia’s H100 and H800 viable, said Rocky Cheng, chief executive of Cyberport’s Artificial Intelligence Supercomputing Center, Hong Kong’s largest AI supercomputing hub, which mainly serves local universities, research institutions, and enterprises.
In the context of AI, inference refers to the process whereby a trained AI model uses the knowledge it has learned to make predictions or decisions on new and previously unseen data.
Chinese internet firms and telecoms carriers procured large quantities of domestic compute accelerator cards last year but quickly found them to be unpopular with users, Zhou said. However, after DeepSeek-R1 was adopted on a large scale, all these resources were put into use in the first quarter of the year.
“The number of teams developing LLMs like DeepSeek-R1 will gradually decrease, and demand for inference capabilities will become mainstream,” Cheng noted. “Our center’s next-phase computing configuration will prioritize this growing need.”
“Domestically developed chips will account for over 40 percent of compute cards deployed in Chinese data centers in the first half of this year and exceed 50 percent soon,” Zhou predicted. “This was unimaginable just two years ago.”
Last year, more than 65 percent of the cards used in Chinese data centers were made by California-based Nvidia.
Editor: Futura Costaglione