Tokens Are Becoming New Standard to Measure Tech Firms’ Competitiveness
Liu Jia | Zheng Xutong | Lv Qian | Chen Yangyuan
SOURCE: Yicai

(Yicai) April 9 -- Tokens are gradually becoming the core unit for pricing and circulating artificial intelligence services, reshaping the business models of technology companies and emerging as a new standard for measuring their competitiveness.

The average daily token use of China’s artificial intelligence models exceeded 140 trillion last month, compared with 100 billion at the beginning of 2024, according to data released at the China Development Forum on March 23.

AI Is Transforming Tech Firms’ Business Models

User scale was once the undisputed metric of the mobile internet era: as long as traffic was large enough, monetization through advertising, e-commerce, games, and value-added services would follow naturally.

However, the data suggests a shift in the landscape. The average daily token usage of ByteDance’s Doubao large language model surpassed 120 trillion in March. The annual recurring revenue of Z.ai’s Model-as-a-Service application programming interface reached CNY1.7 billion (USD248.8 million) last year, a 60-fold increase from the year before, with the gross margin rising nearly fivefold to 18.9 percent.

With daily average token consumption exceeding 100 trillion, ByteDance has joined OpenAI and Google as the only tech companies globally to reach that milestone. ByteDance’s consumption has grown 1,000-fold in the past two years, and the number of enterprises on Volcano Engine each consuming over one trillion tokens has jumped to 140.

The rapid growth in token consumption is driven mainly by video creation, as all industries have marketing needs, and by popular agent products such as OpenClaw, Tan Dai, president of ByteDance’s cloud service business Volcano Engine, told Yicai.

A year ago, people would still categorize the applications of AI by industry, but now, products like OpenClaw have made AI accessible to every employee for purposes including recruitment, market analysis, and weekly reports, Tan noted, adding that it is no longer possible to define AI’s application scenarios in a single way.

Other AI companies, such as Z.ai, MiniMax, and Moonshot AI, have neither a cloud computing ecosystem nor consumer super-app entry points such as WeChat and Douyin, but they have demonstrated the commercial value of tokens in the business-to-business segment.

Z.ai raised the price of its API by 83 percent in the first quarter of this year from a year earlier, while its usage volume surged 400 percent over the period, achieving growth in both volume and price.

As apps like OpenClaw drive token consumption into exponential growth, the re-centralization of inference and the export of high-quality tokens will become new trends, said Zhang Peng, chief executive officer of Z.ai. “When an LLM is strong enough, the API itself is the best business model.”

For LLMs, user volume does not inherently lead to an increase in token consumption. A heavy-duty AI developer might consume more tokens in a day than 10,000 ordinary C-end users. User loyalty is determined by model efficiency, cost-effectiveness, and stability.

“Cloud computing providers have a dual identity: they are LLM providers and can produce tokens,” Li Qiang, vice president of Tencent Holdings, told Yicai. “Independent LLM providers need a large amount of computing power to produce tokens, which is mainly provided by cloud computing providers.”

In the past, demand mainly came from top LLM companies and embodied AI firms, Li noted, adding that as OpenClaw pushes AI from the dialogue layer to the execution layer, business users’ willingness to pay will grow.

Agents can replace humans in more core areas, including contract review, tender document writing, video reading, and internal auditing, he pointed out. AI-to-business has become the largest incremental market for all cloud vendors.

The significance of token consumption in the cloud business is also on the rise. MaaS revenue will become the largest income source of Alibaba Cloud Intelligent Group, predicted Wu Yongming, CEO of Alibaba Group Holding. Following Alibaba’s establishment of the Alibaba Token Hub business group led by Wu, Tencent also upgraded its MaaS platform to TokenHub.

However, the token data of traditional industry giants, such as Tencent, Alibaba, and Baidu, remains rather vague or even absent.

Token Consumption Is Not the Only Criterion

The number of tokens is not the sole unit of measure to evaluate AI. “No one should assume that all tokens are the same,” Liu Weiguang, senior vice president of Alibaba Cloud and president of its public cloud business, told Yicai.

With the same 1,000 tokens, the value created by AI writing humorous copy or engaging in casual chat differs vastly from that of a customer service agent using AI to solve real problems. The former mainly delivers emotional value, while the latter directly reduces costs and improves corporate efficiency.

If one model takes 100 sentences and 100,000 tokens to explain one thing, while an advanced model can do it in just five sentences and 1,000 tokens, it is obvious which one has greater value, an industry expert said. “Enterprises pay for tokens,” they noted. “Essentially, they are paying for enhanced productivity, reduced costs, and optimized efficiency.”
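The arithmetic behind that comparison is straightforward. A minimal sketch, using a hypothetical rate of USD 0.002 per 1,000 tokens (an assumed illustrative price, not any vendor’s actual rate):

```python
# Token-efficiency comparison with a hypothetical per-token price.
PRICE_PER_1K_TOKENS = 0.002  # assumed illustrative rate in USD, not a real quote


def task_cost(tokens_used: int, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Cost of completing one task, given the tokens it consumed."""
    return tokens_used / 1000 * price_per_1k


verbose_cost = task_cost(100_000)  # 100 sentences, 100,000 tokens -> USD 0.20
concise_cost = task_cost(1_000)    # 5 sentences, 1,000 tokens -> USD 0.002

# Same outcome at one hundredth of the cost for the more efficient model.
print(f"verbose: ${verbose_cost:.3f}, concise: ${concise_cost:.3f}")
```

At identical per-token prices, the efficiency gap translates directly into a 100-fold cost gap per task, which is why enterprises pay for outcomes rather than raw token volume.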

Tencent is not only focused on token consumption, Li explained to Yicai. “Assuming tokens are fuel, if you only focus on fuel consumption without considering the economic efficiency of building the engine, the cost for users may be very high, and they will eventually abandon it,” he said.

Tokens are not a sticky business, Li noted, adding that using low prices to attract customers is not a good strategy, as they will leave once the discounts stop.

Rather than focusing on tokens, it is better to concentrate on whether an agent as user-friendly as OpenClaw, but more secure, can be developed, Li believes. Effort should go into engine research and development, since a better engine with lower fuel consumption is what ultimately wins token demand.

There is still massive token usage in ‘invisible areas.’ For instance, financial institutions download open-source models to perform local tasks, such as bill recognition and risk control, which are not counted in cloud providers’ statistics. In-vehicle smart cockpit models complete conversations in a closed loop inside the vehicle to protect privacy.

The volume of API calls outside public clouds is at least five to 10 times that on public clouds, an expert predicted. “The biggest clients don’t use public APIs,” the expert told Yicai. “They run models within their own environments.”

Token-Based Billing Is Only a Phase

Behind the latest surge in token prices, there are both demand factors, such as a sharp increase in token usage, and supply-side reasons, especially the significant rise in core hardware procurement costs.

A staffer at an AI platform told Yicai that graphics processing unit chips are the core cost for LLM inference, and electricity bills are also a real expense. The electricity bill for a large inference cluster in a year is an astronomical figure, the person noted.

In addition, there are costs for research and development amortization, operation and maintenance, and security, the staffer said, adding that he believes that there is still a lot of room for token prices to drop, as chip computing power is increasing, model efficiency is improving, and the scale effect of infrastructure is accumulating.

However, cheap tokens do not equate to user-friendly AI. When the price of tokens is no longer a barrier, the focus of competition will shift to model capabilities, response speed, customization, and the understanding of specific industries.

Many industry insiders believe that charging users by token might just be a transitional stage, while the ideal model in the future will be to pay based on results.

Low-level tokens for simple conversations and lightweight tasks will move towards a low-cost or even free model, while high-level tokens with high complexity, reliability, and productivity capabilities will maintain their pricing power, said Liu Debing, chairman of Z.ai.

Software was once sold mainly by subscription, but it will shift to an API-call model with payment based on demand, said Zhu Hong, chief technology officer of Alibaba’s workplace messaging app DingTalk. DingTalk’s OpenClaw-like product Wukong will also consider business models such as pay-per-use and pay-per-result, he noted.
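The difference between the two billing models can be sketched as follows, under stated assumptions: the fee and per-token rate are hypothetical figures for illustration, not DingTalk’s or any vendor’s actual pricing.

```python
# Flat subscription vs pay-per-use API billing, with hypothetical numbers.
def subscription_cost(monthly_fee: float) -> float:
    """Fixed fee, independent of how much the service is used."""
    return monthly_fee


def pay_per_use_cost(calls: int, tokens_per_call: int, price_per_1k: float) -> float:
    """Cost proportional to actual consumption, billed per token."""
    return calls * tokens_per_call / 1000 * price_per_1k


flat = subscription_cost(50.0)                   # USD 50 regardless of usage
light = pay_per_use_cost(200, 2_000, 0.002)      # light user: USD 0.80
heavy = pay_per_use_cost(50_000, 2_000, 0.002)   # heavy user: USD 200.00

# Pay-per-use favors light users; heavy users may still prefer the flat fee.
```

This is why demand-based billing is attractive for occasional users while power users, whose consumption dwarfs the flat fee, create the pressure toward pay-per-result models.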

Tokens may disappear from ordinary users’ view in five years, but their value may live on in other forms, forecast Zhang Ting, head of products at Baidu’s Qianfan platform. With the development of multimodal AI, the definition of tokens will expand to image, audio, and video tokens, and the unit of measurement will become more complex, Zhang added.

From this perspective, the market has only just begun. Most manufacturing enterprises have not truly adopted AI yet, and the application depth of AI in industries, such as finance, healthcare, and education, is insufficient. AI-native enterprises are still on the eve of a major breakthrough, and tokens are merely the most fundamental units of AI.

Editor: Futura Costaglione

Keywords: AI, Token