China-Developed Large AI Models Are Still Weak in Complex Reasoning, Shanghai AI Lab Says

Liu Xiaojie

DATE: Jan 31 2024

SOURCE: Yicai

(Yicai) Jan. 31 -- Large artificial intelligence language models developed by Chinese tech firms still lag behind US AI firm OpenAI’s GPT-4 Turbo in complex reasoning but are competitive in terms of knowledge base and language capabilities, especially in Chinese, according to a recent study.

Chinese chatbots, such as Zhipu AI’s GLM-4, Alibaba Group Holding’s Qwen-Max and Baidu’s Ernie Bot 4.0, scored just below GPT-4 Turbo in a large AI model evaluation carried out by the Shanghai AI Laboratory, which released the latest version of its open-source evaluation system OpenCompass 2.0 yesterday.

But a small gap in scores does not mean that they have the same abilities as GPT-4 Turbo, Chen Kai, a scientist at the lab, told Yicai. The scores cover many aspects, and while China-developed large language models are close to GPT-4 Turbo in terms of knowledge base and language capabilities, they still have a long way to go to catch up in reasoning ability.

Even GPT-4 Turbo scored only 61.8 points out of 100, just above the passing mark, indicating that there is still a lot of room for chatbots to improve, the lab said, adding that the study did not include all large AI model developers, and more new models will be evaluated next time.

The ability to carry out complex reasoning determines how reliable a large AI model is, said Lin Dahua, a scientist at the lab. In finance, for instance, a model must not make mistakes. When a chatbot is used to analyze a company’s financial statements or industrial technical documents, inadequate mathematical calculation and analysis capabilities become a technical barrier.

“Many China-developed chatbots are only used in customer service and for chatting. Talking nonsense when chatting does not have an adverse impact, but such large models cannot be applied in serious business situations,” Lin said.

The Shanghai AI Lab first launched OpenCompass in July last year. It is one of four large AI model evaluation tools recommended by US tech giant Meta and the only one developed by a Chinese firm.

Editors: Tang Shihua, Kim Taylor

Keywords: GPT-4 Turbo, OpenCompass 2.0, Shanghai Artificial Intelligence Laboratory, LLM