经济学人 | 中国AI又迎来一个高光时刻,但仍落后于美国

China is having another AI moment

A new model has narrowed the gap with America

中国AI的又一个高光时刻

新模型缩小了与美国的差距

The Economist

China | Catching up

本文探讨中美人工智能竞赛（AI race）的最新态势，聚焦中国AI模型GLM 5.2的发布对全球AI格局的冲击(disruption)，并从能力、成本、可靠性三个维度分析中美AI的差距与趋势。

题目be having a moment意思是 to be having a time of success or popularity

【信息要点】

1. 在Deepseek之后，中国再度冲击AI赛场：北京智谱AI（Zhipu/Z.ai）发布GLM 5.2，是迄今最强中国AI模型，性能排名全球第四，成本不足Anthropic同类产品的十分之一，且开源发布，承诺"让每个人离前沿智能再近一步"（promising 'a step closer to frontier intelligence for everyone'）。相比起Deepseek，美国市场迄今对新模型兴趣寥寥，部分是因为准确评估中国模型变得愈发困难。

2. 美国仍保持领先，但差距在缩小：Anthropic公司的Fable 5在基准测试中平均领先GLM 5.2约17%，但私有基准测试显示中国模型实际落后8至10个月(about 8 to 10 months behind)——比公开测试显示的4至6个月差距更大。有外国机构还称中国实验室为在测试中取得高分对AI进行应试教育（teach to the test）.

3. 美国政府干预成新变量：特朗普政府禁止非美国人使用Fable 5，迫使Anthropic下架产品（remove the model from service），这是首次由一国政府单方面控制前沿AI的访问权（For the first time, access to frontier ai rests on one government's say-so），或推动全球用户重新考虑对美国AI的依赖度（rethink their dependence on American AI）从而转向中国替代品。

4. 文章指出，便宜并非绝对优势：中国模型虽单价低，但推理效率低——同等任务DeepSeek消耗的token数是OpenAI的23倍，综合成本未必更低。

5. 差距稳定但未扩大：最新评测显示美国领先优势持续，但未如预期进一步拉大。Anthropic的Fable 5已经强大到引发了政府干预，而中国的AI模型还未引发ZF监管，作者认为这也反映了中国AI落后于美国。

America's lead over China in artificial intelligence may beat its smallest in over a year. When China disrupted the ai race in January 2025 with the release of DeepSeek r1, it erased $1trn from America's capital markets.

美国在人工智能领域对中国的领先优势，可能已缩小至一年多来的最低水平。2025年1月，中国发布DeepSeek r1，搅动了AI竞赛格局，令美国资本市场蒸发1万亿美元。

Nvidia, a chip firm, briefly shed 17% of its value; the Nasdaq sank by 3.1% in a day.

芯片巨头英伟达股价一度暴跌17%，纳斯达克指数单日下跌3.1%。

American investors were troubled not only because Chinese ai was good, but because it was being given away free. The uproar soon faded. Since then, market valuations everywhere have hinged ever more onthe promise that ai will be both revolutionary and profitable.

美国投资者的担忧不仅在于中国AI性能出色，更在于它是免费开放的。然而这场喧嚣很快平息。此后，各地市场估值愈发押注于AI将同时具备革命性与盈利性这一承诺。"

Now Chinese labs are unsettling their American rivals anewin the race to monopolise the market for models. On June 13th a Beijing-based lab called Zhipu, or Z.ai, announced its latest system, glm 5.2, promising 'a step closer to frontier intelligence for everyone'. It is the most capable Chinese-trained model to date and runs at less than a tenth of the price of Anthropic's latest release, Fable 5. And as with other Chinese models the weights, or parameters, that enable glm 5.2 to function have been publicly released. With this new model, China is competing on ability, cost and openness.

如今，中国实验室再度令美国竞争对手坐立不安，双方在争夺模型市场垄断地位的竞赛中再起波澜。6月13日，北京实验室智谱AI（Z.ai）发布最新系统GLM 5.2，承诺"让每个人离前沿智能再近一步"。这是迄今为止能力最强的中国训练模型，运行成本不足Anthropic最新产品Fable 5的十分之一。与其他中国模型一样，使GLM 5.2运行的权重（即参数）已公开发布。凭借这款新模型，中国正在能力、成本和开放性三个维度上与美国展开竞争。"

In recent weeks American companies have been grappling with soaring ai costs, sometimes ranging into the thousands of dollars per employee. Some firms are setting budgets for tokens (bits of text processed by a model). Then on June 12th the Trump administration banned non-Americans from using Fable 5, leading Anthropic to remove the model from service.For the first time, access to frontier ai rests on one government's say-so. All this may give users reasons to look at alternatives to American ai. Many will find glm 5.2 capable and affordable, and welcome that it is out of the Trump administration's reach.

近几周，美国企业一直苦于应对飙升的AI成本，有时每名员工的花费高达数千美元。一些公司开始为token（模型处理的文本片段）设定预算。随后，6月12日，特朗普政府禁止非美国人使用Fable 5，导致Anthropic将该模型下架。这是有史以来首次，前沿AI的访问权取决于一国政府的一句话。这一切或许将促使用户寻找美国AI的替代品。许多人将发现GLM 5.2既能干又实惠，并对其不受特朗普政府管辖感到欣慰。"

Start with capability. Artificial Analysis, a research firm, ranks glm 5.2 as the most intelligent open-source model on the market. glm 5.2 takes an impressive fourth place on its overall list, behind Openai's Chatgpt 5.5 and ahead of Google's Gemini bot. The model has surprised everyone. Earlier this year Chinese developers were pessimistic about the prospect of their models outclassing American ones before 2030. After Zhipu's release, Elon Musk wrote on X that he expects China to match the abilities of the current frontier by early next year. It 'won't take that long', Tang Jie, Zhipu's co-founder, shot back.

先来看能力维度。研究机构Artificial Analysis将GLM 5.2列为市场上最智能的开源模型。GLM 5.2在其总榜中以令人印象深刻的第四名落座，位于OpenAI的ChatGPT 5.5之后、谷歌Gemini之前。这款模型让所有人大吃一惊。今年早些时候，中国开发者对本国模型能在2030年前超越美国模型还持悲观态度。智谱发布GLM 5.2后，埃隆·马斯克在X平台发文称，他预计中国将在明年初与当前前沿水平持平。智谱联合创始人唐杰随即反击："不需要那么久。""

Unlike in the DeepSeek moment, American markets have so far shown little interest in glm 5.2. This is partly because it has become more difficult to assess Chinese models accurately. To arrive at its estimates, Artificial Analysis scored glm 5.2 on dozens of benchmark tests, which use exam-like questions to evaluate a model's smarts. America, via Anthropic, keeps its edge in performance. Fable 5 is about 17% cleverer than glm 5.2 across an average of benchmark tasks. The other important metric is how long it took glm 5.2 to reach this level of intelligence. A comparable Western model to glm 5.2 was released in February, or about four months ago.

与DeepSeek事件时不同，美国市场迄今对GLM 5.2兴趣寥寥。这部分是因为准确评估中国模型变得愈发困难。为得出估算结果，Artificial Analysis对GLM 5.2进行了数十项基准测试——这些测试使用类似考试的题目来评估模型的智能水平。美国通过Anthropic依然保持性能领先优势。Fable 5在各项基准任务平均水平上比GLM 5.2聪明约17%。另一个重要指标是GLM 5.2达到这一智能水平所用的时间。与GLM 5.2水平相当的西方模型于2月发布，即约四个月前。"

In reality, America's lead is probably bigger than four months. Open-source models, many of them Chinese, tend to score better on public benchmarks than private ones, says Havard Tveit Ihle of the Norwegian Defence Research Establishment. The questions used in public benchmark tests are published, whereas those who apply private benchmarks keep their evaluations secret. Analysis by Dr Tveit Ihle found that Chinese models were about four to six months behind American ones on public tests. But on private tests America's lead nearly doubled, to eight to ten months. A study by the American government, released in May, identified a similar gap. Mr Tveit Ihle says Chinese labs appear, possibly unwittingly, to 'teach to the test'.

现实情况是，美国的领先优势可能不止四个月。挪威国防研究所的Havard Tveit Ihle表示，开源模型（其中许多来自中国）在公开基准测试中的得分往往优于私有测试。公开基准测试的题目是公开的，而使用私有基准测试的机构则对评估内容保密。Tveit Ihle博士的研究发现，中国模型在公开测试中落后美国约四至六个月，但在私有测试中，美国的领先优势几乎翻倍，达八至十个月。美国政府5月发布的一项研究也发现了类似差距。Tveit Ihle先生表示，中国实验室似乎在不自觉地"应试教育"。"

Given the uncertainties surrounding the true capabilities of Chinese models, next consider whether they are truly cheaper than their American rivals. DeepSeek charges just $0.87 per 1m output tokens for its v4 model, whereas Anthropic charges $50 for the same on Fable 5. Such prices might have a growing appeal in America, where token costs at some firms have run out of control. In June DeepSeek saw a sharp rise in American firms paying for its services. Yet this most important assumption, that Chinese ai is cheaper, can frequently be wrong.

鉴于中国模型真实能力存在诸多不确定性，下面再来考量一个问题：它们是否真的比美国竞品更便宜？DeepSeek的v4模型每百万输出token仅收费0.87美元，而Anthropic的Fable 5同等服务收费50美元。这样的价格在美国或许越来越有吸引力——部分企业的token开销已经失控。6月，DeepSeek迎来美国企业付费使用量的大幅增长。然而，"中国AI更便宜"这一最重要的假设，实际上常常并不成立。"

Though Chinese models are becoming more capable, they are generally not becoming more efficient. Chinese models use many more tokens to think through their answers. A study updated this month by Du Zheng of Georgia Tech shows that given the same tasks, a DeepSeek model used 23 times more tokens than its Openai rival to achieve basically the same result. Because of these large differences in efficiency, the correct way to compare models is not price per token but the total cost of all the tokens used. Using this metric, on a benchmark designed to test software engineering, glm 5.2 ended up costing more than systems from Anthropic and OpenAI.

尽管中国模型能力日益增强，但总体上并未变得更高效。中国模型需要消耗更多token来"思考"作答。佐治亚理工学院杜峥本月更新的一项研究显示，在相同任务下，DeepSeek模型消耗的token数量是OpenAI竞品的23倍，而最终结果基本相同。由于效率差距如此巨大，比较模型的正确方式不是每token的价格，而是完成任务所消耗全部token的总费用。以此为衡量标准，在一项软件工程基准测试中，GLM 5.2的实际费用甚至高于Anthropic和OpenAI的系统。"

In addition to capability and cost, a third selling-point is now top of mind for ai users: reliability. Zhipu released its model at 5:21pm Beijing time on June 13th, one day after the Trump administration told Anthropic that it was banning non-Americans from using Fable 5. 'Our attitude is one of radical openness,' Mr Tang declared. He also blasted 'external blockades', such as the one imposed by Anthropic and the American government, saying they made ai systems 'subject to revocation at any moment'. The Fable 5 shutdown could help Chinese labs as firms around the world rethink their dependence on American AI.

除能力和成本之外，AI用户如今最关心的第三个卖点是：可靠性。智谱于北京时间6月13日下午5点21分发布模型——就在特朗普政府通知Anthropic禁止非美国人使用Fable 5的次日。唐杰宣称："我们的态度是彻底开放。"他还强烈批评了"外部封锁"——例如Anthropic与美国政府所施加的限制——称这使AI系统"随时可能被撤销"。Fable 5的下架或将助力中国实验室，因为全球企业正在重新审视自身对美国AI的依赖。"

As the ai race speeds up, regulators everywhere will be faced with new challenges to safety and security. The risk of sudden government intervention may grow. Fable 5 was powerful enough to prompt such a response. That Chinese models are not, for now, facing similar regulatory risk suggests China's government is not yet alarmed enough to act. That may be some of the clearest evidence that they remain behind their rivals.

随着AI竞赛加速，各地监管机构将面临安全与保障方面的新挑战。政府突然干预的风险或将上升。Fable 5的能力强大到足以引发如此反应。而中国模型目前尚未面临类似监管风险，这恰恰说明中国政府还未紧张到需要采取行动的程度。这或许是中国仍落后于竞争对手的最有力证据之一。"

往期回顾

在国外获大奖影片，临上映前遭全网抵制！《监狱里的妈妈》凉了，《经济学人》却说网友玻璃心

读外刊看世界 | 学校课程在AI时代落伍了？为何越来越多人选择home-schooling（在家上学）

外刊速递 | 存储大厂造富神话！同行奖金或达年薪 6 倍，薪酬落差之下三星员工发起罢工抗议