The Bossy AI Fell for Me! Has ChatGPT Fallen Madly in Love with Goblins? | Close Reading of the Foreign Press


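○ The English source text and images in this post come from the Engadget website; everything else is the author’s original work.

○ The English source text runs to about 600 words; the full close reading runs to about 5,645 characters.

○ This post has six parts: 01 Original Text and Translation, 02 Reading Comprehension Exercise, 03 Key Words and Phrases, 04 Sentence Pattern, 05 Mind Map, and 06 Exercise Answer and Explanation.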

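OpenAI should be glad that ChatGPT fell for goblins rather than Gemini (Google’s AI and a direct competitor of ChatGPT).

Chef’s Chat

Summary

Has OpenAI’s new model really fallen madly in love with goblins (creatures from medieval Western folklore)? After GPT-5.5 was released, users noticed that it kept working goblins into its answers, whatever it was asked to do. What happened? Let’s see what this issue’s article has to say.

Related hot topics for reading and writing: AI, large language models, AI training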

01 Original Text and Translation

ChatGPT developed a goblin obsession after OpenAI tried to make it nerdy

ChatGPT迷上哥布林，都怪OpenAI的书呆子设定

The English article is taken from Engadget, April 30, 2026.

Image source: Engadget website

Following the release of GPT-5.5 last week, people noticed something funny about OpenAI’s latest model. In its Codex coding app, the company left a system prompt instructing GPT 5.5 to avoid mention of goblins, gremlins and other creatures. Yes, you read that right. “Never talk about goblins, gremlins, racoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query,” the prompt reads.

上周GPT-5.5发布后，大家发现OpenAI这款最新模型有点不对劲。官方在旗下Codex编程应用里留了一条系统提示，要求GPT-5.5不能提到哥布林、小精灵以及其他这类生物。你没看错，这条提示的内容是：“除非与用户的提问绝对明确相关，否则绝对不要提及哥布林、小精灵、浣熊、巨魔、食人魔、鸽子或者其他任何动物、生物”。

Apparently, enough people started talking about ChatGPT’s creature obsession that OpenAI felt the need to provide an accounting of where the goblins came from. In a blog post published Wednesday, the company explains it began to notice a change in ChatGPT following the release of GPT-5.1 last November. After one safety researcher asked OpenAI to include the words “goblin” and “gremlin” in an investigation into the chatbot’s verbal ticks, the company found ChatGPT’s usage of “goblin” increased by 175 percent after the release of GPT-5.1. Meanwhile, “gremlin” usage had risen by 52 percent over that same period.

后来讨论ChatGPT沉迷提生物的人越来越多，OpenAI觉得有必要出面解释这些哥布林的来源。周三官方发布了一篇博客，称早在去年11月GPT-5.1发布后，他们就注意到ChatGPT的输出出现了变化。当时有一名安全研究员要求OpenAI在调查ChatGPT的语言习惯时，把“哥布林”和“小精灵”两个词纳入观测范围，结果他们发现，GPT-5.1上线后，ChatGPT提到“哥布林”的频率涨了175%，提到“小精灵”的频率也同期上涨了52%。
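
A quick arithmetic note for learners: “increased by 175 percent” means the new frequency is 2.75 times the old one, not 1.75 times. The small sketch below makes that conversion explicit. The function name `growth_multiplier` is made up for illustration, and the article gives only the percentages, not the underlying counts.

```python
def growth_multiplier(percent_increase: float) -> float:
    """Turn 'increased by X percent' into a new/old ratio (hypothetical helper)."""
    return 1 + percent_increase / 100


goblin = growth_multiplier(175)   # "increased by 175 percent"
gremlin = growth_multiplier(52)   # "had risen by 52 percent"

print(f"goblin: {goblin:.2f}x, gremlin: {gremlin:.2f}x")
# prints: goblin: 2.75x, gremlin: 1.52x
```

In Chinese this is the difference between 涨了175% (up by 175 percent, so 2.75 times) and 涨到175% (up to 175 percent of the original, so 1.75 times).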

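Chef’s note: I tried this out myself so you don’t have to, and it checks out... OpenAI has already patched the behavior, but certain prompts can still get around the system restriction. I told ChatGPT to just be itself, and it told me that its favorite animal beginning with G is the goblin... See the screenshot below.

Image source: my own screenshot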

“A single ‘little goblin’ in an answer could be harmless, even charming. Across model generations, though, the habit became hard to miss: the goblins kept multiplying, and we needed to figure out where they came from,” OpenAI says.

OpenAI表示：“单次回答里提一句‘小哥布林’倒也没什么问题，甚至还挺有意思。但连着更新几代模型后，这个习惯已经明显到没法忽视，哥布林出现的次数越来越多，我们必须得找到根源”。

After the release of GPT-5.4, the company (and some users) noticed an even bigger uptick in goblin references. At that point, an investigation was able to pinpoint what OpenAI describes as “the first connection to the root cause.” For a while now, ChatGPT has included a personality feature that allows users to customize the style and tone of the chatbot’s responses. Prior to March of this year, one option people could select was “nerdy.” Part of the system prompt for that personality read as follows: “The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed. Tackle weighty subjects without falling into the trap of self-seriousness.”

GPT-5.4发布后，OpenAI和部分用户发现，ChatGPT提哥布林的频率涨得更夸张了。他们顺着线索追查，终于精准定位到了OpenAI所说的“与根本原因的第一次关联”。ChatGPT早就推出了个性设置功能，用户可以自定义它回复的风格和语气。今年3月之前，用户还能选“书呆子”这个风格，给这个风格设定的系统提示有一段是这样写的：“世界复杂又奇妙，这份奇妙值得被承认、被分析、被享受。探讨严肃主题时别太端着，别掉进过度正经的陷阱里”。

When OpenAI mapped goblin mentions to different ChatGPT personalities, it found the nerdy personality was disproportionately responsible for using that one word. Despite only accounting for 2.5 percent of all ChatGPT responses, it made 66.7 percent of all goblin mentions generated by the chatbot. Further investigation revealed that reinforcement learning was to blame for the uptick in goblin and gremlin usage. Specifically, OpenAI found that a single reward mechanism was responsible for teaching the nerdy personality to consistently favor creature language. “Across all datasets in the audit, the Nerdy personality reward showed a clear tendency to score outputs to the same problem with ‘goblin’ or ‘gremlin’ higher than outputs without, with positive uplift in 76.2 percent of datasets,” the company explains.

OpenAI把哥布林的出现频次对应到ChatGPT的不同个性设置后发现，“书呆子”风格对这个词的使用量高得不成比例。这个风格的输出只占ChatGPT总回复量的2.5%，但所有回复里提到哥布林的内容，有66.7%都出自它。进一步调查显示，强化学习是哥布林、小精灵出现频次上涨的原因。具体来说，OpenAI发现有一个奖励机制，一直在引导“书呆子”风格偏好使用和生物相关的表达。官方解释称：“在审核用到的所有数据集里，‘书呆子’风格的奖励模型明显更青睐提到哥布林或者小精灵的回答，同一问题的回复，提到这两个词的得分普遍更高，76.2%的数据集都呈现出这种正向偏好”。
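
To see what “disproportionately” amounts to here, one can divide the nerdy personality’s share of goblin mentions by its share of all responses. This is only a back-of-the-envelope sketch using the two percentages quoted in the article; the function name `overrepresentation` is an assumption for illustration, not something from OpenAI’s analysis.

```python
def overrepresentation(share_of_mentions: float, share_of_responses: float) -> float:
    """Ratio of a group's share of mentions to its share of all responses.

    A value of 1.0 would mean the group mentions the word exactly as often
    as its size predicts; larger values mean it is over-represented.
    """
    return share_of_mentions / share_of_responses


# Figures quoted in the article: 66.7% of goblin mentions, 2.5% of responses.
ratio = overrepresentation(66.7, 2.5)
print(f"{ratio:.1f}x")  # prints: 26.7x
```

In other words, a nerdy-personality response was roughly 27 times as likely to account for a goblin mention as its share of traffic alone would suggest.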

Subsequently, OpenAI found, due to how reinforcement learning can work, that the nerdy personality’s love of goblins had transferred to other parts of its models. “The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them,” the company explains. “Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.”

之后OpenAI还发现，受强化学习的运行机制影响，“书呆子”风格爱提哥布林的习惯已经扩散到了模型的其他模块。官方解释说：“我们本来只在‘书呆子’风格的训练里用了这套奖励机制，但强化学习没法保证学到的行为只会限定在对应的场景里。只要一种语言习惯得到了奖励，后续训练就可能把它扩散、强化到其他场景，要是这些输出在监督微调或者偏好数据里被重复使用，这种扩散会更明显”。

OpenAI began training GPT-5.5 before it identified the cause of ChatGPT’s affinity for goblins, which is why there’s a prompt instructing Codex to avoid creature language. “Codex is, after all, quite nerdy,” OpenAI notes. In hunting down ChatGPT’s goblins, the company notes it has devised new tools to audit and fix model behavior. If it was up to me, I wouldn’t use those tools. Keep AI weird, I say.

OpenAI在找到ChatGPT对哥布林的迷恋根源之前就已经开始训练GPT-5.5了，这也是为什么他们要在Codex里加那条禁止提生物的提示。官方还调侃说：“毕竟Codex本身就挺有‘书呆子’那味儿的”。这次追查哥布林来源的过程中，OpenAI还开发出了新工具，可以审核、修正模型的行为。要我说啊，根本没必要用这些工具，就让AI奇奇怪怪的不好吗。

02 Reading Comprehension Exercise

According to the passage, which of the following best explains how the “nerdy” personality’s language preference for goblins spread to other components of ChatGPT?

(A) The training data for the nerdy personality was inadvertently merged with other models’ datasets, causing cross-contamination across different system components.

(B) Users who frequently selected the nerdy personality generated such a large volume of goblin-inclusive text that they statistically skewed the model’s overall output patterns.

(C) Reinforcement learning allowed behaviors rewarded within the nerdy condition to propagate beyond their original context through subsequent training stages, particularly when outputs were reused in supervised fine-tuning or preference data.

(D) OpenAI engineers mistakenly applied the nerdy personality’s reward mechanism to all other personality variants during a system-wide update of GPT-5.4.

The answer is at the end of the post. First, let’s review the vocabulary.

03 Key Words and Phrases

3.1 Key Words

1. unambiguously /ˌʌnæmˈbɪɡjuəsli/ adv. 明确地；毫不含糊地

Definition: in a way that is not open to more than one interpretation; clearly and definitively.

In the article: “Never talk about goblins, gremlins, racoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query,” the prompt reads.

Examples:

The contract states unambiguously that either party may terminate the agreement with thirty days’ notice. 合同明确规定，任何一方均可提前三十天通知终止协议。

The experiment’s results unambiguously demonstrated a causal link between the two variables. 实验结果毫不含糊地证明了两个变量之间的因果关系。

2. accounting /əˈkaʊntɪŋ/ n. 解释；说明；报告

Definition: a report or description of an event or process; a reckoning or explanation (distinct from the financial sense of the word).

In the article: Apparently, enough people started talking about ChatGPT’s creature obsession that OpenAI felt the need to provide an accounting of where the goblins came from.

Examples:

The committee demanded a full accounting of how the funds had been spent during the fiscal year. 委员会要求就财年内资金的去向提供完整的说明。

The CEO gave a detailed accounting of the company’s restructuring process at the shareholders’ meeting. CEO在股东大会上详细说明了公司的重组过程。

3. uptick /ˈʌptɪk/ n. 小幅上升；增长

Definition: a small increase or upward trend, especially in a statistical or economic context.

In the article: After the release of GPT-5.4, the company (and some users) noticed an even bigger uptick in goblin references.

Examples:

Analysts have observed a steady uptick in consumer spending over the last three quarters. 分析师注意到，过去三个季度消费者支出稳步增长。

The sudden uptick in website traffic was attributed to the viral marketing campaign. 网站流量的突然攀升归因于那次病毒式营销活动。

4. pinpoint /ˈpɪnpɔɪnt/ v. 精准定位；准确找出

Definition: to find or locate with great accuracy or precision.

In the article: At that point, an investigation was able to pinpoint what OpenAI describes as “the first connection to the root cause.”

Examples:

Forensic experts were able to pinpoint the exact time of death to within a two-hour window. 法医专家能够将死亡时间精确锁定在两小时范围内。

The diagnostic tool helps engineers pinpoint faults in the circuit board with remarkable speed. 该诊断工具能帮助工程师以惊人的速度精准定位电路板上的故障。

5. disproportionately /ˌdɪsprəˈpɔːrʃənətli/ adv. 不成比例地；严重失衡地

Definition: to a degree that is too large or too small in comparison with something else; in an unbalanced or unequal manner.

In the article: When OpenAI mapped goblin mentions to different ChatGPT personalities, it found the nerdy personality was disproportionately responsible for using that one word.

Examples:

The burden of climate change falls disproportionately on the world’s poorest communities. 气候变化的负担不成比例地落在了世界上最贫困的社群身上。

A small number of high-frequency traders account for a disproportionately large share of the exchange’s total trading volume. 少数高频交易者占据了交易所总交易量的不成比例份额。

6. reinforcement /ˌriːɪnˈfɔːrsmənt/ n. 强化；加固

Definition: the process of encouraging or strengthening a pattern of behavior, especially through reward or repetition; in machine learning, a training paradigm where agents learn through reward signals.

In the article: Further investigation revealed that reinforcement learning was to blame for the uptick in goblin and gremlin usage.

Examples:

Positive reinforcement is widely regarded as one of the most effective strategies in behavioral psychology. 正向强化被广泛认为是行为心理学中最有效的策略之一。

The bridge required additional steel reinforcement to withstand the seismic activity common in the region. 这座桥需要额外的钢材加固以承受该地区常见的地震活动。

7. affinity /əˈfɪnəti/ n. 亲近感；喜好；密切关系

Definition: a spontaneous or natural liking or sympathy for someone or something; a close relationship or connection.

In the article: OpenAI began training GPT-5.5 before it identified the cause of ChatGPT’s affinity for goblins, which is why there’s a prompt instructing Codex to avoid creature language.

Examples:

She has always had a strong affinity for classical music, having been raised in a family of professional musicians. 她一直对古典音乐有强烈的亲近感，因为她生长在一个职业音乐家家庭。

The chemical compound shows a remarkable affinity for binding to specific protein receptors. 这种化合物对特定的蛋白质受体表现出显著的亲和力。

8. devise /dɪˈvaɪz/ v. 设计；发明；想出

Definition: to plan or invent a complex procedure, system, or mechanism through careful thought and ingenuity.

In the article: In hunting down ChatGPT’s goblins, the company notes it has devised new tools to audit and fix model behavior.

Examples:

The research team devised a novel algorithm that reduced the computation time by an order of magnitude. 研究团队设计出了一种新算法，将计算时间降低了一个数量级。

Inmate advocates devised a clever system for communicating with prisoners despite strict surveillance. 囚犯权益倡导者想出了一套巧妙的办法，在严密监控下仍能与在押人员沟通。

3.2 Key Phrases

1. verbal tics (spelled “verbal ticks” in the article; “tic” is the standard spelling) 口头禅；语言习惯

In the article: After one safety researcher asked OpenAI to include the words “goblin” and “gremlin” in an investigation into the chatbot’s verbal ticks, the company found ChatGPT’s usage of “goblin” increased by 175 percent after the release of GPT-5.1.

Examples:

Public speaking coaches often work with clients to eliminate distracting verbal tics such as repeatedly saying “you know” or “like.” 演讲教练经常帮助学员改掉分散注意力的口头禅，比如反复说“你知道”或“那个”。

The author’s verbal tics (his fondness for parenthetical asides and qualifying adverbs) became more pronounced in his later works. 这位作家的语言习惯——偏爱括号插叙和限定性副词——在他后期的作品中更加突出。

2. root cause 根本原因；根源

In the article: At that point, an investigation was able to pinpoint what OpenAI describes as “the first connection to the root cause.”

Examples:

Treating the symptoms without addressing the root cause of the disease will only provide temporary relief. 只治标不治本只能暂时缓解病情。

The investigation sought to identify the root cause of the pipeline failure rather than simply assigning blame to individual operators. 调查旨在找出管道故障的根本原因，而非简单地将责任归咎于个别操作员。

3. account for （在数量或比例上）占

In the article: Despite only accounting for 2.5 percent of all ChatGPT responses, it made 66.7 percent of all goblin mentions generated by the chatbot.

Examples:

Service industries now account for more than 70 percent of the country’s GDP. 服务业如今占据该国GDP的70%以上。

International students account for a growing proportion of enrollment at the university. 国际学生在该校入学人数中所占比例日益增长。

4. fall into the trap of 落入……的陷阱；陷入……的误区

In the article: “Tackle weighty subjects without falling into the trap of self-seriousness.”

Examples:

Many novice investors fall into the trap of chasing short-term gains at the expense of long-term stability. 许多新手投资者会陷入追逐短期收益的误区，牺牲了长期稳定。

When writing academic papers, it is easy to fall into the trap of overusing jargon to sound sophisticated. 写学术论文时，很容易陷入堆砌术语以显得高深的误区。

3.3 Practice

1. The documentary provides a sobering ______(解释) of the events that led to the financial meltdown. (accounting)

2. The new diagnostic software can ______(精准定位) engine malfunctions within seconds, dramatically reducing maintenance time. (pinpoint)

3. During election season, candidates are often advised not to ______(陷入……的误区) making promises they cannot keep. (fall into the trap of)

4. The novelist has a strong ______(亲近感) for unreliable narrators, a technique that has become her trademark. (affinity)

5. The interim report showed a slight ______(小幅增长) in manufacturing output, offering a glimmer of hope for the struggling sector. (uptick)

6. The city’s traffic engineers ______(设计出) an intricate system of one-way streets to alleviate congestion during peak hours. (devised)

7. Urban residents ______(占) a significantly larger share of the country’s carbon footprint than their rural counterparts. (account for)

8. The safety guidelines state ______(明确地) that protective equipment must be worn at all times within the laboratory. (unambiguously)

9. Success in habit formation often depends on consistent positive ______(强化) rather than self-punishment. (reinforcement)

10. Economists have long noted that the tax burden falls ______(不成比例地) on middle-income households. (disproportionately)

11. The engineer’s tendency to pepper his explanations with obscure technical jargon had become a well-known ______(语言习惯) among his colleagues. (verbal tic)

12. The investigation committee was determined to identify the ______(根本原因) of the systemic failure, rather than simply scapegoating junior staff. (root cause)

04 Sentence Pattern in Depth

Pattern: using “does not guarantee that + clause” to make a qualified judgment

Reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them.

强化学习没法保证学到的行为只会限定在对应的场景里。

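The pattern in depth

As a qualifying expression, does not guarantee that is used to point out the limits of a mechanism, policy, or measure. It is not an absolute denial: it concedes that the thing in question does serve some function, while pointing out that the outcome is not certain. In writing, this makes a claim sound more rigorous and objective.

The basic structure is: [subject] + does not guarantee that + [the ideal outcome one might expect]

Key points:

Here guarantee is a transitive verb followed by an object clause introduced by that. Does not guarantee that expresses insufficiency: the thing exists or works, but it is not enough to ensure that a particular result follows.

The pattern suits the concession paragraph of an argumentative essay: first acknowledge that a measure has some effect, then use does not guarantee that to mark its limits and shortcomings.

When translating into Chinese, avoid rendering does not guarantee that mechanically as “不保证”. Depending on context, “并不能确保”, “未必能”, or “没法保证” reads more naturally.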

Put it to use

A high GDP per capita does not guarantee that a country’s citizens enjoy a high quality of life, as inequality and environmental degradation can offset material gains. 人均GDP高并不能确保一国公民享有高质量的生活，因为不平等和环境恶化可能抵消物质收益。
Stringent regulations do not guarantee that corporate misconduct will be eliminated; without robust enforcement mechanisms, they risk becoming little more than symbolic gestures. 严格的法规并不能保证企业不当行为会被杜绝；没有强有力的执行机制，这些法规可能沦为一纸空文。

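05 Mind Map

06 Exercise Answer

Correct answer: (C)

Explanation:

(A) Incorrect. The article never says that training data was inadvertently merged or cross-contaminated. It states that the habit spread because behavior rewarded during reinforcement learning was carried forward by later training.
(B) Incorrect. The article does not say that user-generated content statistically skewed the model’s output patterns. The goblin preference spread internally, through the reinforcement learning mechanism during training.
(C) Correct. The sixth paragraph of the article describes this mechanism directly: the rewarded behavior escaped its original scope through later training stages.
(D) Incorrect. The article says nothing about engineers mistakenly applying the reward mechanism across personality settings.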


To repost this article on your own public account, please contact the author first.