本文总结了经常出现在各类文章和新闻中的 AI 专业术语。除了掌握它们的中英文对照说法,还需要了解每个术语本身的内涵;否则把它们放到篇章或听力材料中,就会影响你对全文的理解。
Artificial intelligence (AI)
人工智能(AI)
Computer systems that perform tasks that historically required human judgment, such as recognizing patterns, making predictions, or generating content.
指能够执行传统上需要人类判断的任务的计算机系统,例如模式识别、预测分析或内容生成。
AI model
人工智能模型
A mathematical system trained on data that can produce outputs (predictions or generated content) from new inputs.
依托数据训练形成的数学系统,可基于全新输入信息生成输出结果,包括预测结论与创作内容。
Algorithm
算法
A step-by-step computational method. In AI, algorithms are used to train models and to produce outputs from inputs.
一套循序渐进的计算执行方法。在人工智能领域,算法用于模型训练,并依托输入数据生成对应输出。
Anomaly detection
异常检测
Methods that flag unusual patterns in data (e.g., unusual network traffic or shipping patterns) that may merit investigation.
用于识别数据中异常特征的技术手段,如异常网络流量、非常规运输轨迹等,标记潜在风险并提示核查。
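下面用一段极简的 Python 草图示意这一思路:以 z 分数(偏离均值多少个标准差)标记明显异常的数值。函数名与阈值均为本文的示意性假设,真实系统会采用更稳健的统计或机器学习方法。

```python
import statistics

def flag_anomalies(values, threshold=2.0):
    """返回偏离均值超过 threshold 个标准差的异常数据点(概念示意)。"""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # 所有数据相同,无异常可言
    return [v for v in values if abs(v - mean) / stdev > threshold]

# 正常流量在 100 上下波动,950 是明显的异常尖峰
traffic = [98, 102, 101, 99, 100, 97, 103, 950]
print(flag_anomalies(traffic))  # [950]
```

被标记的数值并不一定有问题,只是"值得进一步核查",这正是术语定义中 may merit investigation 的含义。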
Bias
偏见 / 偏差
Systematic errors or differences in performance across groups or contexts, often reflecting patterns and/or gaps in training data or design choices.
模型在不同群体、不同场景下出现的系统性误差或表现差异,通常源于训练数据缺陷、数据缺口或设计局限。
Black box
黑盒
A system whose internal reasoning is difficult to inspect or explain, even if its inputs and outputs are visible.
输入与输出可见,但内部运算逻辑、决策推理过程难以核查与解释的系统。
Classification
分类任务
A prediction task where the model assigns an input to a category (e.g., spam/not spam; high risk/low risk; or dog/cat/sheep/snake).
一种预测类任务,模型将输入内容自动划分至指定类别,如垃圾邮件 / 正常邮件、高风险 / 低风险、物种分类等。
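"把输入划入某个类别"的过程可以用一个最近邻式的小例子示意(纯 Python 草图,函数与样本均为假设;真实模型远比词汇重叠复杂):

```python
def similarity(a, b):
    """两段文本的 Jaccard 词汇重叠度。"""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def classify(text, labeled_examples):
    """最近邻分类:返回与输入最相似的已标注样本所属的类别。"""
    best_text, best_label = max(labeled_examples, key=lambda ex: similarity(text, ex[0]))
    return best_label

examples = [
    ("win free money now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting rescheduled to monday", "not spam"),
    ("please review the attached report", "not spam"),
]
print(classify("free money prize", examples))  # spam
```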
Context
上下文语境
All information provided to a language model for a specific response. This usually includes the user message plus additional instructions, conversation history, or retrieved documents.
大模型生成单次回答所依托的全部信息,包含用户提问、附加指令、对话历史及检索参考文档。
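上下文的"拼装"过程可以用如下通用草图示意(消息结构与字段名为示意性假设,不对应任何特定厂商的 API):

```python
def build_context(user_message, system_instructions, history, retrieved_docs):
    """把系统指令、对话历史、检索文档与用户消息拼成模型可见的完整上下文。"""
    messages = [{"role": "system", "content": system_instructions}]
    messages += history  # 对话历史
    for doc in retrieved_docs:
        messages.append({"role": "system", "content": f"参考资料:{doc}"})
    messages.append({"role": "user", "content": user_message})  # 当前提问放在最后
    return messages

ctx = build_context(
    user_message="什么是词元?",
    system_instructions="你是一名术语讲解助手。",
    history=[{"role": "user", "content": "你好"}, {"role": "assistant", "content": "你好!"}],
    retrieved_docs=["词元是语言模型处理文本的基本单位。"],
)
print(len(ctx))  # 5
```

可以看到,模型实际"看到"的远不止用户当前这一句话。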
Deep learning
深度学习
A machine learning approach that learns complex patterns directly from large amounts of raw data by adjusting many internal parameters (“connections”).
机器学习的重要分支,通过调整大量内部参数("连接"),直接从海量原始数据中自主学习复杂规律与特征。
Deepfake
深度伪造
Synthetic or manipulated media (image, audio, or video) designed to look authentic, often used for deception or influence.
经过合成、篡改的图像、音频、视频等多媒体内容,仿真度极高,常被用于虚假误导与舆论操纵。
Developer instructions
开发者指令
Rules provided by the organization building the system that shape tone, formatting, and policy constraints (e.g., how to handle sensitive content).
系统研发方设定的规则规范,用于限定输出语气、内容格式及合规要求,明确敏感内容处理准则。
Fine-tuning (supervised)
(有监督)微调
Additional training on labeled examples to make a general model more reliable for specific tasks.
利用带标注样本对通用基础模型开展二次训练,提升模型在特定业务场景下的精准度与适配性。
Guardrails
安全防护机制
Controls applied before or after model generation to reduce harmful outputs or misuse (e.g., blocking certain inputs, redacting sensitive strings, enforcing logging).
在模型生成前后设置的管控措施,用于规避有害内容、防范滥用,包含输入拦截、敏感信息脱敏、操作留痕等。
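"生成前拦截输入、生成后对敏感字符串脱敏"这两类防护可以草绘如下(拦截规则与正则表达式均为示意性假设):

```python
import re

BLOCKED_PATTERNS = [r"(?i)how to make a bomb"]       # 示意用的输入拦截规则
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")   # 以美国社保号格式示意"敏感字符串"

def check_input(text):
    """生成前的防护:命中拦截规则则拒绝请求。"""
    for pat in BLOCKED_PATTERNS:
        if re.search(pat, text):
            return False
    return True

def redact_output(text):
    """生成后的防护:对敏感字符串脱敏。"""
    return SSN_PATTERN.sub("[REDACTED]", text)

print(check_input("how to bake bread"))              # True
print(redact_output("SSN is 123-45-6789, call me"))  # SSN is [REDACTED], call me
```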
Hallucination
模型幻觉
When a language model produces plausible-sounding but factually incorrect or fabricated content.
指大模型生成的内容看似通顺自然、合乎逻辑,实则包含事实错误或凭空编造信息的现象。
Independent learning
自主学习
In this primer’s framing, systems (typically deep learning) that learn patterns directly from data rather than relying on explicit rules or handpicked variables.
本术语定义中指无需固定规则、人工筛选变量,直接依托数据自主挖掘规律的技术体系,以深度学习为典型代表。
Instruction tuning
指令微调
A training step where a language model learns to follow instructions by training on many examples of tasks and correct responses.
大模型专项训练环节,通过海量任务示例与标准回答样本,让模型学会精准理解并服从人类指令。
Large language model (LLM)
大语言模型(LLM)
A deep learning model trained on large text corpora to generate and interpret text, typically by predicting the next token.
基于海量文本语料训练的深度学习模型,核心通过预测下一个文本单元,实现文本理解与内容生成。
Machine learning (ML)
机器学习(ML)
A subset of AI where models learn patterns from data rather than being explicitly programmed with fixed rules.
人工智能的核心分支,模型无需硬性固定编程规则,可自主从数据中总结规律、迭代优化。
Model drift
模型漂移
Performance changes over time because real-world data, behavior, or conditions shift away from what the model was trained on.
受现实数据、用户行为、环境条件变化影响,实际场景与训练样本产生偏差,导致模型性能逐步下降。
Natural language processing (NLP)
自然语言处理(NLP)
Methods that enable computers to work with human language (classification, extraction, summarization, translation, generation).
使计算机读懂、处理人类自然语言的技术集合,涵盖文本分类、信息提取、摘要、翻译、内容生成等能力。
Next-token prediction
下一词元预测
The core training objective for many language models: predict the most likely next token given preceding text.
主流大语言模型的核心训练目标:基于前文上下文,概率预测最有可能出现的下一个文本单元。
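用一个极简的二元组(bigram)统计模型即可示意"预测下一个词元"这一训练目标(真实大模型基于神经网络输出概率分布,此处仅保留核心思想):

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """统计每个词元之后各候选词元出现的次数。"""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for cur, nxt in zip(tokens, tokens[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, token):
    """返回给定词元之后出现次数最多(即概率最高)的下一个词元。"""
    return counts[token].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ran"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # cat
```

语料中 "the" 后面出现过 "cat"(2 次)和 "mat"(1 次),于是模型预测 "cat"。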
Output constraints
输出约束
Formatting or policy requirements placed on model outputs (e.g., “respond in bullet points,” “include citations,” “use this template”).
对模型输出内容的格式规范与合规限制,例如要点式回答、强制标注引用、固定模板输出等。
Pattern matching (classical machine learning)
模式匹配(传统机器学习)
In this primer’s framing, statistical models that learn relationships between selected input variables and outcomes to make predictions.
本术语定义中指传统统计模型,通过分析人工筛选的输入变量与结果之间的关联规律,完成预测任务。
Prediction
预测
An estimate produced by a model about an unknown label or future outcome (e.g., risk score, likely delay, likely category).
模型针对未知结果、未来态势给出的评估判断,如风险评分、延误预判、类别判定等。
Preference training/RLHF
偏好训练 / 基于人类反馈的强化学习(RLHF)
A training approach where the model learns to produce answers preferred by human evaluators, often using reinforcement learning methods.
依托人工评价偏好,结合强化学习技术,引导模型输出更贴合人类价值观与使用习惯的回答。
Prompt
提示词 / 指令
The user’s input text to the system (often combined with additional context and instructions before reaching the model).
用户向 AI 系统输入的文本内容,通常会结合上下文与附加指令,共同作为模型生成回答的依据。
Reinforcement learning (RL)
强化学习(RL)
A learning approach where a system learns strategies by trial and error using feedback (“rewards”) from an environment or evaluator.
一种学习机制,系统通过不断试错,结合环境与评价方给出的奖励反馈,持续优化决策策略。
Retrieval-Augmented Generation (RAG)
检索增强生成(RAG)
A system design where the model is provided with retrieved reference material (documents, databases, web sources) to ground its answers.
一种 AI 架构设计,通过实时检索文档、数据库、网络资料等外部参考信息,保障回答内容真实有据。
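RAG 的核心流程——先检索、再把检索结果拼进提示词——可以草绘如下(以词汇重叠充当检索打分,真实系统通常用向量检索;函数名均为示意):

```python
def retrieve(query, documents, top_k=1):
    """按词汇重叠度检索与问题最相关的参考文档。"""
    qwords = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(qwords & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query, documents):
    """把检索结果拼入提示词,让模型"有据可依"地作答。"""
    context = "\n".join(retrieve(query, documents))
    return f"参考资料:\n{context}\n\n问题:{query}"

docs = [
    "RAG combines retrieval with generation.",
    "Tokens are units of text.",
]
print(build_prompt("what is retrieval augmented generation", docs))
```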
Rule-based / symbolic system
基于规则 / 符号化系统
Software that follows explicit human-written rules (e.g., “if X, then do Y”), often easier to audit but less flexible with ambiguity.
完全遵循人工编写固定逻辑规则运行的程序,可追溯、易审计,但面对模糊场景与复杂问题时灵活性不足。
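一个典型的"if X 则 Y"规则系统如下(规则内容为示意):每条规则都可逐条核查,但没有命中规则的输入只能落入兜底分支,这正是它"易审计、欠灵活"的原因。

```python
def route_ticket(ticket):
    """按人工编写的显式规则给客服工单分流。"""
    text = ticket.lower()
    if "refund" in text:
        return "billing"            # 规则一:提到退款 → 账务组
    if "password" in text:
        return "account-security"   # 规则二:提到密码 → 账号安全组
    return "general"                # 兜底:其余一律转通用队列

print(route_ticket("I want a refund"))  # billing
```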
Supervised learning
监督学习
Training a model on labeled examples where the correct answer is known (e.g., approved/rejected; high risk/low risk).
利用已知标准答案的标注样本训练模型,如结果通过 / 驳回、风险等级划分等分类训练场景。
Token
词元 / 令牌
A unit of text used by language models (may be a whole word, part of a word, punctuation, or whitespace).
大模型处理文本的最小单位,可以是完整单词、词根片段、标点符号或空格。
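一个极简的切分示例可以直观展示"词元未必等于单词"(真实大模型使用 BPE 等子词算法,此处仅用正则做示意):

```python
import re

def naive_tokenize(text):
    """按单词与标点粗略切分文本(示意;并非真实模型的分词器)。"""
    return re.findall(r"\w+|[^\w\s]", text)

print(naive_tokenize("AI isn't magic."))  # ['AI', 'isn', "'", 't', 'magic', '.']
```

可以看到,一个单词(isn't)被切成了多个词元,标点也单独成一个词元。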
Tool use
工具调用
A system setup where a language model can call external tools (search, databases, calculators) and incorporate the results into its response.
大模型的拓展能力,可自主调用搜索引擎、数据库、计算器等外部工具,并整合工具结果完善回答。
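"模型发出工具调用、系统执行工具并把结果回填"的流程可以草绘如下(工具注册表与函数名均为示意性假设):

```python
def calculator(expression):
    """示意用的计算器工具(真实系统应避免 eval,此处仅放行安全字符)。"""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}  # 可供模型调用的工具注册表

def run_tool_call(tool_name, argument):
    """系统执行模型请求的工具,并把结果拼成回填给模型的文本。"""
    result = TOOLS[tool_name](argument)
    return f"工具 {tool_name} 返回:{result}"

print(run_tool_call("calculator", "17 * 24"))  # 工具 calculator 返回:408
```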
Transparency
透明度
Operational clarity about what an AI system does, what data it uses, its limitations, and who is accountable for decisions influenced by it.
AI 系统在运行逻辑、数据来源、能力边界、决策责任划分等方面的公开性与可解释性。
Unsupervised learning
无监督学习
Training or analysis on unlabeled data to find structure (e.g., clustering themes in documents, detecting unusual patterns).
无需标注数据,通过自主分析挖掘数据内在结构与关联,典型应用包括文本主题聚类、异常特征识别。
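"无标签也能发现结构"可以用一段一维 k-means(k=2)的小草图示意:算法只看数据本身,就把数值分成了两簇(教学简化版,假设数据确实可分成两簇):

```python
def kmeans_1d(values, iterations=10):
    """极简一维 k-means(k=2):无需任何标签,自动把数据分成两簇。"""
    c1, c2 = min(values), max(values)  # 用最小、最大值初始化两个簇中心
    for _ in range(iterations):
        group1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        group2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1 = sum(group1) / len(group1)  # 把簇中心移到各簇均值处
        c2 = sum(group2) / len(group2)
    return sorted(group1), sorted(group2)

print(kmeans_1d([1, 2, 2, 3, 10, 11, 12]))  # ([1, 2, 2, 3], [10, 11, 12])
```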
Validation (human review)
人工审核核验
A workflow where a person checks AI outputs before they are relied upon, especially in high-stakes contexts.
关键业务场景下的风控流程,在采用 AI 生成内容前,由人工逐项审核校验,规避决策风险。
夜雨聆风