Are Students Learning with GenAI, or Just Completing Tasks? 192 Reflections Reveal the Key to Better AI-Supported Learning
A Studies in Higher Education study shows that GenAI improves learning not simply when students use it, but when they critically evaluate, revise, and extend AI-generated outputs.
The most important question in AI education may not be whether students use AI. It may be whether AI becomes the end of thinking or the beginning of thinking.
A student opens ChatGPT in the first week of a university course. The task is simple: ask GenAI to define a key course concept.
Twelve weeks later, the teacher does not ask the student to submit the AI answer. Instead, the student must write a new definition, compare it with the original AI version, and reflect on how their understanding has changed.
This is a much more interesting design than a simple “ban AI or allow AI” policy. It treats GenAI as part of the learning environment and asks a sharper question: what kind of AI use actually leads to better learning?
The answer from Pallant, Blijlevens, Campbell, and Jopp is clear: students learn more when they use GenAI to construct and augment knowledge. They learn less when they use it procedurally or simply reproduce its output.
1. What did this study investigate?
The paper, published in Studies in Higher Education, is titled “Mastering knowledge: the impact of generative AI on student learning outcomes.” It examines how GenAI influences student learning outcomes in higher education, rather than only discussing academic integrity or institutional policy.
The researchers embedded GenAI into real course tasks. In week 1, students used ChatGPT-3 to generate a definition of a course concept. After a 12-week semester, students wrote their own definition and compared it with the AI-generated version.
The final dataset included 192 student reflections, approximately 75,000 words in total. The authors used quantitative content analysis (QCA) to code how students engaged with GenAI and then examined how these patterns related to grades and learning capabilities.
Research element | What the paper did |
Sample | 192 student reflections from three marketing-related university units |
GenAI task | Use ChatGPT-3 in week 1 to generate a definition of a course concept |
Learning task | After 12 weeks, write an original definition and compare it with the AI version |
Method | Quantitative content analysis, t-tests, and logistic regressions |
Main lens | Achievement Goals Framework, Zone of Proximal Development, Bloom and Fink taxonomies |
This is not a paper about whether AI can write assignments. It is a paper about what happens when students use AI inside a designed learning process.
2. The key distinction: mastery use versus performance use
The authors use the Achievement Goals Framework to interpret student engagement with GenAI. A mastery goal focuses on developing competence and understanding. A performance goal focuses on demonstrating competence, completing the task, or outperforming others.
In the GenAI context, this distinction becomes highly practical. A mastery-oriented student may treat AI as a draft to critique, extend, and rebuild. A performance-oriented student may treat AI as a ready-made answer to reproduce.
This is why “AI use” is too crude as a research variable. The same tool can support deep learning or shallow reproduction depending on the student’s goal structure and the assessment design around it.
AI engagement | What students do | Learning orientation |
Constructing knowledge | Students actively build their own understanding from AI output | Mastery |
Augmenting knowledge | Students add concepts, examples, critique, and disciplinary knowledge | Mastery |
Procedural use | Students follow steps but show little learning beyond the required procedure | Performance |
Regurgitating knowledge | Students reproduce AI information without sufficient understanding or critique | Performance |
3. Finding one: students who build with AI score higher
The most direct finding is about grades. Students who took a constructive approach to GenAI had higher overall marks and assignment marks than students who did not.
For the constructive approach group, the average overall mark was 76, compared with 69.8 for the non-constructive group. The assignment mark showed a similar pattern: 74.3 versus 68.8.
The opposite pattern appeared for procedural use. Students who used GenAI procedurally had lower overall and assignment marks than students who did not show this procedural pattern.
Approach | Overall mark | Assignment mark | Plain-language meaning |
Constructive vs. non-constructive | 76 vs. 69.8 | 74.3 vs. 68.8 | Building on AI was associated with higher marks. |
Procedural vs. non-procedural | 69.8 vs. 76 | 68.8 vs. 74.4 | Using AI only as a procedure was associated with lower marks. |
Augmentative vs. non-augmentative | 74.4 vs. 71 | No significant difference reported | Extending AI output was associated with higher overall marks. |
Regurgitative vs. non-regurgitative | 71 vs. 74.6 | No significant difference reported | Repeating AI output was associated with lower overall marks. |
The result does not say “AI use is good” or “AI use is bad.” It says that the quality of cognitive engagement around AI matters.
4. Finding two: the real difference lies in higher-order learning
The more important contribution of the paper is not only the grade difference. The authors also examined learning capabilities, including applied knowledge, learning autonomy, and critical thinking.
Students who used GenAI to augment knowledge were much more likely to demonstrate all three capabilities. The difference was especially large for critical thinking: 0.943 in the augmented group compared with 0.314 in the non-augmented group, with an odds ratio of 35.782.
In contrast, regurgitating AI-generated knowledge was negatively associated with applied knowledge, learning autonomy, and critical thinking. In simple terms, repeating AI output is not the same as learning from AI.
Learning capability | Augmented | Non-augmented | Odds ratio | p-value |
Applied knowledge | 0.966 | 0.667 | 14.000 | <0.01 |
Learning autonomy | 0.966 | 0.619 | 17.231 | <0.01 |
Critical thinking | 0.943 | 0.314 | 35.782 | <0.001 |
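To see how the group values and the odds ratios in this table relate, the arithmetic can be sketched in a few lines. This is only a rough check under two assumptions that the table itself does not state: that the group values are proportions of students who demonstrated each capability, and that the odds ratio is the simple ratio of the two groups' odds. The helper names `odds` and `odds_ratio` are illustrative, and the paper's logistic regressions may have been specified differently.

```python
def odds(p: float) -> float:
    """Convert a proportion p (strictly between 0 and 1) into odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Ratio of the odds in one group to the odds in a reference group."""
    return odds(p_group) / odds(p_reference)


# Rounded values copied from the table: (augmented, non-augmented, reported odds ratio).
# Assumption: the first two numbers are proportions of students showing the capability.
reported = {
    "applied knowledge": (0.966, 0.667, 14.000),
    "learning autonomy": (0.966, 0.619, 17.231),
    "critical thinking": (0.943, 0.314, 35.782),
}

for capability, (p_aug, p_non, reported_or) in reported.items():
    recomputed = odds_ratio(p_aug, p_non)
    print(f"{capability}: recomputed {recomputed:.1f}, reported {reported_or}")
```

Because the inputs are rounded to three decimals, the recomputed values land close to the reported odds ratios rather than exactly on them. The point of the exercise is that a gap of 0.943 versus 0.314 looks moderate as a proportion but becomes very large once expressed as odds.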
This point is important for educational technology research. If we only measure whether students used AI, or whether their immediate task performance improved, we may miss the real learning mechanism.
The learning mechanism is not the tool itself. It is the way students evaluate, contextualize, critique, and reconstruct the output generated by the tool.
5. Why does this happen? GenAI changes the Zone of Proximal Development
The paper also interprets GenAI through Vygotsky’s Zone of Proximal Development. GenAI can act like a more knowledgeable source that helps students move beyond what they can do alone.
But scaffolding only works when the learner remains cognitively active. If GenAI simply fills in the answer and the student accepts it at face value, the tool may keep the student inside their existing zone of competence.
This also connects to Bloom’s revised taxonomy. Students using AI constructively appeared to move toward application, analysis, evaluation, and possibly creation. Students who only regurgitated AI output remained closer to remembering and understanding.
GenAI can be a scaffold. But a scaffold is not a lift. It supports climbing; it should not replace climbing.
6. Implications for assessment design: do not just regulate AI use; design AI engagement
For teachers and curriculum designers, the practical implication is clear. A policy that says “AI is allowed” or “AI is not allowed” is not enough. The assessment must shape the kind of AI engagement students are expected to demonstrate.
If an assignment only asks for a final answer, GenAI naturally becomes an answer machine. If an assignment asks students to preserve the AI draft, critique it, improve it, and explain their reasoning, GenAI becomes part of the learning process.
A better GenAI assessment does not ask students to hide AI. It asks students to make their thinking visible around AI.
Design step | Student action | Learning purpose |
Generate | Keep the original AI output. | Make the starting point visible. |
Critique | Identify what is accurate, missing, shallow, or questionable. | Move beyond passive acceptance. |
Reconstruct | Use course concepts, cases, and evidence to rewrite the answer. | Turn AI output into disciplinary understanding. |
Reflect | Explain how their understanding changed over time. | Connect AI use with metacognition. |
7. Implications for AI education research
For researchers, this paper is a useful reminder that “AI use” should not be treated as a simple yes-or-no variable. Future studies need to code the quality of interaction between students and AI.
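One way to picture what coding the quality of interaction could look like in practice is a small record per reflection, with one flag for each engagement pattern and each learning capability. The sketch below is a hypothetical illustration, not the authors' coding instrument: the class `CodedReflection`, its field names, and the helper `proportion_by_group` are all assumptions, and the four example records are invented toy data rather than study data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class CodedReflection:
    """One student reflection after human coding (hypothetical field names)."""
    constructive: bool
    augmentative: bool
    procedural: bool
    regurgitative: bool
    applied_knowledge: bool
    learning_autonomy: bool
    critical_thinking: bool


def proportion_by_group(
    reflections: Iterable[CodedReflection],
    engagement: str,
    capability: str,
) -> Dict[bool, Optional[float]]:
    """Share of reflections showing `capability`, split by the `engagement` flag.

    Returns None for a group that contains no reflections.
    """
    counts = {True: [0, 0], False: [0, 0]}  # group -> [with capability, total]
    for reflection in reflections:
        group = getattr(reflection, engagement)
        counts[group][1] += 1
        if getattr(reflection, capability):
            counts[group][0] += 1
    return {
        group: (hits / total if total else None)
        for group, (hits, total) in counts.items()
    }


# Invented toy records, for illustration only (not data from the study).
toy = [
    CodedReflection(True, True, False, False, True, True, True),
    CodedReflection(True, True, False, False, True, True, False),
    CodedReflection(False, False, True, False, True, False, False),
    CodedReflection(False, False, True, True, False, False, False),
]

shares = proportion_by_group(toy, "augmentative", "critical_thinking")
print(shares)
```

Keeping each engagement pattern as its own flag, rather than one mutually exclusive category, is a design choice made here for simplicity; whether a single reflection can carry several codes depends on the coding scheme a study actually adopts.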
The paper also reminds us to distinguish between task completion and learning. A student may complete a task quickly with AI, but the deeper question is whether they can demonstrate applied knowledge, autonomy, and critical thinking afterward.
Methodologically, the study is not a randomized controlled trial, and the authors are careful about this limitation. Goal structures were inferred through coded reflections rather than directly manipulated. The sample also came from marketing units, which means the findings should be extended cautiously to other disciplines.
Still, the study offers a strong conceptual and methodological direction: measure not only whether students use GenAI, but how they transform GenAI output into their own knowledge.
8. Q&A: what exactly is procedural use of GenAI?
Q1. What does “procedural use of GenAI” mean?
A more reader-friendly label than “procedural use” may be “using AI to complete the required steps.” The point is not that the student did not use AI, nor that the student necessarily cheated.
The point is that the student may have followed every instruction correctly, but the response does not show learning beyond the required procedure.
For example, a student may ask ChatGPT for a definition, compare it with their own definition, and submit a reflection. However, if the reflection only says that the AI answer is “mostly correct” and does not explain what is missing, why it is shallow, or how course concepts changed their understanding, then AI has helped them complete the task, not deepen their knowledge.
Use pattern | What it looks like | Learning signal |
Procedural use | The student follows the required steps: generates an AI definition, compares it with their own answer, and writes a reflection. | The task is completed, but the response shows little learning beyond the procedure. |
Constructive or augmentative use | The student identifies what AI got right, what it missed, and uses course concepts, examples, or evidence to rebuild the answer. | The student transforms AI output into disciplinary understanding. |
Q2. Is this how the original paper explains it?
Yes, the overall interpretation is consistent with the paper. The abstract states that lower-level learning outcomes resulted when students used GenAI procedurally without augmenting knowledge.
The appendix is even more precise. It defines a procedural or performative approach as a case where instructions for GenAI use are followed strictly and without errors, but the student does not display learning beyond what is exactly required.
So the explanation above is faithful to the paper, but the simplified example is not a verbatim student quote. It is a teaching example created to make the coding logic easier to understand.
A rigorous way to write this in the article is: “In plain language, procedural use means completing the AI-related steps correctly without transforming AI output into one’s own understanding.”
9. So what?
For AI education, the most useful takeaway from this paper is not a single number. It is a shift in the question we ask.
Do not only ask: Did the student use AI? Ask: Did the student question AI? Did they add disciplinary knowledge? Did they identify what AI missed? Did they rebuild the output into a better explanation?
If the answer is no, GenAI may simply help students stay in shallow learning more efficiently. If the answer is yes, GenAI may become a genuine scaffold for mastery.
The future of AI education should not be built around the fantasy that students will stop using AI. It should be built around better tasks, better rubrics, and better evidence of thinking.
Discussion question: When you design an AI-related assignment, do you care more about whether students used AI, or how they transformed AI output?
If this article helped you see GenAI learning from a more evidence-based angle, you are welcome to follow this account for more paper-based stories about AI education, learning science, and educational technology.
Reference
Pallant, J. L., Blijlevens, J., Campbell, A., & Jopp, R. (2026). Mastering knowledge: The impact of generative AI on student learning outcomes. Studies in Higher Education, 51(4), 714–735. https://doi.org/10.1080/03075079.2025.2487570