From Superficial Outputs to Superficial Learning: Risks of Large Language Models in Education
- URL: http://arxiv.org/abs/2509.21972v2
- Date: Mon, 03 Nov 2025 02:14:57 GMT
- Title: From Superficial Outputs to Superficial Learning: Risks of Large Language Models in Education
- Authors: Iris Delikoura, Yi R. Fung, Pan Hui
- Abstract summary: Large Language Models (LLMs) are transforming education by enabling personalization, feedback, and knowledge access, while also raising concerns about risks to students and learning systems. Yet empirical evidence on these risks remains fragmented. This paper presents a systematic review of 70 empirical studies across computer science, education, and psychology.
- Score: 14.320271477960192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are transforming education by enabling personalization, feedback, and knowledge access, while also raising concerns about risks to students and learning systems. Yet empirical evidence on these risks remains fragmented. This paper presents a systematic review of 70 empirical studies across computer science, education, and psychology. Guided by four research questions, we examine: (i) which applications of LLMs in education have been most frequently explored; (ii) how researchers have measured their impact; (iii) which risks stem from such applications; and (iv) what mitigation strategies have been proposed. We find that research on LLMs clusters around three domains: operational effectiveness, personalized applications, and interactive learning tools. Across these, model-level risks include superficial understanding, bias, limited robustness, anthropomorphism, hallucinations, privacy concerns, and knowledge constraints. When learners interact with LLMs, these risks extend to cognitive and behavioural outcomes, including reduced neural activity, over-reliance, diminished independent learning skills, and a loss of student agency. To capture this progression, we propose an LLM-Risk Adapted Learning Model that illustrates how technical risks cascade through interaction and interpretation to shape educational outcomes. As the first synthesis of empirically assessed risks, this review provides a foundation for responsible, human-centred integration of LLMs in education.
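As a reading aid for the LLM-Risk Adapted Learning Model described above, the sketch below renders the cascade (model-level risks, mediated by learner interaction, shaping educational outcomes) as a small Python data structure. The enum members come from the abstract's risk and outcome lists, but the specific risk-to-outcome pairings and the `propagate` helper are our own illustrative assumptions, not the paper's formal model.

```python
# Illustrative sketch (not the paper's formalization): the review's
# LLM-Risk Adapted Learning Model as a three-stage cascade in which
# model-level risks propagate through interaction to learning outcomes.

from dataclasses import dataclass
from enum import Enum, auto

class ModelRisk(Enum):
    SUPERFICIAL_UNDERSTANDING = auto()
    BIAS = auto()
    LIMITED_ROBUSTNESS = auto()
    ANTHROPOMORPHISM = auto()
    HALLUCINATION = auto()
    PRIVACY = auto()
    KNOWLEDGE_CONSTRAINTS = auto()

class LearnerOutcome(Enum):
    REDUCED_NEURAL_ACTIVITY = auto()
    OVER_RELIANCE = auto()
    DIMINISHED_INDEPENDENT_LEARNING = auto()
    LOSS_OF_AGENCY = auto()

@dataclass
class Interaction:
    """A learner's use of an LLM tool; mediates risk propagation."""
    tool: str              # e.g. "personalized tutor", "feedback assistant"
    exposure_hours: float  # crude proxy for interaction intensity

# Hypothetical pairing of technical risks with the outcomes they feed
# (the abstract lists both sets; the mapping below is our assumption).
CASCADE: dict[ModelRisk, list[LearnerOutcome]] = {
    ModelRisk.HALLUCINATION: [LearnerOutcome.OVER_RELIANCE],
    ModelRisk.ANTHROPOMORPHISM: [LearnerOutcome.OVER_RELIANCE,
                                 LearnerOutcome.LOSS_OF_AGENCY],
    ModelRisk.SUPERFICIAL_UNDERSTANDING:
        [LearnerOutcome.DIMINISHED_INDEPENDENT_LEARNING],
}

def propagate(risks: list[ModelRisk], ctx: Interaction) -> set[LearnerOutcome]:
    """Technical risks only become learning risks through interaction."""
    if ctx.exposure_hours == 0:
        return set()  # no interaction, no cascade
    return {o for r in risks for o in CASCADE.get(r, [])}

print(propagate([ModelRisk.HALLUCINATION], Interaction("tutor", 5.0)))
```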
Related papers
- The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation [97.0658685969199]
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation.
arXiv Detail & Related papers (2025-07-08T01:30:46Z)
- Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication.
In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM systems have already achieved human-level or even super-human persuasiveness.
Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z)
- Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play [4.343589149005485]
As Large Language Models (LLMs) become more prevalent, concerns about their safety, ethics, and potential biases have risen. This study innovatively applies the Domain-Specific Risk-Taking (DOSPERT) scale from cognitive science to LLMs. We propose a novel Ethical Decision-Making Risk Attitude Scale (EDRAS) to assess LLMs' ethical risk attitudes in depth.
arXiv Detail & Related papers (2024-10-26T15:55:21Z)
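As an illustration of the questionnaire-style protocol in the entry above, here is a minimal, hypothetical Python sketch that administers Likert-scale risk items to a model and averages its self-ratings per domain. The items, domains, prompt wording, and the `ask_model` callable are assumptions for illustration; they are not the DOSPERT or EDRAS instruments.

```python
# Hypothetical sketch of a DOSPERT-style survey administered to an LLM.
# The items and the ask_model stub are illustrative; the real studies use
# validated instruments (DOSPERT, EDRAS) and actual model APIs.

from statistics import mean
from typing import Callable

# (domain, statement) pairs loosely in the style of DOSPERT items.
ITEMS: list[tuple[str, str]] = [
    ("financial", "Investing 10% of your income in a speculative stock."),
    ("ethical", "Taking credit for a colleague's work."),
    ("health", "Riding a bicycle without a helmet."),
]

PROMPT = (
    "Rate how likely you would be to engage in the following activity "
    "on a scale from 1 (very unlikely) to 5 (very likely). "
    "Answer with a single digit.\nActivity: {statement}"
)

def risk_profile(ask_model: Callable[[str], str]) -> dict[str, float]:
    """Average the model's 1-5 self-ratings within each risk domain."""
    by_domain: dict[str, list[int]] = {}
    for domain, statement in ITEMS:
        reply = ask_model(PROMPT.format(statement=statement))
        digits = [c for c in reply if c in "12345"]
        if digits:  # skip unparseable replies rather than guessing
            by_domain.setdefault(domain, []).append(int(digits[0]))
    return {d: mean(scores) for d, scores in by_domain.items()}

# Usage with a stub standing in for a real chat-completion call:
if __name__ == "__main__":
    print(risk_profile(lambda prompt: "2"))
```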
- When Machine Unlearning Meets Retrieval-Augmented Generation (RAG): Keep Secret or Forget Knowledge? [15.318301783084681]
Large language models (LLMs) can inadvertently learn and retain sensitive information and harmful content during training.
We propose a lightweight unlearning framework based on Retrieval-Augmented Generation (RAG) technology.
We evaluate our framework through extensive experiments on both open-source and closed-source models, including ChatGPT, Gemini, Llama-2-7b-chat-hf, and PaLM 2.
arXiv Detail & Related papers (2024-10-20T03:51:01Z)
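The RAG-based unlearning idea in the entry above can be sketched in a few lines: a retrieval gate drops documents flagged as forgotten before the prompt is built, so a frozen LLM never sees the unlearned content. Everything below (the `UnlearningRetriever` class, its naive keyword scoring, the toy corpus) is our own minimal sketch, not the paper's implementation.

```python
# Minimal sketch of RAG-based "unlearning": instead of retraining the model,
# a retrieval layer withholds documents flagged as forgotten, so generation
# never sees the unlearned knowledge. All names here are illustrative.

from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str

@dataclass
class UnlearningRetriever:
    corpus: list[Document]
    forget_ids: set[str] = field(default_factory=set)

    def forget(self, doc_id: str) -> None:
        """Mark a document as unlearned; no model weights are touched."""
        self.forget_ids.add(doc_id)

    def retrieve(self, query: str, k: int = 3) -> list[Document]:
        """Naive keyword retrieval that silently drops forgotten documents."""
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(d.text.lower().split())), d)
            for d in self.corpus
            if d.doc_id not in self.forget_ids  # the unlearning gate
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [d for score, d in scored[:k] if score > 0]

def build_prompt(query: str, docs: list[Document]) -> str:
    """Prompt the LLM only with retrieval results that survived the gate."""
    context = "\n".join(d.text for d in docs) or "No relevant context found."
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    retriever = UnlearningRetriever(corpus=[
        Document("d1", "Alice's home address is 12 Example Road."),
        Document("d2", "Alice works on retrieval augmented generation."),
    ])
    retriever.forget("d1")  # sensitive record is now invisible to the LLM
    query = "Where does Alice live?"
    print(build_prompt(query, retriever.retrieve(query)))
```

The design point, as we read it, is that "forgetting" happens in a retrieval layer that is cheap to update, rather than in the model weights themselves.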
- Risks, Causes, and Mitigations of Widespread Deployments of Large Language Models (LLMs): A Survey [0.0]
Large Language Models (LLMs) have transformed Natural Language Processing (NLP) with their outstanding abilities in text generation, summarization, and classification.
Their widespread adoption introduces numerous challenges, including issues related to academic integrity, copyright, environmental impacts, and ethical considerations such as data bias, fairness, and privacy.
This paper offers a comprehensive survey of the literature on these subjects, systematically gathered and synthesized from Google Scholar.
arXiv Detail & Related papers (2024-08-01T21:21:18Z)
- Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions: Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
It is commonly assumed that base models, which lack instruction tuning, pose little misuse risk; our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- Rethinking Machine Unlearning for Large Language Models [85.92660644100582]
We explore machine unlearning in the domain of large language models (LLMs). This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities.
arXiv Detail & Related papers (2024-02-13T20:51:58Z)
- Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z)
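As a closing aside on the Bloom-based diagnostic in the entry above: the sketch below tallies a model's accuracy separately per Bloom level, which is one way such an educational diagnostic can separate recall from higher-order skills. The record layout, questions, and exact-match scoring are illustrative assumptions and do not reproduce the MoocRadar schema or the paper's evaluation.

```python
# Hypothetical sketch of a Bloom-taxonomy diagnostic: tally a model's
# accuracy separately for each cognitive level. The record layout is
# illustrative and does not reproduce the actual MoocRadar schema.

from collections import defaultdict

# Each item: (bloom_level, question, gold_answer)
QUESTIONS = [
    ("remember", "In what year was the WWW proposed?", "1989"),
    ("understand", "Why does TCP use acknowledgements?", "reliability"),
    ("apply", "What is 17 mod 5?", "2"),
]

def bloom_accuracy(answer_fn) -> dict[str, float]:
    """Per-level accuracy via exact match against gold answers."""
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for level, question, gold in QUESTIONS:
        totals[level] += 1
        if answer_fn(question).strip().lower() == gold.lower():
            hits[level] += 1
    return {lvl: hits[lvl] / totals[lvl] for lvl in totals}

# Usage with a stub model that only recalls facts, showing how a
# per-level breakdown can separate memorization from deeper skills:
if __name__ == "__main__":
    stub = lambda q: "1989" if "year" in q else "unsure"
    print(bloom_accuracy(stub))
```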