Unreflected Acceptance -- Investigating the Negative Consequences of
ChatGPT-Assisted Problem Solving in Physics Education
- URL: http://arxiv.org/abs/2309.03087v1
- Date: Mon, 21 Aug 2023 16:14:34 GMT
- Title: Unreflected Acceptance -- Investigating the Negative Consequences of
ChatGPT-Assisted Problem Solving in Physics Education
- Authors: Lars Krupp, Steffen Steinert, Maximilian Kiefer-Emmanouilidis, Karina
E. Avila, Paul Lukowicz, Jochen Kuhn, Stefan Küchemann, Jakob Karolus
- Abstract summary: The impact of large language models (LLMs) on sensitive areas of everyday life, such as education, remains unclear.
Our work focuses on higher physics education and examines problem-solving strategies.
- Score: 4.014729339820806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have recently gained popularity. However, the
impact of their general availability through ChatGPT on sensitive areas of
everyday life, such as education, remains unclear. Nevertheless, students and
educators are already experiencing its effects on established educational
methods. Our work focuses on higher physics education and examines
problem-solving strategies. In a study, students with a background in
physics were assigned to solve physics exercises, with one group having access
to an internet search engine (N=12) and the other group being allowed to use
ChatGPT (N=27). We evaluated their performance, strategies, and interaction
with the provided tools. Our results showed that nearly half of the solutions
provided with the support of ChatGPT were mistakenly assumed to be correct by
the students, indicating that they overly trusted ChatGPT even in their field
of expertise. Likewise, in 42% of cases, students used copy & paste to query
ChatGPT -- an approach only used in 4% of search engine queries -- highlighting
the stark differences in interaction behavior between the groups and indicating
limited reflection when using ChatGPT. In our work, we demonstrated a need to
(1) guide students on how to interact with LLMs and (2) raise users' awareness
of potential shortcomings.
Related papers
- The Future of Learning: Large Language Models through the Lens of Students [20.64319102112755]
Students grapple with the dilemma of utilizing ChatGPT's efficiency for learning and information seeking.
Students perceive ChatGPT as being more "human-like" compared to traditional AI.
arXiv Detail & Related papers (2024-07-17T16:40:37Z)
- Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography [6.34494999013996]
Large language models (LLMs) possess the capability to interpret knowledge, answer questions, and consider context.
This research recruited 34 undergraduate students as participants, who were randomly divided into two groups.
The experimental group engaged in dialogic teaching using ChatGPT, while the control group interacted with human teachers.
arXiv Detail & Related papers (2024-03-25T12:23:12Z)
- Exploring the Impact of ChatGPT on Student Interactions in Computer-Supported Collaborative Learning [1.5961625979922607]
This paper takes an initial step in exploring the applicability of ChatGPT in a computer-supported collaborative learning environment.
Using statistical analysis, we validate the shifts in student interactions during an asynchronous group brainstorming session by introducing ChatGPT as an instantaneous question-answering agent.
arXiv Detail & Related papers (2024-03-11T18:18:18Z)
- "It's not like Jarvis, but it's pretty close!" -- Examining ChatGPT's Usage among Undergraduate Students in Computer Science [3.6936132187945923]
Large language models (LLMs) such as ChatGPT and Google Bard have garnered significant attention in the academic community.
This study adopts a student-first approach to comprehensively understand how undergraduate computer science students utilize ChatGPT.
arXiv Detail & Related papers (2023-11-16T08:10:18Z)
- Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer.
We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
- Transformative Effects of ChatGPT on Modern Education: Emerging Era of AI Chatbots [36.760677949631514]
ChatGPT was released to provide coherent and useful replies based on analysis of large volumes of data.
Our preliminary evaluation concludes that ChatGPT performed differently in each subject area, including finance, coding, and maths.
There are clear drawbacks in its use, such as the possibility of producing inaccurate or false data.
Academic regulations and evaluation practices need to be updated, should ChatGPT be used as a tool in education.
arXiv Detail & Related papers (2023-05-25T17:35:57Z)
- ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Compared to previous models, our extensive experimental results demonstrate that ChatGPT performs worse across different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- Consistency Analysis of ChatGPT [65.268245109828]
This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour.
Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions.
arXiv Detail & Related papers (2023-03-11T01:19:01Z)
- Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT [103.57103957631067]
ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.
We evaluate ChatGPT's understanding ability on the popular GLUE benchmark and compare it with 4 representative fine-tuned BERT-style models.
We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves performance comparable to BERT on sentiment analysis and question answering tasks.
arXiv Detail & Related papers (2023-02-19T12:29:33Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content and is not responsible for any consequences arising from its use.