Student Mastery or AI Deception? Analyzing ChatGPT's Assessment
Proficiency and Evaluating Detection Strategies
- URL: http://arxiv.org/abs/2311.16292v1
- Date: Mon, 27 Nov 2023 20:10:13 GMT
- Title: Student Mastery or AI Deception? Analyzing ChatGPT's Assessment
Proficiency and Evaluating Detection Strategies
- Authors: Kevin Wang, Seth Akins, Abdallah Mohammed, Ramon Lawrence
- Abstract summary: Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment.
This work investigates the performance of ChatGPT by evaluating it across three courses.
- Score: 1.633179643849375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative AI systems such as ChatGPT have a disruptive effect on learning
and assessment. Computer science requires practice to develop skills in problem
solving and programming that are traditionally developed using assignments.
Generative AI has the capability of completing these assignments for students
with high accuracy, which dramatically increases the potential for academic
integrity issues and students not achieving desired learning outcomes. This
work investigates the performance of ChatGPT by evaluating it across three
courses (CS1, CS2, databases). ChatGPT completes almost all introductory
assessments perfectly. Existing detection methods, such as MOSS and JPlag
(based on similarity metrics) and GPTZero (AI detection), have mixed success in
identifying AI solutions. Evaluating instructors and teaching assistants using
heuristics to distinguish between student and AI code shows that their
detection is not sufficiently accurate. These observations emphasize the need
for adapting assessments and improved detection methods.
Related papers
- Could ChatGPT get an Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants [175.9723801486487]
We evaluate whether two AI assistants, GPT-3.5 and GPT-4, can adequately answer assessment questions.
GPT-4 answers an average of 65.8% of questions correctly, and can even produce the correct answer across at least one prompting strategy for 85.1% of questions.
Our results call for revising program-level assessment design in higher education in light of advances in generative AI.
arXiv Detail & Related papers (2024-08-07T12:11:49Z)
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- The AI Companion in Education: Analyzing the Pedagogical Potential of ChatGPT in Computer Science and Engineering [1.120999712480549]
This study aims to comprehensively analyze the pedagogical potential of ChatGPT in CSE education.
We employ a systematic approach, creating a diverse range of educational practice problems within the CSE field.
According to our examinations, certain question types, like conceptual knowledge queries, typically do not pose significant challenges to ChatGPT.
arXiv Detail & Related papers (2024-04-23T21:42:30Z)
- GenAI Detection Tools, Adversarial Techniques and Implications for Inclusivity in Higher Education [0.0]
This study investigates the efficacy of six major Generative AI (GenAI) text detectors when confronted with machine-generated content that has been modified.
The results demonstrate that the detectors' already low accuracy rates (39.5%) show major reductions in accuracy (17.4%) when faced with manipulated content.
The accuracy limitations and the potential for false accusations demonstrate that these tools cannot currently be recommended for determining whether violations of academic integrity have occurred.
arXiv Detail & Related papers (2024-03-28T04:57:13Z)
- GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing [74.68232970965595]
Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos.
This paper assesses the application of MLLMs with 5 crucial abilities for affective computing, spanning visual affective tasks and reasoning tasks.
arXiv Detail & Related papers (2024-03-09T13:56:25Z)
- ChatGPT is not a pocket calculator -- Problems of AI-chatbots for teaching Geography [0.11049608786515837]
Using ChatGPT can be fraudulent because it threatens the validity of assessments.
Based on a preliminary survey on ChatGPT's quality in answering questions in Geography and GIScience, we demonstrate that this assumption might be fairly naive.
arXiv Detail & Related papers (2023-07-03T15:35:21Z)
- Perception, performance, and detectability of conversational artificial intelligence across 32 university courses [15.642614735026106]
We compare the performance of ChatGPT against students on 32 university-level courses.
We find that ChatGPT's performance is comparable, if not superior, to that of students in many courses.
We find an emerging consensus among students to use the tool, and among educators to treat this as plagiarism.
arXiv Detail & Related papers (2023-05-07T10:37:51Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- ChatGPT: The End of Online Exam Integrity? [0.0]
This study evaluated the ability of ChatGPT, a recently developed artificial intelligence (AI) agent, to perform high-level cognitive tasks.
It raises concerns about the potential use of ChatGPT as a tool for academic misconduct in online exams.
arXiv Detail & Related papers (2022-12-19T08:15:16Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
- Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data.
We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.