Human-AI Collaboration or Academic Misconduct? Measuring AI Use in Student Writing Through Stylometric Evidence
- URL: http://arxiv.org/abs/2505.08828v1
- Date: Tue, 13 May 2025 00:36:36 GMT
- Title: Human-AI Collaboration or Academic Misconduct? Measuring AI Use in Student Writing Through Stylometric Evidence
- Authors: Eduardo Araujo Oliveira, Madhavi Mohoni, Sonsoles López-Pernas, Mohammed Saqr
- Abstract summary: This research investigates the use of authorship verification (AV) techniques to quantify AI assistance in academic writing. We use three datasets, including a public dataset (PAN-14) and two from University of Melbourne students from various courses. We develop an adapted Feature Vector Difference AV methodology to construct robust academic writing profiles for students.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As human-AI collaboration becomes increasingly prevalent in educational contexts, understanding and measuring the extent and nature of such interactions pose significant challenges. This research investigates the use of authorship verification (AV) techniques not as a punitive measure, but as a means to quantify AI assistance in academic writing, with a focus on promoting transparency, interpretability, and student development. Building on prior work, we structured our investigation into three stages: dataset selection and expansion, AV method development, and systematic evaluation. Using three datasets - including a public dataset (PAN-14) and two from University of Melbourne students from various courses - we expanded the data to include LLM-generated texts, totalling 1,889 documents and 540 authorship problems from 506 students. We developed an adapted Feature Vector Difference AV methodology to construct robust academic writing profiles for students, designed to capture meaningful, individual characteristics of their writing. The method's effectiveness was evaluated across multiple scenarios, including distinguishing between student-authored and LLM-generated texts and testing resilience against LLMs' attempts to mimic student writing styles. Results demonstrate the enhanced AV classifier's ability to identify stylometric discrepancies and measure human-AI collaboration at word and sentence levels while providing educators with a transparent tool to support academic integrity investigations. This work advances AV technology, offering actionable insights into the dynamics of academic writing in an AI-driven era.
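The Feature Vector Difference approach described in the abstract can be sketched roughly as follows: map each document to a vector of stylometric measurements, average the vectors of a student's known writing into a profile, and accept or reject a questioned document based on its per-feature distance from that profile. The specific features and threshold below are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of feature-vector-difference authorship verification.
# Features and threshold are illustrative; the paper's adapted method
# uses a richer stylometric feature set.
import re
import statistics


def stylometric_features(text):
    """Map a document to a small vector of stylometric measurements."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return [0.0, 0.0, 0.0]
    avg_word_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    avg_sentence_len = len(words) / len(sentences)
    return [avg_word_len, type_token_ratio, avg_sentence_len]


def build_profile(known_docs):
    """Average the feature vectors of documents known to be by the student."""
    vectors = [stylometric_features(d) for d in known_docs]
    return [statistics.mean(col) for col in zip(*vectors)]


def same_author(profile, questioned_doc, threshold=1.5):
    """Accept the questioned document when the summed absolute feature
    difference from the profile stays under the threshold."""
    diff = [abs(p - q) for p, q in
            zip(profile, stylometric_features(questioned_doc))]
    return sum(diff) < threshold
```

In this framing, a document drafted largely by an LLM would show feature differences (e.g., in vocabulary richness or sentence length) that push the distance above the threshold, which is what lets the classifier flag stylometric discrepancies at the document level.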
Related papers
- Do Students Write Better Post-AI Support? Effects of Generative AI Literacy and Chatbot Interaction Strategies on Multimodal Academic Writing [0.0]
Academic writing increasingly involves multimodal tasks requiring students to integrate visual information and textual arguments. While generative AI (GenAI) tools, like ChatGPT, offer new pathways for supporting academic writing, little is known about how students' GenAI literacy influences their independent multimodal writing skills. This study examined 79 higher education students' multimodal academic writing performance using a comparative research design.
arXiv Detail & Related papers (2025-07-06T14:01:06Z) - AI in the Writing Process: How Purposeful AI Support Fosters Student Writing [0.3641292357963815]
The ubiquity of technologies like ChatGPT has raised concerns about their impact on student writing. This paper investigates how different AI support approaches affect writers' sense of agency and depth of knowledge transformation.
arXiv Detail & Related papers (2025-06-25T16:34:09Z) - STRICTA: Structured Reasoning in Critical Text Assessment for Peer Review and Beyond [68.47402386668846]
We introduce Structured Reasoning In Critical Text Assessment (STRICTA) to model text assessment as an explicit, step-wise reasoning process. STRICTA breaks down the assessment into a graph of interconnected reasoning steps drawing on causality theory. We apply STRICTA to a dataset of over 4000 reasoning steps from roughly 40 biomedical experts on more than 20 papers.
arXiv Detail & Related papers (2024-09-09T06:55:37Z) - Systematic Task Exploration with LLMs: A Study in Citation Text Generation [63.50597360948099]
Large language models (LLMs) bring unprecedented flexibility in defining and executing complex, creative natural language generation (NLG) tasks.
We propose a three-component research framework that consists of systematic input manipulation, reference data, and output measurement.
We use this framework to explore citation text generation -- a popular scholarly NLP task that lacks consensus on the task definition and evaluation metric.
arXiv Detail & Related papers (2024-07-04T16:41:08Z) - Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts [49.97673761305336]
We evaluate three large language models (LLMs) for their alignment with human narrative styles and potential gender biases.
Our findings indicate that, while these models generally produce text closely resembling human authored content, variations in stylistic features suggest significant gender biases.
arXiv Detail & Related papers (2024-06-27T19:26:11Z) - Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods [13.14749943120523]
Knowing whether a text was produced by human or artificial intelligence (AI) is important to determining its trustworthiness. State-of-the-art approaches to AIGT detection include watermarking, statistical and stylistic analysis, and machine learning classification. We aim to provide insight into the salient factors that combine to determine how "detectable" AIGT text is under different scenarios.
arXiv Detail & Related papers (2024-06-21T18:31:49Z) - A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research.
Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z) - Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature, including the general methodology of supervised fine-tuning (SFT). We also review the potential pitfalls of SFT and the criticism against it, along with efforts pointing out current deficiencies of existing strategies.
arXiv Detail & Related papers (2023-08-21T15:35:16Z) - Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education [10.606131520965604]
This study investigates AI content detection in a rarely explored yet realistic setting.
We first formalized the detection task as identifying the transition points between human-written content and AI-generated content.
We then proposed a two-step approach where we separated AI-generated content from human-written content during the encoder training process.
arXiv Detail & Related papers (2023-07-23T08:47:51Z) - LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark [81.42376626294812]
We present Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.
Our aim is to establish LAMM as a growing ecosystem for training and evaluating MLLMs.
We present a comprehensive dataset and benchmark, which cover a wide range of vision tasks for 2D and 3D vision.
arXiv Detail & Related papers (2023-06-11T14:01:17Z) - HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis [6.935900707354898]
HowkGPT is built upon a dataset of academic assignments and accompanying metadata. It computes perplexity scores for student-authored and ChatGPT-generated responses. It further refines its analysis by defining category-specific thresholds.
arXiv Detail & Related papers (2023-05-26T11:07:25Z) - Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z) - Personalized Education in the AI Era: What to Expect Next? [76.37000521334585]
The objective of personalized learning is to design an effective knowledge acquisition track that matches the learner's strengths and bypasses their weaknesses to meet their desired goal.
In recent years, the boost of artificial intelligence (AI) and machine learning (ML) has unfolded novel perspectives to enhance personalized education.
arXiv Detail & Related papers (2021-01-19T12:23:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.