LLM-Assisted Cheating Detection in Korean Language via Keystrokes
- URL: http://arxiv.org/abs/2507.22956v1
- Date: Tue, 29 Jul 2025 20:59:03 GMT
- Title: LLM-Assisted Cheating Detection in Korean Language via Keystrokes
- Authors: Dong Hyun Roh, Rajesh Kumar, An Ngo,
- Abstract summary: This paper presents a keystroke-based framework for detecting LLM-assisted cheating in Korean.<n>Our dataset includes 69 participants who completed writing tasks under three conditions: Bona fide writing, paraphrasing ChatGPT responses, and transcribing ChatGPT responses.
- Score: 1.9344365651682767
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents a keystroke-based framework for detecting LLM-assisted cheating in Korean, addressing key gaps in prior research regarding language coverage, cognitive context, and the granularity of LLM involvement. Our proposed dataset includes 69 participants who completed writing tasks under three conditions: Bona fide writing, paraphrasing ChatGPT responses, and transcribing ChatGPT responses. Each task spans six cognitive processes defined in Bloom's Taxonomy (remember, understand, apply, analyze, evaluate, and create). We extract interpretable temporal and rhythmic features and evaluate multiple classifiers under both Cognition-Aware and Cognition-Unaware settings. Temporal features perform well under Cognition-Aware evaluation scenarios, while rhythmic features generalize better under cross-cognition scenarios. Moreover, detecting bona fide and transcribed responses was easier than paraphrased ones for both the proposed models and human evaluators, with the models significantly outperforming the humans. Our findings affirm that keystroke dynamics facilitate reliable detection of LLM-assisted writing across varying cognitive demands and writing strategies, including paraphrasing and transcribing LLM-generated responses.
Related papers
- An Empirical Study of Federated Prompt Learning for Vision Language Model [50.73746120012352]
This paper systematically investigates behavioral differences between language prompt learning and vision prompt learning.<n>We conduct experiments to evaluate the impact of various fl and prompt configurations, such as client scale, aggregation strategies, and prompt length.<n>We explore strategies for enhancing prompt learning in complex scenarios where label skew and domain shift coexist.
arXiv Detail & Related papers (2025-05-29T03:09:15Z) - Comparing LLM Text Annotation Skills: A Study on Human Rights Violations in Social Media Data [2.812898346527047]
This study investigates the capabilities of large language models (LLMs) for zero-shot and few-shot annotation of social media posts in Russian and Ukrainian.<n>To evaluate the effectiveness of these models, their annotations are compared against a gold standard set of human double-annotated labels.<n>The study explores the unique patterns of errors and disagreements exhibited by each model, offering insights into their strengths, limitations, and cross-linguistic adaptability.
arXiv Detail & Related papers (2025-05-15T13:10:47Z) - Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework [9.162876771766513]
We introduce AUGMENT, a framework for generating controlled, realistic prompt paraphrases based on linguistic structure and user demographics.<n>AUGMENT ensures paraphrase quality through a combination of semantic, stylistic, and instruction-following criteria.<n>Our findings highlight the need for more representative and structured approaches to prompt variation in large language models.
arXiv Detail & Related papers (2025-05-06T14:17:30Z) - Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs [50.0874045899661]
We introduce CharacterBot, a model designed to replicate both the linguistic patterns and distinctive thought patterns as manifested in the textual works of a character.<n>Using Lu Xun, a renowned Chinese writer as a case study, we propose four training tasks derived from his 17 essay collections.<n>These include a pre-training task focused on mastering external linguistic structures and knowledge, as well as three fine-tuning tasks.<n>We evaluate CharacterBot on three tasks for linguistic accuracy and opinion comprehension, demonstrating that it significantly outperforms the baselines on our adapted metrics.
arXiv Detail & Related papers (2025-02-18T16:11:54Z) - Detecting Memorization in Large Language Models [0.0]
Large language models (LLMs) have achieved impressive results in natural language processing but are prone to memorizing portions of their training data.<n>Traditional methods for detecting memorization rely on output probabilities or loss functions.<n>We introduce an analytical method that precisely detects memorization by examining neuron activations within the LLM.
arXiv Detail & Related papers (2024-12-02T00:17:43Z) - Pronunciation Assessment with Multi-modal Large Language Models [10.35401596425946]
We propose a scoring system based on large language models (LLMs)
The speech encoder first maps the learner's speech into contextual features.
The adapter layer then transforms these features to align with the text embedding in latent space.
arXiv Detail & Related papers (2024-07-12T12:16:14Z) - Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple, yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs)
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z) - Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue
Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (textitCue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed textitCue-CoT method outperforms standard prompting methods in terms of both textithelpfulness and textitacceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z) - Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z) - Neuro-Symbolic Causal Language Planning with Commonsense Prompting [67.06667162430118]
Language planning aims to implement complex high-level goals by decomposition into simpler low-level steps.
Previous methods require either manual exemplars or annotated programs to acquire such ability from large language models.
This paper proposes Neuro-Symbolic Causal Language Planner (CLAP) that elicits procedural knowledge from the LLMs with commonsense-infused prompting.
arXiv Detail & Related papers (2022-06-06T22:09:52Z) - Learning to Ask Conversational Questions by Optimizing Levenshtein
Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimize the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.