Related papers: Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text

URL: http://arxiv.org/abs/2501.03212v1
Date: Mon, 06 Jan 2025 18:46:53 GMT
Title: Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text
Authors: Ayat Najjar, Huthaifa I. Ashqar, Omar Darwish, Eman Hammad
Abstract summary: This study aims to support efforts to detect and identify textual content generated using Generative AI Large Language Models. We leverage several machine learning algorithms, such as Random Forest (RF) and Recurrent Neural Networks (RNN), to understand the important features in attribution. Our method is divided into 1) binary classification, to differentiate between human-written and AI-generated text, and 2) multiclass classification, to differentiate between human-written text and the text generated by five different LLM tools.
Score: 1.1137087573421256
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The development of Generative AI Large Language Models (LLMs) has raised the alarm about distinguishing content produced by generative AI from content written by humans. In one case, issues arise when students rely heavily on such tools in a manner that can affect the development of their writing or coding skills. Other issues of plagiarism also apply. This study aims to support efforts to detect and identify textual content generated using LLM tools. We hypothesize that LLM-generated text is detectable by machine learning (ML), and investigate ML models that can recognize and differentiate texts generated by multiple LLM tools. We leverage several ML and Deep Learning (DL) algorithms, such as Random Forest (RF) and Recurrent Neural Networks (RNN), and utilize Explainable Artificial Intelligence (XAI) to understand the important features in attribution. Our method is divided into 1) binary classification, to differentiate between human-written and AI-generated text, and 2) multiclass classification, to differentiate between human-written text and the text generated by five different LLM tools (ChatGPT, LLaMA, Google Bard, Claude, and Perplexity). Results show high accuracy in both the multiclass and binary classification. Our model outperformed GPTZero, with 98.5% accuracy versus 78.3%. Notably, GPTZero was unable to recognize about 4.2% of the observations, but our model was able to recognize the complete test dataset. XAI results showed that understanding feature importance across different classes enables detailed author/source profiles. These profiles further aid attribution and support plagiarism detection by highlighting unique stylistic and structural elements, helping to ensure robust verification of content originality.
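
The relationship between the multiclass and binary tasks, and the effect of a detector leaving some observations unrecognized, can be illustrated with a small sketch. This is not the authors' code: the toy labels and predictions are made up, the helper names `to_binary`, `accuracy`, and `abstention_rate` are hypothetical, and counting an unrecognized observation (`None`) as incorrect is an assumption about how such cases might be scored.

```python
from typing import Optional, Sequence

# The five LLM tools named in the abstract; anything else is treated as "human".
LLM_SOURCES = {"ChatGPT", "LLaMA", "Google Bard", "Claude", "Perplexity"}


def to_binary(label: str) -> str:
    """Collapse a multiclass source label into the binary human/AI label."""
    return "AI" if label in LLM_SOURCES else "human"


def accuracy(y_true: Sequence[str], y_pred: Sequence[Optional[str]]) -> float:
    """Fraction of correct predictions.

    Assumption: a prediction of None (detector gave no decision) counts as wrong.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(1 for t, p in zip(y_true, y_pred) if p is not None and t == p)
    return correct / len(y_true)


def abstention_rate(y_pred: Sequence[Optional[str]]) -> float:
    """Fraction of observations for which the detector gave no decision."""
    if not y_pred:
        raise ValueError("y_pred must be non-empty")
    return sum(1 for p in y_pred if p is None) / len(y_pred)


# Toy data (fabricated for illustration only, not from the paper).
multi_true = ["human", "ChatGPT", "Claude", "LLaMA", "human", "Perplexity"]
multi_pred = ["human", "ChatGPT", "LLaMA", "LLaMA", "human", None]

# The binary task is derived by collapsing every LLM source into one "AI" class.
binary_true = [to_binary(t) for t in multi_true]
binary_pred = [to_binary(p) if p is not None else None for p in multi_pred]

multi_acc = accuracy(multi_true, multi_pred)
binary_acc = accuracy(binary_true, binary_pred)

if __name__ == "__main__":
    print(f"multiclass accuracy: {multi_acc:.3f}")
    print(f"binary accuracy:     {binary_acc:.3f}")
    print(f"abstention rate:     {abstention_rate(multi_pred):.3f}")
```

In this toy example, confusing one LLM for another lowers multiclass accuracy but not binary accuracy, while an unrecognized observation lowers both under the stated scoring assumption.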

Related papers

mdok of KInIT: Robustly Fine-tuned LLM for Binary and Multiclass AI-Generated Text Detection [0.0]
Automated detection can help humans identify machine-generated texts. This notebook describes our mdok approach to robust detection, based on fine-tuning smaller LLMs for text classification. It is applied to both subtasks of Voight-Kampff Generative AI Detection 2025.
arXiv Detail & Related papers (2025-06-02T14:07:32Z)
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders [20.557610461777344]
We use Sparse Autoencoders (SAE) to extract features from Gemma-2-2b residual stream. We identify both interpretable and efficient features, analyzing their semantics and relevance. Our methods offer valuable insights into how texts from various models differ from human-written content.
arXiv Detail & Related papers (2025-03-05T15:33:52Z)
"I know myself better, but not really greatly": Using LLMs to Detect and Explain LLM-Generated Texts [10.454446545249096]
Large language models (LLMs) have demonstrated impressive capabilities in generating human-like texts. This paper explores the detection and explanation capabilities of LLM-based detectors of human-generated texts.
arXiv Detail & Related papers (2025-02-18T11:00:28Z)
Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs). We find that fine-tuning existing text embedding models on LLM-generated texts yields excellent classification accuracy. We leverage LLMs as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
arXiv Detail & Related papers (2025-02-17T18:59:02Z)
GigaCheck: Detecting LLM-generated Content [72.27323884094953]
In this work, we investigate the task of generated text detection by proposing the GigaCheck. Our research explores two approaches: (i) distinguishing human-written texts from LLM-generated ones, and (ii) detecting LLM-generated intervals in Human-Machine collaborative texts. Specifically, we use a fine-tuned general-purpose LLM in conjunction with a DETR-like detection model, adapted from computer vision, to localize AI-generated intervals within text.
arXiv Detail & Related papers (2024-10-31T08:30:55Z)
Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection [43.66875548677324]
We train AI-generated (AIG) text classifiers using the LibAUC library for training classifiers with imbalanced datasets. Our results in the Deepfake Text dataset show that AIG-text detection varies across domains, with scientific writing being relatively challenging. In the Rewritten Ivy Panda dataset focusing on student essays, we find that the OpenAI family of LLMs was substantially difficult for our classifiers to distinguish from human texts.
arXiv Detail & Related papers (2024-10-18T21:42:37Z)
SMLT-MUGC: Small, Medium, and Large Texts -- Machine versus User-Generated Content Detection and Comparison [2.7147912878168303]
We compare the performance of machine learning algorithms on four datasets: (1) small (tweets from Election, FIFA, and Game of Thrones), (2) medium (Wikipedia introductions and PubMed abstracts), and (3) large (OpenAI web text dataset). Our results indicate that LLMs with very large parameter counts (such as the XL-1542 variant of GPT2 with 1542 million parameters) were harder to detect using traditional machine learning methods. We examine the characteristics of human and machine-generated texts across multiple dimensions, including linguistics, personality, sentiment, bias, and morality.
arXiv Detail & Related papers (2024-06-28T22:19:01Z)
FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
Generative AI Text Classification using Ensemble LLM Approaches [0.12483023446237698]
Large Language Models (LLMs) have shown impressive performance across a variety of AI and natural language processing tasks. We propose an ensemble neural model that generates probabilities from different pre-trained LLMs. For the first task of distinguishing between AI and human generated text, our model ranked in fifth and thirteenth place.
arXiv Detail & Related papers (2023-09-14T14:41:46Z)
Neural Authorship Attribution: Stylometric Analysis on Large Language Models [16.63955074133222]
Large language models (LLMs) such as GPT-4, PaLM, and Llama have significantly propelled the generation of AI-crafted text. With rising concerns about their potential misuse, there is a pressing need for AI-generated-text forensics.
arXiv Detail & Related papers (2023-08-14T17:46:52Z)
Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism. Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs. We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks. Our method achieves state-of-the-art results on well-established TAG datasets. Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
LLMDet: A Third Party Large Language Models Generated Text Detection Tool [119.0952092533317]
Large language models (LLMs) are remarkably close to high-quality human-authored text. Existing detection tools can only differentiate between machine-generated and human-authored text. We propose LLMDet, a model-specific, secure, efficient, and extendable detection tool.
arXiv Detail & Related papers (2023-05-24T10:45:16Z)
MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection. We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs. Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)