An Information-Theoretic Approach for Detecting Edits in AI-Generated Text
- URL: http://arxiv.org/abs/2308.12747v2
- Date: Sat, 24 Aug 2024 00:15:28 GMT
- Title: An Information-Theoretic Approach for Detecting Edits in AI-Generated Text
- Authors: Idan Kashtan, Alon Kipnis
- Abstract summary: We propose a method to determine whether a given article was written entirely by a generative language model or perhaps contains edits by a different author, possibly a human.
We demonstrate the effectiveness of the method in detecting edits through extensive evaluations using real data.
Our analysis raises several interesting research questions at the intersection of information theory and data science.
- Score: 7.013432243663526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a method to determine whether a given article was written entirely by a generative language model or contains edits by a different author, possibly a human. Our process involves multiple tests for the origin of individual sentences or other pieces of text, which we combine using a method that is sensitive to rare alternatives, i.e., settings in which the non-null effects are few and scattered across the text in unknown locations. Interestingly, this method also identifies the pieces of text suspected to contain edits. We demonstrate the effectiveness of the method in detecting edits through extensive evaluations on real data and provide an information-theoretic analysis of the factors affecting its success. In particular, we discuss optimality properties under a theoretical framework for text editing in which sentences are generated mainly by the language model, except perhaps for a few sentences that may have originated via a different mechanism. Our analysis raises several interesting research questions at the intersection of information theory and data science.
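The combination step the abstract alludes to can be made concrete with Higher Criticism, a classical statistic that is sensitive to a few non-null effects hidden among many nulls. The sketch below is illustrative only: it assumes some upstream detector already supplies one p-value per sentence under the null hypothesis "this sentence was generated by the language model", and the function name, search fraction, and flagging rule are ours, not necessarily the paper's exact procedure.

```python
# Hedged sketch, not the paper's implementation: combine per-sentence
# p-values with the Higher-Criticism statistic of Donoho & Jin (2004),
# which is sensitive to "rare and weak" departures from the null.
import numpy as np

def higher_criticism(pvals, gamma=0.5):
    """Return the HC statistic and the p-value threshold attaining it."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    i = np.arange(1, n + 1)
    with np.errstate(divide="ignore", invalid="ignore"):
        hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    k = max(int(gamma * n), 1)          # search only the smallest p-values
    i_star = int(np.nanargmax(hc[:k]))
    return hc[i_star], p[i_star]

# Toy example: mostly uniform p-values, plus a few very small ones ("edits").
rng = np.random.default_rng(0)
pvals = rng.uniform(size=200)
pvals[:5] = 1e-5                        # five sentences from another origin
hc_stat, p_star = higher_criticism(pvals)
suspects = np.flatnonzero(pvals <= p_star)   # sentences flagged as possible edits
print(f"HC = {hc_stat:.2f}, flagged sentences: {suspects}")
```

The p-value attaining the HC maximum doubles as a data-driven threshold, which is how a global test of "any edits at all?" can also localize the suspected edited sentences.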
Related papers
- Discovering influential text using convolutional neural networks
We present a method that uses convolutional neural networks to discover clusters of similar text phrases predictive of human reactions to texts.
We apply the method to two datasets. The first enables direct validation of the model's ability to detect phrases known to cause the outcome.
In both cases, the model learns a greater variety of text treatments than benchmark methods, and these text features match or exceed the benchmark methods' ability to predict the outcome.
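A minimal sketch of the mechanism this entry names, under our own assumptions (toy dimensions, random weights; not the paper's architecture): 1-D convolutional filters slide over word embeddings, so each filter acts as a soft n-gram detector, and the n-grams that most strongly activate a filter form a candidate cluster of similar, outcome-predictive phrases.

```python
# Illustrative only: convolution filters over word embeddings as phrase detectors.
import torch
import torch.nn as nn

vocab, emb_dim, n_filters, width = 100, 16, 4, 3
embed = nn.Embedding(vocab, emb_dim)
conv = nn.Conv1d(emb_dim, n_filters, kernel_size=width)

tokens = torch.randint(0, vocab, (1, 12))        # toy token ids for one text
acts = conv(embed(tokens).transpose(1, 2))       # (1, n_filters, positions)
top = acts[0].argmax(dim=1)                      # each filter's top position
print(top)  # token spans [p, p+width) are the filters' most activating phrases
```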
arXiv Detail & Related papers (2024-06-14T14:41:44Z)
- Who Writes the Review, Human or AI?
This study proposes a methodology to accurately distinguish between AI-generated and human-written book reviews.
Our approach utilizes transfer learning, enabling the model to identify generated text across different topics.
The experimental results demonstrate that it is feasible to detect the original source of text, achieving an accuracy rate of 96.86%.
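As a rough stand-in for the transfer-learning idea above (the paper's actual model and data are not shown here), one can reuse a pre-trained sentence encoder and train only a light classifier on top, so the topic-general knowledge lives in the reused encoder. The model name, reviews, and labels below are all illustrative.

```python
# Illustrative stand-in, not the paper's pipeline: transfer learning via a
# pre-trained encoder reused as-is, plus a small trainable classifier.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # pre-trained, reused as-is
reviews = ["Couldn't put it down, though the ending dragged.",              # toy human review
           "This novel masterfully weaves themes of resilience and hope."]  # toy AI review
labels = [0, 1]                                     # 0 = human, 1 = AI
clf = LogisticRegression().fit(encoder.encode(reviews), labels)
print(clf.predict(encoder.encode(["A gripping, well-paced story."])))
```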
arXiv Detail & Related papers (2024-05-30T17:38:44Z)
- Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text
We propose a novel framework, paraphrased text span detection (PTD), which aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
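To make the task concrete (this is our framing of the problem statement, not PASTED's models): span detection reduces to labeling each sentence, or a finer-grained span, as paraphrased or original, and scoring predictions with span-level precision and recall.

```python
# Toy illustration of the PTD task framing and metrics; all labels invented.
document = ["The results were strong.",
            "An LLM quietly reworded this sentence.",   # paraphrased span
            "The conclusion follows."]
gold = [0, 1, 0]          # 1 = paraphrased span
pred = [0, 1, 1]          # a detector's (toy) output
tp = sum(g == p == 1 for g, p in zip(gold, pred))
precision = tp / max(sum(pred), 1)
recall = tp / max(sum(gold), 1)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```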
arXiv Detail & Related papers (2024-05-21T11:22:27Z)
- MAGE: Machine-generated Text Detection in the Wild
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite these challenges, the top-performing detector can identify 86.54% of out-of-domain texts generated by a new LLM, indicating feasibility in real application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
- Verifying the Robustness of Automatic Credibility Assessment
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases, insignificant changes to the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
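The kind of fragility the benchmark probes can be seen with a toy bag-of-words victim model (this is our illustration of the idea, not BODEGA's attack suite): a single character substitution that a human reads straight through silently erases the feature the classifier relies on.

```python
# Toy fragility demo: a meaning-preserving typo wipes out a key feature.
from sklearn.feature_extraction.text import CountVectorizer

vec = CountVectorizer().fit(["the vaccine is safe", "the vaccine is dangerous"])
original = "the vaccine is safe"
perturbed = "the vacc1ne is safe"   # homoglyph-style typo; same meaning to a reader
print(vec.transform([original]).toarray())
print(vec.transform([perturbed]).toarray())  # the 'vaccine' count silently drops to 0
```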
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- A Deep Learning Anomaly Detection Method in Textual Data
We propose using deep learning and transformer architectures combined with classical machine learning algorithms.
We used multiple machine learning methods, such as Sentence Transformers, Autoencoders, Logistic Regression, and distance calculation methods, to predict anomalies.
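A hedged sketch of just the embedding-plus-distance component named here (the autoencoder and logistic-regression parts are omitted, and the model name, toy texts, and ranking rule are our choices):

```python
# Sketch under stated assumptions: embed texts with a pre-trained Sentence
# Transformer and rank anomalies by distance to the centroid embedding.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
texts = ["Invoice received and archived.",
         "Payment confirmed for order 1042.",
         "Shipment dispatched to the warehouse.",
         "PURPLE MONKEY DISHWASHER!!!"]          # toy anomaly
emb = model.encode(texts)
dist = np.linalg.norm(emb - emb.mean(axis=0), axis=1)
print(texts[int(dist.argmax())])                 # most anomalous text
```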
arXiv Detail & Related papers (2022-11-25T05:18:13Z)
- A Latent-Variable Model for Intrinsic Probing
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else?
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z)
- The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations
Authorship analysis is an important subject in the field of natural language processing.
This paper explores the limitations and sensitivity of established approaches to adversarial manipulation of inputs.
arXiv Detail & Related papers (2021-02-23T19:55:45Z)
- Method of the coherence evaluation of Ukrainian text
Methods for measuring text coherence in the Ukrainian language are analyzed.
Training and evaluation are performed on a corpus of Ukrainian texts.
The test procedure consists of two tasks typical of text coherence assessment.
arXiv Detail & Related papers (2020-10-31T16:48:55Z)
- A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied in any case where evaluating subjectivity is relevant to understanding the relative value or meaning of a text.
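The word-similarity step described here can be sketched as follows (our toy rendering of the summary, not the paper's code: the "benchmark" is taken to be a small set of context words, and relevance is mean cosine similarity to it):

```python
# Toy sketch: score each word's relevance to a benchmark ("room") of context
# words via mean cosine similarity; changing the room changes the scores,
# which is the subjectivity effect described above.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
room = ["fear", "danger", "threat"]              # benchmark / viewpoint
words = ["storm", "picnic", "evacuation"]
R = model.encode(room, normalize_embeddings=True)
W = model.encode(words, normalize_embeddings=True)
relevance = (W @ R.T).mean(axis=1)               # mean cosine similarity
print(dict(zip(words, relevance.round(3))))
```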
arXiv Detail & Related papers (2020-05-12T21:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.