Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect?
- URL: http://arxiv.org/abs/2306.05871v1
- Date: Fri, 9 Jun 2023 13:03:53 GMT
- Title: Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect?
- Authors: Wissam Antoun, Virginie Mouilleron, Benoît Sagot, Djamé Seddah
- Abstract summary: This paper proposes a methodology for developing and evaluating ChatGPT detectors for French text.
The proposed method involves translating an English dataset into French and training a classifier on the translated data.
Results show that the detectors can effectively detect ChatGPT-generated text, with a degree of robustness against basic attack techniques in in-domain settings.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent advances in natural language processing (NLP) have led to the
development of large language models (LLMs) such as ChatGPT. This paper
proposes a methodology for developing and evaluating ChatGPT detectors for
French text, with a focus on investigating their robustness on out-of-domain
data and against common attack schemes. The proposed method involves
translating an English dataset into French and training a classifier on the
translated data. Results show that the detectors can effectively detect
ChatGPT-generated text, with a degree of robustness against basic attack
techniques in in-domain settings. However, vulnerabilities are evident in
out-of-domain contexts, highlighting the challenge of detecting adversarial
text. The study emphasizes caution when applying in-domain testing results to a
wider variety of content. We provide our translated datasets and models as
open-source resources. https://gitlab.inria.fr/wantoun/robust-chatgpt-detection
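The pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' released code: the Helsinki-NLP/opus-mt-en-fr translation model and a TF-IDF plus logistic-regression detector are assumed here as simplified stand-ins for the transformer-based classifiers the paper actually trains.

```python
# Minimal sketch (not the authors' released code) of the described pipeline:
# translate an English human-vs-ChatGPT dataset into French, then train a
# detector on the translated texts.
# Assumptions: the Helsinki-NLP/opus-mt-en-fr model and a TF-IDF +
# logistic-regression classifier are illustrative stand-ins only.
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1) Translate the English dataset into French.
translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr")
english_texts = ["The model was trained on a large corpus.", "I wrote this answer myself."]
labels = [1, 0]  # 1 = ChatGPT-generated, 0 = human-written
french_texts = [out["translation_text"] for out in translator(english_texts)]

# 2) Train a classifier on the translated data.
detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                         LogisticRegression(max_iter=1000))
detector.fit(french_texts, labels)

# 3) Score new French text; out-of-domain robustness must be evaluated separately.
print(detector.predict_proba(["Texte dont on veut vérifier l'origine."]))
```

In practice the translated corpus would be split into in-domain and out-of-domain portions so that the robustness issues reported in the abstract can be measured.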
Related papers
- Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text [61.22649031769564]
We propose a novel framework, paraphrased text span detection (PTD), which aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z)
- GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method [4.802604527842989]
We present GPT Reddit dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset.
The dataset consists of context-prompt pairs based on Reddit with human-generated and ChatGPT-generated responses.
To showcase the dataset's utility, we benchmark several detection methods on it, demonstrating their efficacy in distinguishing between human and ChatGPT-generated responses.
arXiv Detail & Related papers (2024-03-12T05:15:21Z)
- DetectGPT-SC: Improving Detection of Text Generated by Large Language Models through Self-Consistency with Masked Predictions [13.077729125193434]
Existing detectors are built on the assumption that there is a distribution gap between human-generated and AI-generated texts.
We find that large language models such as ChatGPT exhibit strong self-consistency in text generation and continuation.
We propose a new method for AI-generated texts detection based on self-consistency with masked predictions.
arXiv Detail & Related papers (2023-10-23T01:23:10Z)
- SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs).
We then propose SeqXGPT (Sequence X (Check) GPT), a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
arXiv Detail & Related papers (2023-10-13T07:18:53Z)
- Fighting Fire with Fire: Can ChatGPT Detect AI-generated Text? [20.37071875344405]
We evaluate the zero-shot performance of ChatGPT in the task of human-written vs. AI-generated text detection.
We empirically investigate if ChatGPT is symmetrically effective in detecting AI-generated or human-written text.
arXiv Detail & Related papers (2023-08-02T17:11:37Z)
- Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect ChatGPT-Generated Text [48.36706154871577]
We introduce a novel dataset termed HPPT (ChatGPT-polished academic abstracts).
It diverges from extant corpora by comprising pairs of human-written and ChatGPT-polished abstracts instead of purely ChatGPT-generated texts.
We also propose the "Polish Ratio" method, an innovative measure of the degree of modification made by ChatGPT compared to the original human-written text.
arXiv Detail & Related papers (2023-07-21T06:38:37Z)
- Multiscale Positive-Unlabeled Detection of AI-Generated Texts [27.956604193427772]
A Multiscale Positive-Unlabeled (MPU) training framework is proposed to address the difficulty of short-text detection.
The MPU method augments detection performance on long AI-generated texts and significantly improves the short-text detection of language model detectors.
arXiv Detail & Related papers (2023-05-29T15:25:00Z)
- MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% of out-of-domain texts generated by a new LLM, indicating its feasibility in application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider (a simplified sketch of this idea follows the list).
arXiv Detail & Related papers (2023-03-23T16:29:27Z)
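The retrieval defense mentioned in the last entry can be illustrated as follows. This is a simplified sketch under stated assumptions: TF-IDF cosine similarity and the 0.7 threshold are illustrative stand-ins for the semantic retrieval over an API provider's stored generations described in that paper.

```python
# Simplified sketch of a retrieval-style defense against paraphrase attacks:
# the API provider stores every generation it has produced and flags a candidate
# text whose closest stored generation is highly similar, even after paraphrasing.
# Assumptions: TF-IDF cosine similarity and the 0.7 threshold are illustrative
# stand-ins for the semantic retrieval used in that work.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

stored_generations = [
    "Large language models can produce fluent text on almost any topic.",
    "Detection becomes much harder once the output has been paraphrased.",
]
candidate = "Once the output is paraphrased, detecting it gets much harder."

vectorizer = TfidfVectorizer().fit(stored_generations + [candidate])
stored_vecs = vectorizer.transform(stored_generations)
candidate_vec = vectorizer.transform([candidate])

# Flag the candidate if its best match among stored generations exceeds the threshold.
best_match = cosine_similarity(candidate_vec, stored_vecs).max()
print("flag as model-generated" if best_match > 0.7 else "treat as human-written")
```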
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.