Related papers: Detecting ChatGPT: A Survey of the State of Detecting ChatGPT-Generated Text

URL: http://arxiv.org/abs/2309.07689v1
Date: Thu, 14 Sep 2023 13:05:20 GMT
Title: Detecting ChatGPT: A Survey of the State of Detecting ChatGPT-Generated Text
Authors: Mahdi Dhaini, Wessel Poelman, Ege Erdogan
Abstract summary: Generative language models can potentially deceive by generating artificial text that appears to be human-generated. This survey provides an overview of the current approaches employed to differentiate between texts generated by humans and ChatGPT.
Score: 1.9643748953805937
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While recent advancements in the capabilities and widespread accessibility of generative language models, such as ChatGPT (OpenAI, 2022), have brought about various benefits by generating fluent human-like text, the task of distinguishing between human- and large language model (LLM) generated text has emerged as a crucial problem. These models can potentially deceive by generating artificial text that appears to be human-generated. This issue is particularly significant in domains such as law, education, and science, where ensuring the integrity of text is of the utmost importance. This survey provides an overview of the current approaches employed to differentiate between texts generated by humans and ChatGPT. We present an account of the different datasets constructed for detecting ChatGPT-generated text, the various methods utilized, what qualitative analyses into the characteristics of human versus ChatGPT-generated text have been performed, and finally, summarize our findings into general insights.

Related papers

Feature Extraction and Analysis for GPT-Generated Text [0.0]
We present a comprehensive study of feature extraction and analysis for differentiating between human-written and GPT-generated text. Our results demonstrate that human and GPT-generated texts exhibit distinct writing styles, which can be effectively captured by our features.
arXiv Detail & Related papers (2025-03-17T19:52:43Z)
Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts. We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method [4.802604527842989]
We present the GPT Reddit dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset. The dataset consists of context-prompt pairs based on Reddit with human-generated and ChatGPT-generated responses. To showcase the dataset's utility, we benchmark several detection methods on it, demonstrating their efficacy in distinguishing between human and ChatGPT-generated responses.
arXiv Detail & Related papers (2024-03-12T05:15:21Z)
DEMASQ: Unmasking the ChatGPT Wordsmith [63.8746084667206]
We propose an effective ChatGPT detector named DEMASQ, which accurately identifies ChatGPT-generated content. Our method addresses two critical factors: (i) the distinct biases in text composition observed in human- and machine-generated content and (ii) the alterations made by humans to evade previous detection methods.
arXiv Detail & Related papers (2023-11-08T21:13:05Z)
Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect ChatGPT-Generated Text [48.36706154871577]
We introduce a novel dataset termed HPPT (ChatGPT-polished academic abstracts). It diverges from extant corpora by comprising pairs of human-written and ChatGPT-polished abstracts instead of purely ChatGPT-generated texts. We also propose the "Polish Ratio" method, an innovative measure of the degree of modification made by ChatGPT compared to the original human-written text.
arXiv Detail & Related papers (2023-07-21T06:38:37Z)
ChatGPT vs Human-authored Text: Insights into Controllable Text Summarization and Sentence Style Transfer [8.64514166615844]
We conduct a systematic inspection of ChatGPT's performance in two controllable generation tasks. We evaluate the faithfulness of the generated text, and compare the model's performance with human-authored texts. We observe that ChatGPT sometimes incorporates factual errors or hallucinations when adapting the text to suit a specific style.
arXiv Detail & Related papers (2023-06-13T14:21:35Z)
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [27.901155229342375]
We present a novel approach for detecting ChatGPT-generated vs. human-written text using language models. Our models achieved remarkable results, with an accuracy of over 97% on the test dataset, as evaluated through various metrics.
arXiv Detail & Related papers (2023-05-13T17:12:11Z)
On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases. We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection. We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains. Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries Through Blinded Reviewers and Text Classification Algorithms [0.8339831319589133]
ChatGPT, developed by OpenAI, is a recent addition to the family of language models. We evaluate the performance of ChatGPT on abstractive summarization by means of automated metrics and blinded human reviewers.
arXiv Detail & Related papers (2023-03-30T18:28:33Z)
A Survey on Retrieval-Augmented Text Generation [53.04991859796971]
Retrieval-augmented text generation has remarkable advantages and has achieved state-of-the-art performance in many NLP tasks. This survey first highlights the generic paradigm of retrieval-augmented generation, and then reviews notable approaches according to different tasks.
arXiv Detail & Related papers (2022-02-02T16:18:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.