Testing of Detection Tools for AI-Generated Text
- URL: http://arxiv.org/abs/2306.15666v2
- Date: Mon, 10 Jul 2023 16:14:33 GMT
- Title: Testing of Detection Tools for AI-Generated Text
- Authors: Debora Weber-Wulff (University of Applied Sciences HTW Berlin,
Germany), Alla Anohina-Naumeca (Riga Technical University, Latvia), Sonja
Bjelobaba (Uppsala University, Sweden), Tomáš Foltýnek (Masaryk
University, Czechia), Jean Guerrero-Dib (Universidad de Monterrey, Mexico),
Olumide Popoola (Queen Mary University of London, UK), Petr Šigut
(Masaryk University, Czechia), Lorna Waddington (University of Leeds, UK)
- Abstract summary: The paper examines the functionality of detection tools for artificial intelligence generated text.
It evaluates them based on accuracy and error type analysis.
The research covers 12 publicly available tools and two commercial systems.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in generative pre-trained transformer large language models
have emphasised the potential risks of unfair use of artificial intelligence
(AI)-generated content in an academic environment and intensified the search
for solutions to detect such content. The paper examines the general
functionality of detection tools for artificial intelligence generated text and
evaluates them based on accuracy and error type analysis. Specifically, the
study seeks to answer research questions about whether existing detection tools
can reliably differentiate between human-written text and ChatGPT-generated
text, and whether machine translation and content obfuscation techniques affect
the detection of AI-generated text. The research covers 12 publicly available
tools and two commercial systems (Turnitin and PlagiarismCheck) that are widely
used in the academic setting. The researchers conclude that the available
detection tools are neither accurate nor reliable and are mainly biased
towards classifying output as human-written rather than detecting
AI-generated text. Furthermore, content obfuscation techniques significantly
worsen the performance of these tools. The study makes several significant
contributions. First, it summarises similar up-to-date scientific and
non-scientific efforts in the field. Second, it presents the results of one
of the most comprehensive tests
conducted so far, based on a rigorous research methodology, an original
document set, and a broad coverage of tools. Third, it discusses the
implications and drawbacks of using detection tools for AI-generated text in
academic settings.
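To make the evaluation setup concrete, below is a minimal sketch of the kind of accuracy and error-type analysis described above. The detector interface and the labelled document list are hypothetical stand-ins for illustration, not the API of any tool tested in the paper.

```python
# Minimal sketch of an accuracy / error-type evaluation for AI-text detectors.
# `detector(text)` is a hypothetical callable returning "ai" or "human".
def evaluate(detector, documents):
    """documents: iterable of (text, true_label), true_label in {"ai", "human"}."""
    tp = fp = tn = fn = 0
    for text, true_label in documents:
        predicted = detector(text)
        if true_label == "ai":
            if predicted == "ai":
                tp += 1  # AI-generated text correctly detected
            else:
                fn += 1  # AI-generated text missed (classified as human)
        else:
            if predicted == "ai":
                fp += 1  # human-written text wrongly flagged as AI
            else:
                tn += 1  # human-written text correctly passed
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

Reporting error types alongside accuracy matters here because false positives (human text flagged as AI) and false negatives (AI text classified as human) carry very different costs in an academic-integrity setting.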
Related papers
- ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination [1.8418334324753884]
This paper introduces back-translation as a novel technique for evading detection.
We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text.
We evaluate this technique on nine AI detectors, including six open-source and three proprietary systems.
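As an illustration of the back-translation idea (translating AI-generated text into a pivot language and back to obtain a paraphrased surface form), here is a minimal sketch; it is not the ESPERANTO pipeline itself, and it assumes the Hugging Face transformers package and the public Helsinki-NLP MarianMT checkpoints are available.

```python
# Generic back-translation sketch (English -> German -> English).
# Assumes the `transformers` package and Helsinki-NLP MarianMT checkpoints.
from transformers import pipeline

to_german = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
to_english = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")

def back_translate(text: str) -> str:
    # Translate to the pivot language, then back to English.
    pivot = to_german(text, max_length=512)[0]["translation_text"]
    return to_english(pivot, max_length=512)[0]["translation_text"]

sample = "The detector should flag this machine-generated paragraph."
print(back_translate(sample))  # a paraphrased variant of the input
```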
arXiv Detail & Related papers (2024-09-22T01:13:22Z) - Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods [13.14749943120523]
Knowing whether a text was produced by a human or by artificial intelligence (AI) is important for determining its trustworthiness.
State-of-the-art approaches to AIGT detection include watermarking, statistical and stylistic analysis, and machine learning classification.
We aim to provide insight into the salient factors that combine to determine how "detectable" AIGT is under different scenarios.
arXiv Detail & Related papers (2024-06-21T18:31:49Z) - Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors [24.954755569786396]
AI-text detection has emerged as a way to distinguish between human- and machine-generated content.
Recent research indicates that these detection systems often lack robustness and struggle to effectively differentiate perturbed texts.
Our work simulates real-world scenarios in both informal and professional writing, exploring the out-of-the-box performance of current detectors.
arXiv Detail & Related papers (2024-06-13T08:37:01Z) - Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated
Student Essay Detection [29.433764586753956]
Large language models (LLMs) have exhibited remarkable capabilities in text generation tasks.
The utilization of these models carries inherent risks, including but not limited to plagiarism, the dissemination of fake news, and issues in educational exercises.
This paper aims to bridge this gap by constructing AIG-ASAP, an AI-generated student essay dataset.
arXiv Detail & Related papers (2024-02-01T08:11:56Z) - Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors [57.7003399760813]
We explore advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways.
We uncover a significant correlation between topics and detection performance.
These investigations shed light on the adaptability and robustness of these detection methods across diverse topics.
arXiv Detail & Related papers (2023-12-20T10:53:53Z) - Towards Possibilities & Impossibilities of AI-generated Text Detection:
A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z) - Watermarking Conditional Text Generation for AI Detection: Unveiling
Challenges and a Semantic-Aware Watermark Remedy [52.765898203824975]
We introduce a semantic-aware watermarking algorithm that considers the characteristics of conditional text generation and the input context.
Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models.
arXiv Detail & Related papers (2023-07-25T20:24:22Z) - FacTool: Factuality Detection in Generative AI -- A Tool Augmented
Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now faces an increasing risk of containing factual errors when handled by generative models.
We propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z) - MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% of out-of-domain texts generated by a new LLM, indicating feasibility in real application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size needed for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
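The sample-size argument above is usually stated through a total-variation bound on detector performance. The following formulation is the commonly cited form from this line of work, reproduced here as background rather than quoted from the paper; m and h denote the machine-text and human-text distributions.

```latex
% Commonly cited bound (background sketch, not a quotation from the paper):
% the AUROC of any detector D observing n independent samples is limited by
% the total variation distance between the n-fold product distributions.
\[
  \mathrm{AUROC}(D) \;\le\; \frac{1}{2}
  + \mathrm{TV}\!\left(m^{\otimes n}, h^{\otimes n}\right)
  - \frac{1}{2}\,\mathrm{TV}\!\left(m^{\otimes n}, h^{\otimes n}\right)^{2}
\]
% As m approaches h, TV(m, h) shrinks and single-sample detection degrades
% toward chance (AUROC = 1/2); but the TV between n-fold products grows with n,
% so collecting more text from the same source can restore detectability.
```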