Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
- URL: http://arxiv.org/abs/2503.00258v1
- Date: Sat, 01 Mar 2025 00:19:13 GMT
- Title: Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
- Authors: Guangsheng Bao, Lihua Rong, Yanbin Zhao, Qiji Zhou, Yue Zhang
- Abstract summary: We present HART, a hierarchical framework of AI risk levels, each corresponding to a detection task. Our findings show that content is resistant to surface-level changes, which can serve as a key feature for detection. Experiments demonstrate that the 2D method significantly outperforms existing detectors, achieving an AUROC improvement from 0.705 to 0.849 for level-2 detection and from 0.807 to 0.886 for RAID.
- Score: 18.809271485302897
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The wide usage of LLMs raises critical requirements for detecting AI participation in texts. Existing studies investigate these detection tasks in scattered contexts, leaving a systematic and unified approach unexplored. In this paper, we present HART, a hierarchical framework of AI risk levels, each corresponding to a detection task. To address these tasks, we propose a novel 2D Detection Method, decoupling a text into content and language expression. Our findings show that content is resistant to surface-level changes, which can serve as a key feature for detection. Experiments demonstrate that the 2D method significantly outperforms existing detectors, achieving an AUROC improvement from 0.705 to 0.849 for level-2 detection and from 0.807 to 0.886 for RAID. We release our data and code at https://github.com/baoguangsheng/truth-mirror.
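To make the two-dimensional idea concrete, here is a minimal sketch, assuming hypothetical per-dimension scorers: each text receives one score along the content dimension and one along the language-expression dimension, and the two are fused into a single detection score evaluated with AUROC, the metric quoted above. The scorer callables and the weighted-sum fusion are illustrative assumptions, not the authors' released implementation (which lives in the truth-mirror repository).

```python
from typing import Callable, Sequence

from sklearn.metrics import roc_auc_score


def detect_2d(text: str,
              content_scorer: Callable[[str], float],
              expression_scorer: Callable[[str], float],
              weight: float = 0.5) -> float:
    """Fuse a content-dimension score (robust to surface-level edits, per the
    abstract) with an expression-dimension score into one detection score.
    The weighted sum is a placeholder for whatever fusion the paper uses."""
    return weight * content_scorer(text) + (1.0 - weight) * expression_scorer(text)


def evaluate_auroc(texts: Sequence[str],
                   labels: Sequence[int],
                   content_scorer: Callable[[str], float],
                   expression_scorer: Callable[[str], float]) -> float:
    """AUROC is the metric reported above (e.g. 0.705 -> 0.849 for level-2 detection)."""
    scores = [detect_2d(t, content_scorer, expression_scorer) for t in texts]
    return roc_auc_score(labels, scores)
```

The weight would be tuned per detection level; the point the abstract makes is that the content score changes little under surface-level edits such as paraphrasing, while the expression score does.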
Related papers
- AuthorMist: Evading AI Text Detectors with Reinforcement Learning [4.806579822134391]
AuthorMist is a novel reinforcement learning-based system to transform AI-generated text into human-like writing.
We show that AuthorMist effectively reduces the detectability of AI-generated text while preserving the original meaning.
arXiv Detail & Related papers (2025-03-10T12:41:05Z) - DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperform on this task.
Our development of DetectRL reveals the strengths and limitations of current SOTA detectors.
We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z) - DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning [24.99797253885887]
We argue that the key to accomplishing this task lies in distinguishing writing styles of different authors.
We propose DeTeCtive, a multi-task auxiliary, multi-level contrastive learning framework.
Our method is compatible with a range of text encoders.
arXiv Detail & Related papers (2024-10-28T12:34:49Z) - ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination [1.8418334324753884]
This paper introduces back-translation as a novel technique for evading detection.
We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text.
We evaluate this technique on nine AI detectors, including six open-source and three proprietary systems.
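As an illustration of the back-translation pipeline described above, the sketch below round-trips a text through pivot languages; `translate` is a hypothetical stand-in for any machine-translation system, and the paper's step of combining the variants into a single manipulated text is not shown.

```python
from typing import Callable, List, Sequence


def back_translate(text: str,
                   translate: Callable[[str, str, str], str],
                   pivots: Sequence[str] = ("de", "fr", "zh")) -> List[str]:
    """Round-trip `text` through each pivot language and back to English,
    returning one evasion candidate per pivot."""
    variants = []
    for lang in pivots:
        intermediate = translate(text, "en", lang)            # English -> pivot
        variants.append(translate(intermediate, lang, "en"))  # pivot -> English
    return variants
```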
arXiv Detail & Related papers (2024-09-22T01:13:22Z) - Learning to Rewrite: Generalized LLM-Generated Text Detection [19.9477991969521]
Large language models (LLMs) present significant risks when used to generate non-factual content and spread disinformation at scale. We introduce Learning2Rewrite, a novel framework for detecting AI-generated text with exceptional generalization to unseen domains.
arXiv Detail & Related papers (2024-08-08T05:53:39Z) - Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text [61.22649031769564]
We propose a novel framework, paraphrased text span detection (PTD).
PTD aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z) - Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors [57.7003399760813]
We explore advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways.
We uncover a significant correlation between topics and detection performance.
These investigations shed light on the adaptability and robustness of these detection methods across diverse topics.
arXiv Detail & Related papers (2023-12-20T10:53:53Z) - SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs).
We then propose Sequence X (Check) GPT (SeqXGPT), a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
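A rough sketch of the feature-extraction step summarized above, assuming GPT-2 via Hugging Face Transformers as the white-box model (the actual models and the downstream sentence-level classifier used by SeqXGPT may differ and are not shown): the per-token log probability list is what would be fed to that classifier.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()


@torch.no_grad()
def token_log_probs(sentence: str) -> torch.Tensor:
    """Per-token log probabilities of `sentence` under the white-box model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids   # (1, T)
    logits = model(ids).logits                                  # (1, T, V)
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)       # predictions for tokens 1..T-1
    targets = ids[:, 1:]                                        # the tokens actually generated
    return log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)[0]
```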
arXiv Detail & Related papers (2023-10-13T07:18:53Z) - OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples [44.118047780553006]
OUTFOX is a framework that improves the robustness of LLM-generated-text detectors by allowing both the detector and the attacker to consider each other's output.
Experiments show that the proposed detector improves the detection performance on the attacker-generated texts by up to +41.3 points F1-score.
The detector achieves state-of-the-art detection performance, up to 96.9 points F1-score, beating existing detectors on non-attacked texts.
arXiv Detail & Related papers (2023-07-21T17:40:47Z) - RADAR: Robust AI-Text Detection via Adversarial Learning [69.5883095262619]
RADAR is based on adversarial training of a paraphraser and a detector.
The paraphraser's goal is to generate realistic content to evade AI-text detection.
RADAR uses the feedback from the detector to update the paraphraser, and vice versa.
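A schematic sketch of that alternating loop, assuming placeholder callables (none of this is RADAR's actual training code): the paraphraser is rewarded when the detector scores its rewrites as human-written, and the detector is then retrained to separate those rewrites from human text.

```python
from typing import Callable, List


def adversarial_rounds(ai_texts: List[str],
                       human_texts: List[str],
                       paraphrase: Callable[[str], str],
                       detector_score: Callable[[str], float],
                       update_paraphraser: Callable[[List[str], List[float]], None],
                       update_detector: Callable[[List[str], List[str]], None],
                       rounds: int = 3) -> None:
    """Alternate between rewarding the paraphraser for evading the detector and
    retraining the detector on the paraphraser's latest rewrites."""
    for _ in range(rounds):
        rewrites = [paraphrase(t) for t in ai_texts]
        # Reward the paraphraser when the detector scores its rewrites as human-like.
        rewards = [1.0 - detector_score(r) for r in rewrites]
        update_paraphraser(rewrites, rewards)
        # Then update the detector to separate the new rewrites from human text.
        update_detector(rewrites, human_texts)
```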
arXiv Detail & Related papers (2023-07-07T21:13:27Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size required for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z) - Can AI-Generated Text be Reliably Detected? [50.95804851595018]
Large Language Models (LLMs) perform impressively well in various applications. The potential for misuse of these models in activities such as plagiarism, generating fake news, and spamming has raised concerns about their responsible use. We stress-test the robustness of existing AI text detectors in the presence of an attacker.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)