ConDA: Contrastive Domain Adaptation for AI-generated Text Detection
- URL: http://arxiv.org/abs/2309.03992v2
- Date: Wed, 20 Sep 2023 22:17:30 GMT
- Title: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection
- Authors: Amrita Bhattacharjee, Tharindu Kumarage, Raha Moraffah, Huan Liu
- Abstract summary: Large language models (LLMs) are increasingly being used for generating text in news articles.
Given the potential malicious nature in which these LLMs can be used to generate disinformation at scale, it is important to build effective detectors for such AI-generated text.
In this work we tackle this data problem, in detecting AI-generated news text, and frame the problem as an unsupervised domain adaptation task.
- Score: 17.8787054992985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are increasingly being used for generating text
in a variety of use cases, including journalistic news articles. Given the
potential malicious nature in which these LLMs can be used to generate
disinformation at scale, it is important to build effective detectors for such
AI-generated text. Given the surge in development of new LLMs, acquiring
labeled training data for supervised detectors is a bottleneck. However, there
might be plenty of unlabeled text data available, without information on which
generator it came from. In this work we tackle this data problem, in detecting
AI-generated news text, and frame the problem as an unsupervised domain
adaptation task. Here the domains are the different text generators, i.e. LLMs,
and we assume we have access to only the labeled source data and unlabeled
target data. We develop a Contrastive Domain Adaptation framework, called
ConDA, that blends standard domain adaptation techniques with the
representation power of contrastive learning to learn domain invariant
representations that are effective for the final unsupervised detection task.
Our experiments demonstrate the effectiveness of our framework, resulting in
average performance gains of 31.7% from the best performing baselines, and
within a 0.8% margin of a fully supervised detector. All our code and data are
available at https://github.com/AmritaBh/ConDA-gen-text-detection.
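The abstract describes combining a domain adaptation objective with contrastive learning but does not name the specific losses. As a rough illustration only, the sketch below assumes an InfoNCE-style contrastive term over (text, perturbed text) embedding pairs and a Gaussian-kernel maximum mean discrepancy (MMD) term between source and target embeddings. All function names, the temperature, the kernel width, and the weight `lam` are hypothetical and are not taken from the ConDA code; the supervised classification loss on labeled source data is omitted.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def _cosine(a: Vector, b: Vector) -> float:
    # Small epsilon guards against division by zero for all-zero vectors.
    return _dot(a, b) / (math.sqrt(_dot(a, a)) * math.sqrt(_dot(b, b)) + 1e-12)


def contrastive_loss(anchors: Sequence[Vector],
                     positives: Sequence[Vector],
                     temperature: float = 0.5) -> float:
    """InfoNCE-style loss: anchor i should be most similar to positive i,
    with the other positives in the batch acting as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [_cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def _rbf(a: Vector, b: Vector, gamma: float) -> float:
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def mmd_squared(source: Sequence[Vector],
                target: Sequence[Vector],
                gamma: float = 1.0) -> float:
    """Biased estimate of squared MMD between two embedding sets (RBF kernel)."""
    def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector]) -> float:
        return sum(_rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def combined_loss(src: Sequence[Vector], src_aug: Sequence[Vector],
                  tgt: Sequence[Vector], tgt_aug: Sequence[Vector],
                  lam: float = 1.0) -> float:
    """Assumed combination: contrastive terms per domain plus a weighted
    discrepancy term that pulls source and target embeddings together."""
    return (contrastive_loss(src, src_aug)
            + contrastive_loss(tgt, tgt_aug)
            + lam * mmd_squared(src, tgt))
```

In this sketch the contrastive terms need no labels, so they can be computed on the unlabeled target domain as well as on the source domain, while the MMD term penalizes embeddings that differ in distribution between the two generators.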
Related papers
- Detecting LLM-Generated Text with Performance Guarantees [13.29284903739996]
Large language models (LLMs) such as GPT, Claude, Gemini, and Grok have been deeply integrated into our daily life.
They now support a wide range of tasks -- from dialogue and email drafting to assisting with teaching and coding.
Their ability to produce highly human-like text raises serious concerns, including the spread of fake news.
arXiv Detail & Related papers (2026-01-10T14:52:45Z) - Sure! Here's a short and concise title for your paper: "Contamination in Generated Text Detection Benchmarks" [6.898843708099658]
Large language models are increasingly used for many applications.
To prevent illicit use, it is desirable to be able to detect AI-generated text.
Training and evaluation of such detectors critically depend on suitable benchmark datasets.
arXiv Detail & Related papers (2025-11-12T11:02:39Z) - Diversity Boosts AI-Generated Text Detection [51.56484100374058]
DivEye is a novel framework that captures how unpredictability fluctuates across a text using surprisal-based features.
Our method outperforms existing zero-shot detectors by up to 33.2% and achieves competitive performance with fine-tuned baselines.
arXiv Detail & Related papers (2025-09-23T10:21:22Z) - DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models [60.713908578319256]
We propose Direct Discrepancy Learning (DDL) to optimize the detector with task-oriented knowledge.
Built upon this, we introduce DetectAnyLLM, a unified detection framework that achieves state-of-the-art MGTD performance.
MIRAGE samples human-written texts from 10 corpora across 5 text-domains, which are then re-generated or revised using 17 cutting-edge LLMs.
arXiv Detail & Related papers (2025-09-15T10:59:57Z) - RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns [50.401907401444404]
Detecting content generated by large language models (LLMs) is crucial for preventing misuse and building trustworthy AI systems.
We propose RepreGuard, an efficient statistics-based detection method.
Experimental results show that RepreGuard outperforms all baselines with an average of 94.92% AUROC in both in-distribution (ID) and OOD scenarios.
arXiv Detail & Related papers (2025-08-18T17:59:15Z) - Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection [58.419940585826744]
We introduce FairOPT, an algorithm for group-specific threshold optimization for probabilistic AI-text detectors.
We partitioned data into subgroups based on attributes (e.g., text length and writing style) and implemented FairOPT to learn decision thresholds for each group to reduce discrepancy.
Our framework paves the way for more robust classification in AI-generated content detection via post-processing.
arXiv Detail & Related papers (2025-02-06T21:58:48Z) - DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperformed in this task.
Our development of DetectRL reveals the strengths and limitations of current SOTA detectors.
We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z) - Robust AI-Generated Text Detection by Restricted Embeddings [6.745955674138081]
We focus on robustness of detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains.
We show that clearing out harmful linear subspaces helps to train a robust classifier, ignoring domain-specific spurious features.
Our best approaches for head-wise and coordinate-based subspace removal increase the mean out-of-distribution (OOD) classification score by up to 9% and 14% in particular.
arXiv Detail & Related papers (2024-10-10T16:58:42Z) - ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination [1.8418334324753884]
This paper introduces back-translation as a novel technique for evading detection.
We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text.
We evaluate this technique on nine AI detectors, including six open-source and three proprietary systems.
arXiv Detail & Related papers (2024-09-22T01:13:22Z) - Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore [51.65730053591696]
We propose a simple yet effective black-box zero-shot detection approach based on the observation that human-written texts typically contain more grammatical errors than LLM-generated texts.
Experimental results show that our method outperforms current state-of-the-art (SOTA) zero-shot and supervised methods.
arXiv Detail & Related papers (2024-05-07T12:57:01Z) - EAGLE: A Domain Generalization Framework for AI-generated Text Detection [15.254775341371364]
We propose a domain generalization framework for the detection of AI-generated text from unseen target generators.
We demonstrate how our framework effectively achieves impressive performance in detecting text generated by unseen target generators.
arXiv Detail & Related papers (2024-03-23T02:44:20Z) - DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text [26.02072055825044]
We introduce two novel zero-shot methods for detecting machine-generated text by leveraging the log rank information.
One is called DetectLLM-LRR, which is fast and efficient, and the other is called DetectLLM-NPR, which is more accurate, but slower due to the need for perturbations.
Our experiments on three datasets and seven language models show that our proposed methods improve over the state of the art by 3.9 and 1.75 AUROC points absolute.
arXiv Detail & Related papers (2023-05-23T11:18:30Z) - MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% of out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z) - Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
arXiv Detail & Related papers (2023-03-23T16:29:27Z) - Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z) - Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection [79.89082006155135]
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift.
UDA methods try to align the source and target representations to improve the generalization on the target domain.
The Source-Free Domain Adaptation (SFDA) setting aims to alleviate these concerns by adapting a source-trained model for the target domain without requiring access to the source data.
arXiv Detail & Related papers (2022-03-29T17:50:43Z) - A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data [69.091485888121]
Unsupervised domain adaptation assumes that source and target domain data are freely available and usually trained together to reduce the domain gap.
We propose a source data-free domain adaptive object detection (SFOD) framework via modeling it into a problem of learning with noisy labels.
arXiv Detail & Related papers (2020-12-10T01:42:35Z) - Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.