Related papers: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection

ConDA: Contrastive Domain Adaptation for AI-generated Text Detection

URL: http://arxiv.org/abs/2309.03992v2
Date: Wed, 20 Sep 2023 22:17:30 GMT
Title: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection
Authors: Amrita Bhattacharjee, Tharindu Kumarage, Raha Moraffah, Huan Liu
Abstract summary: Large language models (LLMs) are increasingly being used for generating text in news articles. Given the potential malicious nature in which these LLMs can be used to generate disinformation at scale, it is important to build effective detectors for such AI-generated text. In this work we tackle this data problem, in detecting AI-generated news text, and frame the problem as an unsupervised domain adaptation task.
Score: 17.8787054992985
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are increasingly being used for generating text in a variety of use cases, including journalistic news articles. Given the potential malicious nature in which these LLMs can be used to generate disinformation at scale, it is important to build effective detectors for such AI-generated text. Given the surge in development of new LLMs, acquiring labeled training data for supervised detectors is a bottleneck. However, there might be plenty of unlabeled text data available, without information on which generator it came from. In this work we tackle this data problem, in detecting AI-generated news text, and frame the problem as an unsupervised domain adaptation task. Here the domains are the different text generators, i.e. LLMs, and we assume we have access to only the labeled source data and unlabeled target data. We develop a Contrastive Domain Adaptation framework, called ConDA, that blends standard domain adaptation techniques with the representation power of contrastive learning to learn domain invariant representations that are effective for the final unsupervised detection task. Our experiments demonstrate the effectiveness of our framework, resulting in average performance gains of 31.7% from the best performing baselines, and within 0.8% margin of a fully supervised detector. All our code and data is available at https://github.com/AmritaBh/ConDA-gen-text-detection.

Related papers

DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperformed in this task. Our development of DetectRL reveals the strengths and limitations of current SOTA detectors. We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z)
Robust AI-Generated Text Detection by Restricted Embeddings [6.745955674138081]
We focus on robustness of detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains. We show that clearing out harmful linear subspaces helps to train a robust classifier, ignoring domain-specific spurious features. Our best approaches for head-wise and coordinate-based subspace removal increase the mean out-of-distribution (OOD) classification score by up to 9% and 14% in particular.
arXiv Detail & Related papers (2024-10-10T16:58:42Z)
ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination [1.8418334324753884]
This paper introduces back-translation as a novel technique for evading detection. We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text. We evaluate this technique on nine AI detectors, including six open-source and three proprietary systems.
arXiv Detail & Related papers (2024-09-22T01:13:22Z)
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore [51.65730053591696]
We propose a simple yet effective black-box zero-shot detection approach based on the observation that human-written texts typically contain more grammatical errors than LLM-generated texts. Experimental results show that our method outperforms current state-of-the-art (SOTA) zero-shot and supervised methods.
arXiv Detail & Related papers (2024-05-07T12:57:01Z)
EAGLE: A Domain Generalization Framework for AI-generated Text Detection [15.254775341371364]
We propose a domain generalization framework for the detection of AI-generated text from unseen target generators. We demonstrate how our framework effectively achieves impressive performance in detecting text generated by unseen target generators.
arXiv Detail & Related papers (2024-03-23T02:44:20Z)
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text [26.02072055825044]
We introduce two novel zero-shot methods for detecting machine-generated text by leveraging the log rank information. One is called DetectLLM-LRR, which is fast and efficient, and the other is called DetectLLM-NPR, which is more accurate, but slower due to the need for perturbations. Our experiments on three datasets and seven language models show that our proposed methods improve over the state of the art by 3.9 and 1.75 AUROC points absolute.
arXiv Detail & Related papers (2023-05-23T11:18:30Z)
MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection. We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs. Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering. Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking. We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
arXiv Detail & Related papers (2023-03-23T16:29:27Z)
Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc. Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques. In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection [79.89082006155135]
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift. UDA methods try to align the source and target representations to improve the generalization on the target domain. The Source-Free Adaptation Domain (SFDA) setting aims to alleviate these concerns by adapting a source-trained model for the target domain without requiring access to the source data.
arXiv Detail & Related papers (2022-03-29T17:50:43Z)
A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data [69.091485888121]
Unsupervised domain adaptation assumes that source and target domain data are freely available and usually trained together to reduce the domain gap. We propose a source data-free domain adaptive object detection (SFOD) framework via modeling it into a problem of learning with noisy labels.
arXiv Detail & Related papers (2020-12-10T01:42:35Z)
Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG) It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains. Our framework outperforms all baseline models as well as verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.