ConDA: Contrastive Domain Adaptation for AI-generated Text Detection
- URL: http://arxiv.org/abs/2309.03992v2
- Date: Wed, 20 Sep 2023 22:17:30 GMT
- Title: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection
- Authors: Amrita Bhattacharjee, Tharindu Kumarage, Raha Moraffah, Huan Liu
- Abstract summary: Large language models (LLMs) are increasingly being used for generating text in news articles.
Given the potential malicious nature in which these LLMs can be used to generate disinformation at scale, it is important to build effective detectors for such AI-generated text.
In this work we tackle this data problem, in detecting AI-generated news text, and frame the problem as an unsupervised domain adaptation task.
- Score: 17.8787054992985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are increasingly being used for generating text
in a variety of use cases, including journalistic news articles. Given the
potential malicious nature in which these LLMs can be used to generate
disinformation at scale, it is important to build effective detectors for such
AI-generated text. Given the surge in development of new LLMs, acquiring
labeled training data for supervised detectors is a bottleneck. However, there
might be plenty of unlabeled text data available, without information on which
generator it came from. In this work we tackle this data problem, in detecting
AI-generated news text, and frame the problem as an unsupervised domain
adaptation task. Here the domains are the different text generators, i.e. LLMs,
and we assume we have access to only the labeled source data and unlabeled
target data. We develop a Contrastive Domain Adaptation framework, called
ConDA, that blends standard domain adaptation techniques with the
representation power of contrastive learning to learn domain invariant
representations that are effective for the final unsupervised detection task.
Our experiments demonstrate the effectiveness of our framework, resulting in
average performance gains of 31.7% from the best performing baselines, and
within 0.8% margin of a fully supervised detector. All our code and data is
available at https://github.com/AmritaBh/ConDA-gen-text-detection.
Related papers
- DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperformed in this task.
Our development of DetectRL reveals the strengths and limitations of current SOTA detectors.
We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z) - Robust AI-Generated Text Detection by Restricted Embeddings [6.745955674138081]
We focus on robustness of detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains.
We show that clearing out harmful linear subspaces helps to train a robust classifier, ignoring domain-specific spurious features.
Our best approaches for head-wise and coordinate-based subspace removal increase the mean out-of-distribution (OOD) classification score by up to 9% and 14% in particular.
arXiv Detail & Related papers (2024-10-10T16:58:42Z) - EAGLE: A Domain Generalization Framework for AI-generated Text Detection [15.254775341371364]
We propose a domain generalization framework for the detection of AI-generated text from unseen target generators.
We demonstrate how our framework effectively achieves impressive performance in detecting text generated by unseen target generators.
arXiv Detail & Related papers (2024-03-23T02:44:20Z) - DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of
Machine-Generated Text [26.02072055825044]
We introduce two novel zero-shot methods for detecting machine-generated text by leveraging the log rank information.
One is called DetectLLM-LRR, which is fast and efficient, and the other is called DetectLLM-NPR, which is more accurate, but slower due to the need for perturbations.
Our experiments on three datasets and seven language models show that our proposed methods improve over the state of the art by 3.9 and 1.75 AUROC points absolute.
arXiv Detail & Related papers (2023-05-23T11:18:30Z) - MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z) - Paraphrasing evades detectors of AI-generated text, but retrieval is an
effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
arXiv Detail & Related papers (2023-03-23T16:29:27Z) - Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z) - Instance Relation Graph Guided Source-Free Domain Adaptive Object
Detection [79.89082006155135]
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift.
UDA methods try to align the source and target representations to improve the generalization on the target domain.
The Source-Free Adaptation Domain (SFDA) setting aims to alleviate these concerns by adapting a source-trained model for the target domain without requiring access to the source data.
arXiv Detail & Related papers (2022-03-29T17:50:43Z) - A Free Lunch for Unsupervised Domain Adaptive Object Detection without
Source Data [69.091485888121]
Unsupervised domain adaptation assumes that source and target domain data are freely available and usually trained together to reduce the domain gap.
We propose a source data-free domain adaptive object detection (SFOD) framework via modeling it into a problem of learning with noisy labels.
arXiv Detail & Related papers (2020-12-10T01:42:35Z) - Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG)
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models as well as verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.