Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection
- URL: http://arxiv.org/abs/2412.20595v1
- Date: Sun, 29 Dec 2024 21:54:39 GMT
- Title: Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection
- Authors: Dmitri Roussinov, Serge Sharoff, Nadezhda Puchnina
- Abstract summary: This study demonstrates that the modern generation of Large Language Models (LLMs) suffers from the same out-of-domain (OOD) performance gap observed in prior research on pre-trained Language Models (PLMs).
We introduce a method that controls which predictive indicators are used and which are excluded during classification.
This approach reduces the OOD gap by up to 20 percentage points in a few-shot setup.
- Score: 0.20482269513546458
- Abstract: This study demonstrates that the modern generation of Large Language Models (LLMs, such as GPT-4) suffers from the same out-of-domain (OOD) performance gap observed in prior research on pre-trained Language Models (PLMs, such as BERT). We demonstrate this across two non-topical classification tasks: 1) genre classification and 2) generated text detection. Our results show that when demonstration examples for In-Context Learning (ICL) come from one domain (e.g., travel) and the system is tested on another domain (e.g., history), classification performance declines significantly. To address this, we introduce a method that controls which predictive indicators are used and which are excluded during classification. For the two tasks studied here, this ensures that topical features are omitted, while the model is guided to focus on stylistic rather than content-based attributes. This approach reduces the OOD gap by up to 20 percentage points in a few-shot setup. Straightforward Chain-of-Thought (CoT) methods, used as the baseline, prove insufficient, while our approach consistently enhances domain transfer performance.
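The method described in the abstract operates at the prompt level: the few-shot instruction tells the model which indicators it may use (stylistic) and which it must exclude (topical). The sketch below is an illustration only, not the authors' implementation; the prompt wording, the `classify` helper, the genre labels, and the use of an OpenAI-style chat API are assumptions made for the example. It also shows how the OOD gap could be measured as the accuracy difference between in-domain and out-of-domain demonstrations.

```python
# Minimal sketch (not the paper's code): few-shot genre classification via ICL
# with an instruction that excludes topical cues, plus an OOD-gap measurement.
# Model name, prompt wording, and labels are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

GENRES = ["news", "fiction", "instruction", "review"]

SYSTEM = (
    "Classify the genre of the text using ONLY stylistic evidence "
    "(sentence length, register, use of imperatives, first/second person, "
    "narrative vs. expository structure). Ignore the topic and any named "
    "entities. Answer with exactly one label from: " + ", ".join(GENRES) + "."
)

def classify(text, demos, model="gpt-4o-mini"):
    """Few-shot ICL call; `demos` is a list of (text, genre) pairs drawn from one domain."""
    messages = [{"role": "system", "content": SYSTEM}]
    for demo_text, genre in demos:
        messages.append({"role": "user", "content": demo_text})
        messages.append({"role": "assistant", "content": genre})
    messages.append({"role": "user", "content": text})
    reply = client.chat.completions.create(model=model, messages=messages, temperature=0)
    return reply.choices[0].message.content.strip().lower()

def accuracy(test_set, demos):
    """Accuracy of the few-shot classifier on (text, gold_genre) pairs."""
    hits = sum(classify(text, demos) == gold for text, gold in test_set)
    return hits / len(test_set)

# OOD gap = accuracy with in-domain demonstrations minus accuracy with
# out-of-domain demonstrations, on the same test items (e.g., demos from
# "travel" texts while testing on "history" texts):
# ood_gap = accuracy(history_test, history_demos) - accuracy(history_test, travel_demos)
```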
Related papers
- How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation? [23.454153602068786]
We evaluate the utility of Continued Pre-Training (CPT) for generative UDA.
Our findings suggest that the model implicitly learns the downstream task while predicting masked words informative to that task.
arXiv Detail & Related papers (2024-01-31T00:15:34Z)
- ATTA: Anomaly-aware Test-Time Adaptation for Out-of-Distribution Detection in Segmentation [22.084967085509387]
We propose a dual-level OOD detection framework to handle domain shift and semantic shift jointly.
The first level distinguishes whether domain shift exists in the image by leveraging global low-level features.
The second level identifies pixels with semantic shift by utilizing dense high-level feature maps.
arXiv Detail & Related papers (2023-09-12T06:49:56Z)
- Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling [19.683714649646603]
Weakly-supervised action localization aims to recognize and localize action instances in untrimmed videos with only video-level labels.
Most existing models rely on multiple instance learning (MIL), where predictions of unlabeled instances are supervised by classifying labeled bags.
We propose a novel attention-based hierarchically-structured latent model to learn the temporal variations of feature semantics.
arXiv Detail & Related papers (2023-08-19T08:45:49Z)
- Rationale-Guided Few-Shot Classification to Detect Abusive Language [5.977278650516324]
We propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language detection.
We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets.
arXiv Detail & Related papers (2022-11-30T14:47:14Z)
- Few-Shot Classification in Unseen Domains by Episodic Meta-Learning Across Visual Domains [36.98387822136687]
Few-shot classification aims to carry out classification given only a few labeled examples for the categories of interest.
In this paper, we present a unique learning framework for domain-generalized few-shot classification.
By advancing meta-learning strategies, our learning framework exploits data across multiple source domains to capture domain-invariant features.
arXiv Detail & Related papers (2021-12-27T06:54:11Z)
- Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
- MCDAL: Maximum Classifier Discrepancy for Active Learning [74.73133545019877]
Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GAN) for sample acquisition.
In this paper, we propose a novel active learning framework that we call Maximum Classifier Discrepancy for Active Learning (MCDAL).
In particular, we utilize two auxiliary classification layers that learn tighter decision boundaries by maximizing the discrepancies among them.
arXiv Detail & Related papers (2021-07-23T06:57:08Z)
- Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
- Category Contrast for Unsupervised Domain Adaptation in Visual Tasks [92.9990560760593]
We propose a novel Category Contrast technique (CaCo) that introduces semantic priors on top of instance discrimination for visual UDA tasks.
CaCo is complementary to existing UDA methods and generalizable to other learning setups such as semi-supervised learning, unsupervised model adaptation, etc.
arXiv Detail & Related papers (2021-06-05T12:51:35Z)
- Class-Incremental Domain Adaptation [56.72064953133832]
We introduce a practical Domain Adaptation (DA) paradigm called Class-Incremental Domain Adaptation (CIDA).
Existing DA methods tackle domain-shift but are unsuitable for learning novel target-domain classes.
Our approach yields superior performance as compared to both DA and CI methods in the CIDA paradigm.
arXiv Detail & Related papers (2020-08-04T07:55:03Z)
- Cross-domain Detection via Graph-induced Prototype Alignment [114.8952035552862]
We propose a Graph-induced Prototype Alignment (GPA) framework to seek category-level domain alignment.
In addition, in order to alleviate the negative effect of class-imbalance on domain adaptation, we design a Class-reweighted Contrastive Loss.
Our approach outperforms existing methods by a remarkable margin.
arXiv Detail & Related papers (2020-03-28T17:46:55Z)