Related papers: Few-shot Text Classification with Dual Contrastive Consistency

Few-shot Text Classification with Dual Contrastive Consistency

URL: http://arxiv.org/abs/2209.15069v1
Date: Thu, 29 Sep 2022 19:26:23 GMT
Title: Few-shot Text Classification with Dual Contrastive Consistency
Authors: Liwen Sun, Jiawei Han
Abstract summary: In this paper, we explore how to utilize pre-trained language model to perform few-shot text classification. We adopt supervised contrastive learning on few labeled data and consistency-regularization on vast unlabeled data.
Score: 31.141350717029358
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we explore how to utilize pre-trained language model to perform few-shot text classification where only a few annotated examples are given for each class. Since using traditional cross-entropy loss to fine-tune language model under this scenario causes serious overfitting and leads to sub-optimal generalization of model, we adopt supervised contrastive learning on few labeled data and consistency-regularization on vast unlabeled data. Moreover, we propose a novel contrastive consistency to further boost model performance and refine sentence representation. After conducting extensive experiments on four datasets, we demonstrate that our model (FTCC) can outperform state-of-the-art methods and has better robustness.

Related papers

EpiCoDe: Boosting Model Performance Beyond Training with Extrapolation and Contrastive Decoding [50.29046178980637]
EpiCoDe is a method that boosts model performance in data-scarcity scenarios without extra training.<n>We show that EpiCoDe consistently outperforms existing methods with significant and robust improvement.
arXiv Detail & Related papers (2025-06-04T02:11:54Z)
Ensembling Finetuned Language Models for Text Classification [55.15643209328513]
Finetuning is a common practice across different communities to adapt pretrained models to particular tasks. ensembles of neural networks are typically used to boost performance and provide reliable uncertainty estimates. We present a metadataset with predictions from five large finetuned models on six datasets and report results of different ensembling strategies.
arXiv Detail & Related papers (2024-10-25T09:15:54Z)
Beyond Coarse-Grained Matching in Video-Text Retrieval [50.799697216533914]
We introduce a new approach for fine-grained evaluation. Our approach can be applied to existing datasets by automatically generating hard negative test captions. Experiments on our fine-grained evaluations demonstrate that this approach enhances a model's ability to understand fine-grained differences.
arXiv Detail & Related papers (2024-10-16T09:42:29Z)
Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images. We identify model weaknesses by testing the model using the counterfactual image dataset. We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification [11.072083437769093]
We propose a novel model named SharpReCL for imbalanced text classification tasks. Our model even outperforms popular large language models across several datasets.
arXiv Detail & Related papers (2024-05-19T11:33:49Z)
Diversified in-domain synthesis with efficient fine-tuning for few-shot classification [64.86872227580866]
Few-shot image classification aims to learn an image classifier using only a small set of labeled examples per class. We propose DISEF, a novel approach which addresses the generalization challenge in few-shot learning using synthetic data. We validate our method in ten different benchmarks, consistently outperforming baselines and establishing a new state-of-the-art for few-shot classification.
arXiv Detail & Related papers (2023-12-05T17:18:09Z)
Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning [19.7066703371736]
We propose a novel short text topic modeling framework, Topic-Semantic Contrastive Topic Model (TSCTM) Our TSCTM outperforms state-of-the-art baselines regardless of the data augmentation availability, producing high-quality topics and topic distributions.
arXiv Detail & Related papers (2022-11-23T11:33:43Z)
Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information, they are proven useful for few-shot learning of language model. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time. We show that adaptation on the scale of one to five examples is possible. Our model reports better accuracy scores than a reference system trained with on average 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z)
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning [23.00300794016583]
State-of-the-art natural language understanding classification models follow two-stages. We propose a supervised contrastive learning (SCL) objective for the fine-tuning stage. Our proposed fine-tuning objective leads to models that are more robust to different levels of noise in the fine-tuning training data.
arXiv Detail & Related papers (2020-11-03T01:10:39Z)
Evaluating Text Coherence at Sentence and Paragraph Levels [17.99797111176988]
We investigate the adaptation of existing sentence ordering methods to a paragraph ordering task. We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets. We conclude that the recurrent graph neural network-based model is an optimal choice for coherence modeling.
arXiv Detail & Related papers (2020-06-05T03:31:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.