ConEntail: An Entailment-based Framework for Universal Zero and Few Shot
Classification with Supervised Contrastive Pretraining
- URL: http://arxiv.org/abs/2210.07587v1
- Date: Fri, 14 Oct 2022 07:37:27 GMT
- Title: ConEntail: An Entailment-based Framework for Universal Zero and Few Shot
Classification with Supervised Contrastive Pretraining
- Authors: Haoran Zhang, Aysa Xuemo Fan and Rui Zhang
- Abstract summary: ConEntail is a framework for universal zero and few shot classification with supervised contrastive pretraining.
In experiments, we compare our model with discriminative and generative models pretrained on the same dataset.
- Score: 20.898477720723573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A universal classification model aims to generalize to diverse classification
tasks in both zero and few shot settings. A promising way toward universal
classification is to cast heterogeneous data formats into a dataset-agnostic
"meta-task" (e.g., textual entailment, question answering) then pretrain a
model on the combined meta dataset. Existing work is either pretrained on
specific subsets of classification tasks, or pretrained on both classification
and generation data, in which case the model cannot fulfill its potential in
universality and reliability. These approaches also leave a massive amount of annotated
data under-exploited. To fill these gaps, we propose ConEntail, a new framework
for universal zero and few shot classification with supervised contrastive
pretraining. Our unified meta-task for classification is based on nested
entailment. It can be interpreted as "Does sentence a entail [sentence b
entails label c]". This formulation enables us to make better use of 57
annotated classification datasets for supervised contrastive pretraining and
universal evaluation. In this way, ConEntail helps the model (1) absorb
knowledge from different datasets, and (2) gain consistent performance improvements
with more pretraining data. In experiments, we compare our model with
discriminative and generative models pretrained on the same dataset. The
results confirm that our framework effectively exploits existing annotated data
and consistently outperforms baselines in both zero (9.4% average improvement)
and few shot settings (3.5% average improvement).
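To make the nested-entailment formulation concrete, the sketch below recasts a small labeled set into (premise, hypothesis) pairs of the form "query sentence" vs. "support sentence entails label", marking a pair as entailed when the two examples share a label. The verbalization template and function names are illustrative assumptions, not the paper's exact prompt or pairing scheme.

```python
# Hypothetical illustration of a nested-entailment meta-task:
# a query sentence is paired with the statement "<support sentence> entails <label>",
# and the pair is positive when both examples carry the same label.
from typing import List, Tuple


def nested_entailment_pair(query: str, support: str,
                           support_label: str, query_label: str) -> Tuple[str, str, int]:
    """Build one (premise, hypothesis, is_entailed) triple."""
    premise = query
    hypothesis = f"{support} entails {support_label}"
    is_entailed = int(query_label == support_label)
    return premise, hypothesis, is_entailed


def build_pretraining_pairs(examples: List[Tuple[str, str]]) -> List[Tuple[str, str, int]]:
    """Pair every example with every other example in a (tiny) labeled set."""
    pairs = []
    for i, (query, q_label) in enumerate(examples):
        for j, (support, s_label) in enumerate(examples):
            if i != j:
                pairs.append(nested_entailment_pair(query, support, s_label, q_label))
    return pairs


if __name__ == "__main__":
    toy = [("the movie was wonderful", "positive"),
           ("a tedious, joyless film", "negative"),
           ("an instant classic", "positive")]
    for premise, hypothesis, y in build_pretraining_pairs(toy):
        print(y, "|", premise, "=>", hypothesis)
```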
Related papers
- Fair Classification with Partial Feedback: An Exploration-Based Data Collection Approach [15.008626822593]
In many predictive contexts, true outcomes are only observed for samples that were positively classified in the past.
We present an approach that trains a classifier using available data and comes with a family of exploration strategies.
We show that this approach consistently boosts the quality of collected outcome data and improves the fraction of true positives for all groups.
arXiv Detail & Related papers (2024-02-17T17:09:19Z)
- Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning.
We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL).
Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes in generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z)
- Universal Semi-supervised Model Adaptation via Collaborative Consistency Training [92.52892510093037]
We introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA).
We propose a collaborative consistency training framework that regularizes the prediction consistency between two models.
Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
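As a rough illustration of regularizing prediction consistency between two models, the snippet below adds a symmetric KL term between two classifiers' softmax outputs to a standard supervised loss; this is a generic consistency-training sketch, not the USMA objective from the paper.

```python
# Generic consistency-regularization sketch (not the exact USMA objective):
# penalize disagreement between two classifiers' predictive distributions.
import torch
import torch.nn.functional as F


def consistency_loss(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Symmetric KL divergence between two models' softmax outputs."""
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    p_a, p_b = log_p_a.exp(), log_p_b.exp()
    kl_ab = F.kl_div(log_p_b, p_a, reduction="batchmean")  # KL(p_a || p_b)
    kl_ba = F.kl_div(log_p_a, p_b, reduction="batchmean")  # KL(p_b || p_a)
    return 0.5 * (kl_ab + kl_ba)


if __name__ == "__main__":
    logits_a = torch.randn(8, 5)   # model A on a batch of 8 examples, 5 classes
    logits_b = torch.randn(8, 5)   # model B on the same batch
    labels = torch.randint(0, 5, (8,))
    loss = F.cross_entropy(logits_a, labels) + 1.0 * consistency_loss(logits_a, logits_b)
    print(loss.item())
```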
arXiv Detail & Related papers (2023-07-07T08:19:40Z)
- Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task [0.0]
Genre identification is a subclass of non-topical text classification.
Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks.
arXiv Detail & Related papers (2022-06-15T09:59:05Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results in in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Cluster & Tune: Boost Cold Start Performance in Text Classification [21.957605438780224]
In real-world scenarios, a text classification task often begins with a cold start, when labeled data is scarce.
We suggest a method to boost the performance of such models by adding an intermediate unsupervised classification task.
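One way to realize such an intermediate unsupervised classification task, sketched below under the assumption that clustering supplies the intermediate labels, is to cluster the unlabeled corpus and use the cluster IDs as training targets before the final fine-tuning; the feature and clustering choices here are illustrative, not the authors' pipeline.

```python
# Hedged sketch of an intermediate unsupervised classification task (illustrative,
# not the Cluster & Tune pipeline): cluster unlabeled texts and treat cluster IDs
# as labels for an intermediate training stage.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "refund my order please",
    "the package never arrived",
    "how do I reset my password",
    "login fails with an error",
]

# Step 1: unsupervised clustering of the unlabeled corpus.
features = TfidfVectorizer().fit_transform(texts)
cluster_ids = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# Step 2 (not shown): fine-tune the encoder to predict cluster_ids as an
# intermediate task, then fine-tune on the small labeled target task.
print(list(zip(texts, cluster_ids)))
```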
arXiv Detail & Related papers (2022-03-20T15:29:34Z)
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias [10.639605996067534]
Contextual information is a valuable cue for Deep Neural Networks (DNNs) to learn better representations and improve accuracy.
In COCO, many object categories have a much higher co-occurrence with men compared to women, which can bias a DNN's prediction in favor of men.
We introduce a data repair algorithm using the coefficient of variation, which can curate fair and contextually balanced data for a protected class.
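For reference, the coefficient of variation named above is simply the standard deviation of co-occurrence counts divided by their mean; the snippet below computes only this statistic on hypothetical counts and is not the paper's repair algorithm.

```python
# Generic coefficient of variation (std / mean) over co-occurrence counts,
# e.g. how often an object category appears with each protected-attribute value.
# A value near 0 suggests balanced co-occurrence; larger values suggest skew.
import numpy as np


def coefficient_of_variation(counts) -> float:
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    return float(counts.std() / mean) if mean > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical counts: images of one category co-occurring with {men, women}.
    print(coefficient_of_variation([480, 120]))  # skewed -> 0.6
    print(coefficient_of_variation([300, 300]))  # balanced -> 0.0
```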
arXiv Detail & Related papers (2021-10-20T06:00:03Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
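A minimal sketch of the general idea of calibrating a classifier with virtual representations, assuming per-class Gaussians estimated from feature vectors and a simple logistic-regression head, is shown below; it is not the authors' CCVR implementation.

```python
# Sketch of classifier calibration with "virtual" features (illustrative only):
# fit a Gaussian per class over real feature vectors, sample virtual features,
# and refit a lightweight classifier head on the sampled features.
import numpy as np
from sklearn.linear_model import LogisticRegression


def calibrate_head(features: np.ndarray, labels: np.ndarray,
                   n_virtual_per_class: int = 200, seed: int = 0) -> LogisticRegression:
    rng = np.random.default_rng(seed)
    virtual_x, virtual_y = [], []
    for c in np.unique(labels):
        class_feats = features[labels == c]
        mean = class_feats.mean(axis=0)
        cov = np.cov(class_feats, rowvar=False) + 1e-4 * np.eye(features.shape[1])
        virtual_x.append(rng.multivariate_normal(mean, cov, size=n_virtual_per_class))
        virtual_y.append(np.full(n_virtual_per_class, c))
    head = LogisticRegression(max_iter=1000)
    head.fit(np.vstack(virtual_x), np.concatenate(virtual_y))
    return head


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(2, 1, (50, 16))])
    labs = np.array([0] * 50 + [1] * 50)
    print(calibrate_head(feats, labs).score(feats, labs))
```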
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Set-valued classification -- overview via a unified framework [15.109906768606644]
Multi-class datasets can be extremely ambiguous and single-output predictions fail to deliver satisfactory performance.
By allowing predictors to predict a set of label candidates, set-valued classification offers a natural way to deal with this ambiguity.
We provide infinite-sample optimal set-valued classification strategies and review a general plug-in principle to construct data-driven algorithms.
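The plug-in principle can be illustrated by thresholding estimated class probabilities to form the predicted set; the threshold below is an arbitrary illustrative choice rather than an optimal rule from the survey.

```python
# Illustrative plug-in set-valued predictor: return every class whose estimated
# posterior probability clears a threshold, falling back to the argmax so the
# predicted set is never empty.
import numpy as np


def set_valued_predict(probs: np.ndarray, threshold: float = 0.3) -> list:
    """probs: (n_samples, n_classes) rows of estimated class probabilities."""
    predictions = []
    for row in probs:
        candidates = np.flatnonzero(row >= threshold)
        if candidates.size == 0:
            candidates = np.array([row.argmax()])
        predictions.append(candidates.tolist())
    return predictions


if __name__ == "__main__":
    probs = np.array([[0.70, 0.20, 0.10],   # confident -> singleton set
                      [0.40, 0.35, 0.25]])  # ambiguous -> larger set
    print(set_valued_predict(probs))        # [[0], [0, 1]]
```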
arXiv Detail & Related papers (2021-02-24T14:54:07Z)
- Semi-Supervised Models via Data Augmentation for Classifying Interactive Affective Responses [85.04362095899656]
We present semi-supervised models with data augmentation (SMDA), a semi-supervised text classification system to classify interactive affective responses.
For labeled sentences, we performed data augmentation to make the label distributions uniform and computed a supervised loss during training.
For unlabeled sentences, we explored self-training by regarding low-entropy predictions over unlabeled sentences as pseudo labels.
arXiv Detail & Related papers (2020-04-23T05:02:31Z)
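The SMDA entry above mentions keeping low-entropy predictions on unlabeled sentences as pseudo labels; the snippet below is a minimal version of that selection step, with an arbitrary entropy threshold rather than the paper's setting.

```python
# Minimal pseudo-label selection for self-training (illustrative threshold):
# keep an unlabeled example only if the entropy of its predicted distribution
# is low, and use the argmax class as its pseudo label.
import numpy as np


def select_pseudo_labels(probs: np.ndarray, max_entropy: float = 0.5):
    """probs: (n_unlabeled, n_classes) predicted class probabilities."""
    eps = 1e-12
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    keep = entropy <= max_entropy
    return np.flatnonzero(keep), probs[keep].argmax(axis=1)


if __name__ == "__main__":
    probs = np.array([[0.95, 0.03, 0.02],   # confident -> kept
                      [0.40, 0.35, 0.25]])  # uncertain -> dropped
    indices, labels = select_pseudo_labels(probs)
    print(indices, labels)  # [0] [0]
```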