LaSQuE: Improved Zero-Shot Classification from Explanations Through
Quantifier Modeling and Curriculum Learning
- URL: http://arxiv.org/abs/2212.09104v1
- Date: Sun, 18 Dec 2022 15:10:05 GMT
- Title: LaSQuE: Improved Zero-Shot Classification from Explanations Through
Quantifier Modeling and Curriculum Learning
- Authors: Sayan Ghosh, Rakesh R Menon, Shashank Srivastava
- Abstract summary: We present LaSQuE, a method that can learn zero-shot classifiers from language explanations by using three new strategies.
With these strategies, LaSQuE outperforms prior work, showing an absolute gain of up to 7% in generalizing to unseen real-world classification tasks.
- Score: 12.278877764015725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A hallmark of human intelligence is the ability to learn new concepts purely
from language. Several recent approaches have explored training machine
learning models via natural language supervision. However, these approaches
fall short in leveraging linguistic quantifiers (such as 'always' or 'rarely')
and mimicking humans in compositionally learning complex tasks. Here, we
present LaSQuE, a method that can learn zero-shot classifiers from language
explanations by using three new strategies - (1) modeling the semantics of
linguistic quantifiers in explanations (including exploiting ordinal strength
relationships, such as 'always' > 'likely'), (2) aggregating information from
multiple explanations using an attention-based mechanism, and (3) model
training via curriculum learning. With these strategies, LaSQuE outperforms
prior work, showing an absolute gain of up to 7% in generalizing to unseen
real-world classification tasks.
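As a rough illustration only (this is not the authors' released implementation, and all module and variable names below are hypothetical), the PyTorch sketch below shows one way strategies (1) and (2) from the abstract could be realized: quantifiers receive learnable probabilities whose parameterization enforces the ordinal strength relationship (e.g. 'always' > 'likely' > 'rarely'), and per-explanation evidence is aggregated with a learned attention over explanation embeddings.

```python
import torch
import torch.nn as nn

# Illustrative sketch of two ideas from the abstract (hypothetical names throughout):
# (1) learnable quantifier probabilities constrained to respect ordinal strength,
# (2) attention-based aggregation of per-explanation evidence into a class score.

QUANTIFIERS = ["always", "usually", "likely", "sometimes", "rarely", "never"]

class QuantifierSemantics(nn.Module):
    """Maps each quantifier to a probability in (0, 1).

    Ordinal structure is enforced by parameterizing the probabilities as a
    cumulative sum of non-negative gaps, so stronger quantifiers always
    receive larger values than weaker ones.
    """
    def __init__(self, n=len(QUANTIFIERS)):
        super().__init__()
        self.gaps = nn.Parameter(torch.zeros(n))  # unconstrained; softplus makes them >= 0

    def forward(self):
        gaps = torch.nn.functional.softplus(self.gaps)
        probs = torch.sigmoid(torch.cumsum(gaps, dim=0))  # monotonically increasing
        return probs.flip(0)  # index 0 = "always" (strongest quantifier)

class ExplanationAggregator(nn.Module):
    """Combines evidence from multiple explanations with learned attention."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, expl_embeddings, expl_evidence):
        # expl_embeddings: (num_explanations, dim) encoder outputs
        # expl_evidence:   (num_explanations,) per-explanation class evidence in [0, 1]
        attn = torch.softmax(self.scorer(expl_embeddings).squeeze(-1), dim=0)
        return (attn * expl_evidence).sum()  # aggregated class score

# Example usage with random inputs:
quant = QuantifierSemantics()
probs = quant()  # ordinal-consistent quantifier probabilities
agg = ExplanationAggregator(dim=16)
score = agg(torch.randn(3, 16), torch.tensor([0.9, 0.4, 0.7]))
```

Strategy (3), curriculum learning, would then amount to ordering training tasks or explanations from easier to harder during optimization; the concrete curriculum used by LaSQuE is described in the paper itself.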
Related papers
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - Less is More: A Closer Look at Semantic-based Few-Shot Learning [11.724194320966959]
Few-shot Learning aims to learn and distinguish new categories with a very limited number of available images.
We propose a simple but effective framework for few-shot learning tasks, specifically designed to exploit the textual information and language model.
Our experiments conducted across four widely used few-shot datasets demonstrate that our simple framework achieves impressive results.
arXiv Detail & Related papers (2024-01-10T08:56:02Z) - Continual Zero-Shot Learning through Semantically Guided Generative
Random Walks [56.65465792750822]
We address the challenge of continual zero-shot learning where unseen information is not provided during training, by leveraging generative modeling.
We propose our learning algorithm that employs a novel semantically guided Generative Random Walk (GRW) loss.
Our algorithm achieves state-of-the-art performance on AWA1, AWA2, CUB, and SUN datasets, surpassing existing CZSL methods by 3-7%.
arXiv Detail & Related papers (2023-08-23T18:10:12Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - A Simple Meta-learning Paradigm for Zero-shot Intent Classification with
Mixture Attention Mechanism [17.228616743739412]
We propose a simple yet effective meta-learning paradigm for zero-shot intent classification.
To learn better semantic representations for utterances, we introduce a new mixture attention mechanism.
To strengthen the transfer ability of the model from seen classes to unseen classes, we reformulate zero-shot intent classification with a meta-learning strategy.
arXiv Detail & Related papers (2022-06-05T13:37:51Z) - CoLLIE: Continual Learning of Language Grounding from Language-Image
Embeddings [2.8478710949588284]
CoLLIE is a model for continual learning of how language is grounded in vision.
It learns a transformation function that adjusts the language embeddings when needed to accommodate new language use.
We show that CoLLIE can efficiently learn and generalize from only a few examples.
arXiv Detail & Related papers (2021-11-15T18:54:58Z) - A Survey of Knowledge Enhanced Pre-trained Models [28.160826399552462]
We refer to pre-trained language models with knowledge injection as knowledge-enhanced pre-trained language models (KEPLMs).
These models demonstrate deeper understanding and logical reasoning, and introduce interpretability.
arXiv Detail & Related papers (2021-10-01T08:51:58Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Cross-lingual Spoken Language Understanding with Regularized
Representation Alignment [71.53159402053392]
We propose a regularization approach to align word-level and sentence-level representations across languages without any external resource.
Experiments on the cross-lingual spoken language understanding task show that our model outperforms current state-of-the-art methods in both few-shot and zero-shot scenarios.
arXiv Detail & Related papers (2020-09-30T08:56:53Z) - ALICE: Active Learning with Contrastive Natural Language Explanations [69.03658685761538]
We propose Active Learning with Contrastive Explanations (ALICE) to improve data efficiency in learning.
ALICE learns to first use active learning to select the most informative pairs of label classes to elicit contrastive natural language explanations.
It then extracts knowledge from these explanations and incorporates it into model learning.
arXiv Detail & Related papers (2020-09-22T01:02:07Z) - Systematic Generalization on gSCAN with Language Conditioned Embedding [19.39687991647301]
Systematic Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations.
We propose a novel method that learns objects' contextualized embeddings with dynamic message passing conditioned on the input natural language.
arXiv Detail & Related papers (2020-09-11T17:35:05Z)