Exploring Continual Learning of Compositional Generalization in NLI
- URL: http://arxiv.org/abs/2403.04400v2
- Date: Thu, 25 Jul 2024 19:32:16 GMT
- Title: Exploring Continual Learning of Compositional Generalization in NLI
- Authors: Xiyan Fu, Anette Frank
- Abstract summary: We introduce the Continual Compositional Generalization in Inference (C2Gen NLI) challenge.
A model continuously acquires knowledge of the constituent primitive inference tasks as a basis for compositional inferences.
Our analyses show that by learning subtasks continuously while observing their dependencies and increasing degrees of difficulty, continual learning can enhance compositional generalization ability.
- Score: 24.683598294766774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compositional Natural Language Inference has been explored to assess the true abilities of neural models to perform NLI. Yet, current evaluations assume models to have full access to all primitive inferences in advance, in contrast to humans, who continuously acquire inference knowledge. In this paper, we introduce the Continual Compositional Generalization in Inference (C2Gen NLI) challenge, where a model continuously acquires knowledge of the constituent primitive inference tasks as a basis for compositional inferences. We explore how continual learning affects compositional generalization in NLI by designing a continual learning setup for compositional NLI inference tasks. Our experiments demonstrate that models fail to compositionally generalize in a continual scenario. To address this problem, we first benchmark various continual learning algorithms and verify their efficacy. We then further analyze C2Gen, focusing on how to order primitives and compositional inference types, and examining correlations between subtasks. Our analyses show that by learning subtasks continuously while observing their dependencies and increasing degrees of difficulty, continual learning can enhance compositional generalization ability.
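To make the setup concrete, below is a minimal sketch of a curriculum-ordered continual learning loop over NLI subtasks, in the spirit of the C2Gen analysis. The subtask names, data loaders, model, and hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a continual learning loop over NLI subtasks,
# ordered so primitives precede compositions and difficulty increases.
# All task names, loaders, and the model are illustrative placeholders.
import torch
from torch import nn


def train_continually(model, subtask_loaders, epochs_per_task=1, lr=2e-5):
    """Train on subtasks sequentially, in the given curriculum order.

    subtask_loaders: list of (task_name, DataLoader) pairs, ordered so that
    primitive inference tasks come before the compositions built from them.
    """
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for task_name, loader in subtask_loaders:
        for _ in range(epochs_per_task):
            for premise_hypothesis, labels in loader:
                optimizer.zero_grad()
                logits = model(premise_hypothesis)  # 3-way NLI logits
                loss = criterion(logits, labels)
                loss.backward()
                optimizer.step()
        # After each subtask, compositional generalization would be measured
        # on held-out compositions of the primitives seen so far (not shown).
        print(f"finished subtask: {task_name}")


# Hypothetical curriculum: primitive inference tasks first, then the
# compositional inferences that depend on them.
# curriculum = [("veridical_primitive", loader_a),
#               ("lexical_primitive", loader_b),
#               ("composed_inference", loader_c)]
# train_continually(nli_model, curriculum)
```

The design choice mirrored here is the ordering constraint highlighted in the abstract: subtasks are visited in dependency order with increasing difficulty, which is what the analyses link to better compositional generalization.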
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- In-Context Learning Dynamics with Random Binary Sequences [16.645695664776433]
We propose a framework that enables us to analyze in-context learning dynamics.
Inspired by the cognitive science of human perception, we use random binary sequences as context.
In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages.
arXiv Detail & Related papers (2023-10-26T17:54:52Z)
- Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models [68.18370230899102]
We investigate how to elicit compositional generalization capabilities in large language models (LLMs).
We find that demonstrating both foundational skills and compositional examples grounded in these skills within the same prompt context is crucial.
We show that fine-tuning LLMs with SKiC-style data can elicit zero-shot weak-to-strong generalization.
arXiv Detail & Related papers (2023-08-01T05:54:12Z)
- Investigating Forgetting in Pre-Trained Representations Through Continual Learning [51.30807066570425]
We study the effect of representation forgetting on the generality of pre-trained language models.
We find that generality is degraded in various pre-trained LMs, and that syntactic and semantic knowledge is forgotten through continual learning.
arXiv Detail & Related papers (2023-05-10T08:27:59Z)
- Learning to Generalize Compositionally by Transferring Across Semantic Parsing Tasks [37.66114618645146]
We investigate learning representations that facilitate transfer learning from one compositional task to another.
We apply this method to semantic parsing, using three very different datasets.
Our method significantly improves compositional generalization over baselines on the test set of the target task.
arXiv Detail & Related papers (2021-11-09T09:10:21Z)
- Co$^2$L: Contrastive Continual Learning [69.46643497220586]
Recent breakthroughs in self-supervised learning show that such algorithms learn visual representations that can be transferred better to unseen tasks.
We propose a rehearsal-based continual learning algorithm that focuses on continually learning and maintaining transferable representations; a minimal rehearsal sketch follows this list.
arXiv Detail & Related papers (2021-06-28T06:14:38Z)
- Exploring Transitivity in Neural NLI Models through Veridicality [39.845425535943534]
We focus on the transitivity of inference relations, a fundamental property for systematically drawing inferences.
A model capturing transitivity can compose basic inference patterns and draw new inferences.
We find that current NLI models do not perform consistently well on transitivity inference tasks.
arXiv Detail & Related papers (2021-01-26T11:18:35Z)
- Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
- Visually Grounded Continual Learning of Compositional Phrases [45.60521849859337]
VisCOLL simulates the continual acquisition of compositional phrases from streaming visual scenes.
Models are trained on a paired image-caption stream with a shifting object distribution.
They are constantly evaluated by a visually-grounded masked language prediction task on held-out test sets.
arXiv Detail & Related papers (2020-05-02T10:45:30Z)
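Several of the continual learning algorithms touched on above, including rehearsal-based methods such as Co$^2$L, mitigate forgetting by replaying a small memory of past examples while training on the current subtask. The sketch below shows one common way to maintain such a memory (reservoir sampling); the class name, capacity, and mixing strategy are assumptions for illustration, not a specific paper's implementation.

```python
# Minimal sketch of an experience-replay memory for continual learning.
# Capacity and sampling policy are illustrative assumptions.
import random


class ReplayBuffer:
    """Reservoir-sampled memory of past (example, label) pairs."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.memory = []
        self.seen = 0

    def add(self, example, label):
        self.seen += 1
        if len(self.memory) < self.capacity:
            self.memory.append((example, label))
        else:
            # Reservoir sampling keeps each item seen so far
            # with equal probability.
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.memory[idx] = (example, label)

    def sample(self, k):
        return random.sample(self.memory, min(k, len(self.memory)))


# During training on the current subtask, each batch would be mixed with
# buffer.sample(k) examples from earlier subtasks before the update,
# which is what mitigates forgetting in rehearsal-based methods.
```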