Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings
- URL: http://arxiv.org/abs/2310.18777v1
- Date: Sat, 28 Oct 2023 18:30:30 GMT
- Title: Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings
- Authors: Yi Ren, Samuel Lavoie, Mikhail Galkin, Danica J. Sutherland, Aaron Courville
- Abstract summary: Compositional generalization is easy for humans but hard for deep neural networks.
We propose to improve this ability by using iterated learning on models with simplicial embeddings.
We show that this combination of changes improves compositional generalization over other approaches.
- Score: 19.667133565610087
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Compositional generalization, the ability of an agent to generalize to unseen
combinations of latent factors, is easy for humans but hard for deep neural
networks. A line of research in cognitive science has hypothesized a process,
"iterated learning," to help explain how human language developed this
ability; the theory rests on simultaneous pressures towards compressibility
(when an ignorant agent learns from an informed one) and expressivity (when it
uses the representation for downstream tasks). Inspired by this process, we
propose to improve the compositional generalization of deep networks by using
iterated learning on models with simplicial embeddings, which can approximately
discretize representations. This approach is further motivated by an analysis
of compositionality based on Kolmogorov complexity. We show that this
combination of changes improves compositional generalization over other
approaches, demonstrating these improvements both on vision tasks with
well-understood latent factors and on real molecular graph prediction tasks
where the latent structure is unknown.
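To make the two ingredients concrete, the sketch below shows (i) a simplicial embedding layer, which splits a feature vector into groups of logits and applies a temperature-scaled softmax to each group so that the representation becomes approximately discrete, and (ii) a schematic iterated-learning loop that alternates an imitation phase (a freshly initialized student regresses the previous agent's embeddings, the compressibility pressure) with an interaction phase (the student trains on the downstream task, the expressivity pressure). This is a minimal illustration in PyTorch under assumed shapes and hyperparameters, not the authors' implementation; `Agent`, `SimplicialEmbedding`, and `iterated_learning` are hypothetical names.

```python
# Minimal sketch (assumptions: layer layout, shapes, and hyperparameters are
# illustrative; this is not the authors' released implementation).
from itertools import cycle

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplicialEmbedding(nn.Module):
    """Split features into `n_groups` groups of `group_dim` logits and softmax
    each group, so every group lies on a (group_dim - 1)-simplex; a small
    temperature pushes each group toward one-hot, approximately discretizing z."""
    def __init__(self, in_dim, n_groups, group_dim, tau=1.0):
        super().__init__()
        self.proj = nn.Linear(in_dim, n_groups * group_dim)
        self.n_groups, self.group_dim, self.tau = n_groups, group_dim, tau

    def forward(self, x):
        logits = self.proj(x).view(x.shape[0], self.n_groups, self.group_dim)
        return F.softmax(logits / self.tau, dim=-1).flatten(1)

class Agent(nn.Module):
    """Backbone + simplicial bottleneck + task head (hypothetical layout)."""
    def __init__(self, in_dim=32, n_groups=8, group_dim=16, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.sem = SimplicialEmbedding(64, n_groups, group_dim, tau=0.5)
        self.head = nn.Linear(n_groups * group_dim, n_classes)

    def embed(self, x):
        return self.sem(self.backbone(x))

def iterated_learning(loader, n_generations=5, distill_steps=200, task_steps=200):
    """Each generation, a fresh student first imitates the teacher's embeddings
    (compressibility), then trains on the downstream task (expressivity)."""
    teacher = Agent()
    for _ in range(n_generations):
        student = Agent()  # "ignorant" agent: freshly initialized weights
        opt = torch.optim.Adam(student.parameters(), lr=1e-3)
        for step, (x, y) in zip(range(distill_steps + task_steps), cycle(loader)):
            if step < distill_steps:  # imitation: learn from the informed agent
                loss = F.mse_loss(student.embed(x), teacher.embed(x).detach())
            else:                     # interaction: use z for the downstream task
                loss = F.cross_entropy(student.head(student.embed(x)), y)
            opt.zero_grad(); loss.backward(); opt.step()
        teacher = student  # the student seeds the next generation
    return teacher
```

For example, `iterated_learning([(torch.randn(16, 32), torch.randint(0, 10, (16,)))])` runs the loop on a single dummy batch. The point of the bottleneck in this sketch is that the imitation phase transmits near-discrete codes, which a fresh learner can pick up more faithfully than raw continuous features.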
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Vector-based Representation is the Key: A Study on Disentanglement and Compositional Generalization [77.57425909520167]
We show that it is possible to achieve both good concept recognition and novel concept composition.
We propose a method that reformulates scalar-based disentanglement approaches as vector-based ones to increase both capabilities.
arXiv Detail & Related papers (2023-05-29T13:05:15Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name the Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization (a rough sketch of the similarity-distribution idea follows this list).
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
- Compositional Processing Emerges in Neural Networks Solving Math Problems [100.80518350845668]
Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations.
We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings should be composed.
Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.
arXiv Detail & Related papers (2021-05-19T07:24:42Z)
- An Ensemble with Shared Representations Based on Convolutional Networks for Continually Learning Facial Expressions [19.72032908764253]
Semi-supervised learning through ensemble predictions is an efficient strategy for leveraging the abundance of unlabelled facial expressions encountered during human-robot interactions.
Traditional ensemble-based systems are composed of several independent classifiers, leading to a high degree of redundancy.
We show that our approach is able to continually learn facial expressions through ensemble predictions using unlabelled samples from different data distributions.
arXiv Detail & Related papers (2021-03-05T20:40:52Z)
- Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
- Structural Inductive Biases in Emergent Communication [36.26083882473554]
We investigate the impact of representation learning in artificial agents by developing graph referential games.
We show that agents parametrized by graph neural networks develop a more compositional language compared to bag-of-words and sequence models.
arXiv Detail & Related papers (2020-02-04T14:59:08Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
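As a rough illustration of the "similarity-distribution scores" mentioned in the CoRelNet entry above (an assumption inferred from that entry's wording, not the paper's published architecture): pairwise similarities between object embeddings are normalized into distributions with a softmax, and only those relational scores, rather than the raw features, are passed to a decoder.

```python
# Rough sketch of a similarity-distribution module (assumption: inferred from
# the summary above, not CoRelNet's published implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimilarityDistribution(nn.Module):
    def __init__(self, n_objects, n_classes, hidden=64):
        super().__init__()
        self.decoder = nn.Sequential(
            nn.Linear(n_objects * n_objects, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, objs):
        # objs: (batch, n_objects, dim). Inner products give raw similarities;
        # a row-wise softmax turns each row into a distribution, discarding
        # absolute feature information and keeping only relational structure.
        sims = torch.einsum("bid,bjd->bij", objs, objs)
        return self.decoder(F.softmax(sims, dim=-1).flatten(1))
```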
This list is automatically generated from the titles and abstracts of the papers on this site.