Compositional Generalization from First Principles
- URL: http://arxiv.org/abs/2307.05596v1
- Date: Mon, 10 Jul 2023 19:30:32 GMT
- Title: Compositional Generalization from First Principles
- Authors: Thaddäus Wiedemer, Prasanna Mayilvahanan, Matthias Bethge, Wieland Brendel
- Abstract summary: We investigate compositionality as a property of the data-generating process rather than the data itself.
This reformulation enables us to derive mild conditions on only the support of the training distribution and the model architecture.
Our results set the stage for a principled theoretical study of compositional generalization.
- Score: 27.243195680442533
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Leveraging the compositional nature of our world to expedite learning and
facilitate generalization is a hallmark of human perception. In machine
learning, on the other hand, achieving compositional generalization has proven
to be an elusive goal, even for models with explicit compositional priors. To
get a better handle on compositional generalization, we here approach it from
the bottom up: Inspired by identifiable representation learning, we investigate
compositionality as a property of the data-generating process rather than the
data itself. This reformulation enables us to derive mild conditions on only
the support of the training distribution and the model architecture, which are
sufficient for compositional generalization. We further demonstrate how our
theoretical framework applies to real-world scenarios and validate our findings
empirically. Our results set the stage for a principled theoretical study of
compositional generalization.
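To make the setup concrete, here is a minimal numpy sketch (the rendering functions, the L-shaped support, and the per-slot regressors are illustrative assumptions, not the paper's construction): each latent slot is rendered by its own function, training covers every value of each slot but withholds one corner of the joint support, and a model whose structure mirrors the slot-wise generator still recovers the latents on the unseen combinations.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical compositional generating process: each latent slot is
    # rendered by its own invertible nonlinearity; renders are concatenated.
    def render(z):                            # z: (n, 2)
        return np.stack([np.sin(z[:, 0]), np.tanh(z[:, 1])], axis=1)

    # L-shaped training support: every value of each slot occurs, but the
    # combination (z1 > 0.5 AND z2 > 0.5) is never seen jointly.
    z = rng.uniform(0, 1, size=(5000, 2))
    z_train = z[(z[:, 0] < 0.5) | (z[:, 1] < 0.5)]
    x_train = render(z_train)

    # Test region: only the withheld slot combinations.
    z_test = rng.uniform(0.5, 1, size=(1000, 2))
    x_test = render(z_test)

    # Model mirroring the generator's structure: one regressor per slot,
    # each reading only its own coordinate of x (1-D polynomial inverse).
    fits = [np.polyfit(x_train[:, k], z_train[:, k], deg=5) for k in range(2)]
    z_hat = np.stack([np.polyval(fits[k], x_test[:, k]) for k in range(2)], axis=1)
    print("MSE on unseen slot combinations:", float(np.mean((z_hat - z_test) ** 2)))

The point of the toy: no training example combines large values of both slots, yet the slot-wise model generalizes to exactly those combinations because its architecture matches the data-generating process.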
Related papers
- Dynamics of Concept Learning and Compositional Generalization [23.43600409313907]
We introduce a structured identity mapping (SIM) task, where a model is trained to learn the identity mapping on a Gaussian mixture with structurally organized centroids.
We mathematically analyze the learning dynamics of neural networks trained on this SIM task and show that, despite its simplicity, SIM's learning dynamics capture and help explain key empirical observations.
Our theory also offers several new insights -- e.g., we find a novel mechanism for non-monotonic learning dynamics of test loss in early phases of training.
arXiv Detail & Related papers (2024-10-10T18:58:29Z)
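As a rough, runnable rendition of the SIM setup above (grid layout, noise scale, network width, and step size are assumptions, not the paper's values): data are Gaussian clusters whose centroids sit on a regular grid, the regression target is the input itself, and a small two-layer linear network is trained by full-batch gradient descent.

    import numpy as np

    rng = np.random.default_rng(0)

    # Gaussian mixture with structurally organized centroids: a 4x4 grid.
    grid = np.stack(np.meshgrid(np.arange(4.0), np.arange(4.0)), -1).reshape(-1, 2)
    x = grid[rng.integers(len(grid), size=2048)] + 0.1 * rng.standard_normal((2048, 2))
    y = x.copy()                              # identity mapping: reproduce the input

    # Two-layer linear network; gradients are scaled by a constant that is
    # absorbed into the step size.
    W1 = 0.01 * rng.standard_normal((2, 32))
    W2 = 0.01 * rng.standard_normal((32, 2))
    for step in range(2000):
        h = x @ W1
        err = h @ W2 - y
        gW2 = h.T @ err / len(x)
        gW1 = x.T @ (err @ W2.T) / len(x)
        W1 -= 0.05 * gW1
        W2 -= 0.05 * gW2
    print("final train loss:", float((err ** 2).mean()))

Tracking the loss over training steps on held-out clusters is where phenomena like the non-monotonic test-loss dynamics mentioned above would be observed.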
- When does compositional structure yield compositional generalization? A kernel theory [0.0]
We present a theory of compositional generalization in kernel models with fixed representations.
We identify novel failure modes in compositional generalization that arise from biases in the training data.
This work provides a theoretical perspective on how statistical structure in the training data can affect compositional generalization.
arXiv Detail & Related papers (2024-05-26T00:50:11Z)
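In the same spirit, a toy sketch of a linear/kernel model with a fixed compositional representation (the one-hot-per-factor feature map, factor counts, and the additive target are illustrative assumptions): a readout trained on all off-diagonal factor combinations predicts the held-out diagonal combinations exactly, because the target lies in the span of the additive features.

    import numpy as np

    rng = np.random.default_rng(0)
    n_shape = n_color = 5

    # Fixed compositional representation: concatenated one-hot codes per factor.
    def features(s, c):
        phi = np.zeros(n_shape + n_color)
        phi[s] = 1.0
        phi[n_shape + c] = 1.0
        return phi

    # Additive ground-truth target, so it is representable by the features.
    f_shape = rng.standard_normal(n_shape)
    f_color = rng.standard_normal(n_color)
    pairs = [(s, c) for s in range(n_shape) for c in range(n_color)]
    train = [(s, c) for s, c in pairs if s != c]     # hold out diagonal combos
    test = [(s, c) for s, c in pairs if s == c]

    X = np.stack([features(s, c) for s, c in train])
    y = np.array([f_shape[s] + f_color[c] for s, c in train])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    Xt = np.stack([features(s, c) for s, c in test])
    yt = np.array([f_shape[s] + f_color[c] for s, c in test])
    print("MSE on unseen combinations:", float(np.mean((Xt @ w - yt) ** 2)))

Biases in the training data of the kind the entry mentions could be injected here by reweighting or pruning the train list, which is where the failure modes would show up.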
- What makes Models Compositional? A Theoretical View: With Supplement [60.284698521569936]
We propose a general neuro-symbolic definition of compositional functions and their compositional complexity.
We show how various existing general and special purpose sequence processing models fit this definition and use it to analyze their compositional complexity.
arXiv Detail & Related papers (2024-05-02T20:10:27Z)
- Provable Compositional Generalization for Object-Centric Learning [57.42720932595342]
Learning representations that generalize to novel compositions of known concepts is crucial for bridging the gap between human and machine perception.
We show that autoencoders that satisfy structural assumptions on the decoder and enforce encoder-decoder consistency will learn object-centric representations that provably generalize compositionally.
arXiv Detail & Related papers (2023-10-09T01:18:07Z)
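A toy rendition of the two ingredients named above, an additive slot-wise decoder and encoder-decoder consistency (the block-structured renderers and the dimensions are illustrative, not the paper's assumptions; in training the consistency gap would be a loss term rather than exactly zero):

    import numpy as np

    rng = np.random.default_rng(0)
    K, d, p = 2, 3, 12                  # slots, slot dimension, 'pixel' dimension

    # Additive decoder: each slot is rendered by its own function into a
    # disjoint pixel block, and the renders are summed.
    renders = [lambda s, k=k: np.pad(s, (k * d, p - (k + 1) * d)) for k in range(K)]
    def decode(slots):
        return np.sum([renders[k](slots[k]) for k in range(K)], axis=0)

    # Encoder matched to the decoder; consistency demands encode(decode(z)) == z.
    def encode(x):
        return np.stack([x[k * d:(k + 1) * d] for k in range(K)])

    slots = rng.standard_normal((K, d))
    gap = float(((encode(decode(slots)) - slots) ** 2).mean())
    print("encoder-decoder consistency gap:", gap)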
- Compositional Generalization in Unsupervised Compositional Representation Learning: A Study on Disentanglement and Emergent Language [48.37815764394315]
We study three unsupervised representation learning algorithms on two datasets that allow directly testing compositional generalization.
We find that directly using the bottleneck representation with simple models and few labels may lead to worse generalization than using representations from layers immediately before or after the bottleneck.
Surprisingly, we find that increasing the pressure to produce a disentangled representation yields representations with worse generalization, while representations from emergent language (EL) models show strong compositional generalization.
arXiv Detail & Related papers (2022-10-02T10:35:53Z)
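The layer comparison above amounts to a simple probing protocol. A sketch with synthetic stand-in features (the per-layer noise levels are arbitrary; they exercise the protocol but do not reproduce the paper's finding): fit a linear readout on a few labeled examples per candidate layer and compare held-out R^2.

    import numpy as np

    rng = np.random.default_rng(0)
    factors = rng.standard_normal((500, 2))      # ground-truth latent factors

    def fake_layer(noise):                       # stand-in features for a layer
        W = rng.standard_normal((2, 16))
        return factors @ W + noise * rng.standard_normal((500, 16))

    layers = {"pre-bottleneck": fake_layer(0.1),
              "bottleneck": fake_layer(0.5),
              "post-bottleneck": fake_layer(0.1)}

    def probe_r2(feats, n_labeled=50):           # few-label linear probe
        idx = rng.permutation(len(feats))
        tr, te = idx[:n_labeled], idx[n_labeled:]
        W, *_ = np.linalg.lstsq(feats[tr], factors[tr], rcond=None)
        resid = factors[te] - feats[te] @ W
        return 1.0 - resid.var() / factors[te].var()

    for name, feats in layers.items():
        print(name, round(probe_r2(feats), 3))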
- Toward Compositional Generalization in Object-Oriented World Modeling [6.463111870767873]
We focus on the setting of reinforcement learning in object-oriented environments to study compositional generalization in world modeling.
We introduce a conceptual environment, Object Library, along with two instances, and deploy a principled pipeline to measure generalization ability.
Motivated by the formulation, we analyze several methods that exhibit either exact or no compositional generalization using our framework, and design a differentiable approach, the Homomorphic Object-oriented World Model (HOWM).
arXiv Detail & Related papers (2022-04-28T17:22:45Z)
- Disentangled Sequence to Sequence Learning for Compositional Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input.
Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z)
- Contrastive Syn-to-Real Generalization [125.54991489017854]
We make a key observation that the diversity of the learned feature embeddings plays an important role in the generalization performance.
We propose contrastive synthetic-to-real generalization (CSG), a novel framework that leverages the pre-trained ImageNet knowledge to prevent overfitting to the synthetic domain.
We demonstrate the effectiveness of CSG on various synthetic training tasks, exhibiting state-of-the-art performance on zero-shot domain generalization.
arXiv Detail & Related papers (2021-04-06T05:10:29Z)
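One common reading of "leveraging pre-trained ImageNet knowledge" above is a contrastive pull of the task model's features toward the frozen pretrained features of the same image. A generic InfoNCE sketch under that reading (batch size, feature dimension, and temperature are placeholders; CSG's exact layer selection and pooling are in the paper):

    import numpy as np

    rng = np.random.default_rng(0)

    def info_nce(student, teacher, tau=0.1):
        # Pull each student feature toward the frozen teacher feature of the
        # SAME image (diagonal) and away from the other images in the batch.
        s = student / np.linalg.norm(student, axis=1, keepdims=True)
        t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
        logits = s @ t.T / tau
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -float(np.mean(np.diag(logp)))

    student = rng.standard_normal((8, 128))   # trainable task-model features
    teacher = rng.standard_normal((8, 128))   # frozen ImageNet features
    print("contrastive loss:", info_nce(student, teacher))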
- Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)