Learning by Analogy: A Causal Framework for Composition Generalization
- URL: http://arxiv.org/abs/2512.10669v1
- Date: Thu, 11 Dec 2025 14:16:14 GMT
- Title: Learning by Analogy: A Causal Framework for Composition Generalization
- Authors: Lingjing Kong, Shaoan Xie, Yang Jiao, Yetian Chen, Yanhui Guo, Simone Shao, Yan Gao, Guangyi Chen, Kun Zhang
- Abstract summary: We show that compositional generalization requires decomposing high-level concepts into basic, low-level concepts. We formalize these intuitive processes using principles of causal modularity and minimal changes. We demonstrate that this approach enables compositional generalization supporting complex relations between composed concepts.
- Score: 31.406412141788635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositional generalization -- the ability to understand and generate novel combinations of learned concepts -- enables models to extend their capabilities beyond limited experiences. Despite its importance, the data structures and principles that enable this crucial capability remain poorly understood. We propose that compositional generalization fundamentally requires decomposing high-level concepts into basic, low-level concepts that can be recombined across similar contexts, similar to how humans draw analogies between concepts. For example, someone who has never seen a peacock eating rice can envision this scene by relating it to their previous observations of a chicken eating rice. In this work, we formalize these intuitive processes using principles of causal modularity and minimal changes. We introduce a hierarchical data-generating process that naturally encodes different levels of concepts and their interaction mechanisms. Theoretically, we demonstrate that this approach enables compositional generalization supporting complex relations between composed concepts, advancing beyond prior work that assumes simpler interactions like additive effects. Critically, we also prove that this latent hierarchical structure is provably recoverable (identifiable) from observable data like text-image pairs, a necessary step for learning such a generative process. To validate our theory, we apply insights from our theoretical framework and achieve significant improvements on benchmark datasets.
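The abstract's core idea -- high-level concepts decomposed into shared low-level factors, recombined via reusable interaction mechanisms -- can be illustrated with a minimal sketch. All names and structures below (the factor dictionaries, the `compose` function) are hypothetical illustrations of the principle, not the paper's actual model:

```python
# Hypothetical low-level factor banks: each high-level concept
# decomposes into reusable low-level factors (illustrative only).
SUBJECT_FACTORS = {
    "chicken": {"body": "bird", "size": "small"},
    "peacock": {"body": "bird", "size": "large"},
}
ACTION_FACTORS = {
    "eating_rice": {"motion": "pecking", "object": "rice"},
}

def compose(subject: str, action: str) -> dict:
    """Causal modularity: the action mechanism depends only on shared
    low-level factors (here 'body'), so it transfers across subjects."""
    s = SUBJECT_FACTORS[subject]
    a = ACTION_FACTORS[action]
    # Minimal change: swapping the subject alters only subject factors;
    # the interaction mechanism is reused unchanged.
    return {**s, **a, "scene": f"{subject} {a['motion']} {a['object']}"}

# Having observed (chicken, eating_rice), the unseen combination
# (peacock, eating_rice) is rendered by reusing the same mechanism.
seen = compose("chicken", "eating_rice")
novel = compose("peacock", "eating_rice")
print(novel["scene"])
```

Under this toy decomposition, the novel scene is generated without ever having observed the peacock/rice pairing, mirroring the analogy-drawing intuition in the abstract.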
Related papers
- Nonparametric Identification of Latent Concepts [17.996329262929113]
We argue that the cognitive mechanism of comparison, fundamental to human learning, is also vital for machines to recover true concepts underlying the data. Specifically, we aim to develop a theoretical framework for the identifiability of concepts with multiple classes of observations. We show that with sufficient diversity across classes, hidden concepts can be identified without assuming specific concept types.
arXiv Detail & Related papers (2025-09-30T18:13:53Z) - Swing-by Dynamics in Concept Learning and Compositional Generalization [41.84127616263314]
We introduce a structured identity mapping (SIM) task, where a model is trained to learn the identity mapping on a Gaussian mixture with structurally organized centroids. We mathematically analyze the learning dynamics of neural networks trained on this SIM task and show that, despite its simplicity, SIM's learning dynamics capture and help explain key empirical observations. Our theory also offers several new insights -- e.g., we find a novel mechanism for non-monotonic learning dynamics of test loss in early phases of training.
arXiv Detail & Related papers (2024-10-10T18:58:29Z) - Foundations and Frontiers of Graph Learning Theory [81.39078977407719]
Recent advancements in graph learning have revolutionized the way to understand and analyze data with complex structures.
Graph Neural Networks (GNNs), i.e. neural network architectures designed for learning graph representations, have become a popular paradigm.
This article provides a comprehensive summary of the theoretical foundations and breakthroughs concerning the approximation and learning behaviors intrinsic to prevalent graph learning models.
arXiv Detail & Related papers (2024-07-03T14:07:41Z) - Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds potential in building human-aligned and interpretable machine learning models. We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model. We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z) - When does compositional structure yield compositional generalization? A kernel theory [0.0]
We present a theory of compositional generalization in kernel models with fixed, compositionally structured representations. We identify novel failure modes in compositional generalization that arise from biases in the training data. This work examines how statistical structure in the training data can affect compositional generalization.
arXiv Detail & Related papers (2024-05-26T00:50:11Z) - Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings [19.667133565610087]
Compositional generalization is easy for humans but hard for deep neural networks.
We propose to improve this ability by using iterated learning on models with simplicial embeddings.
We show that this combination of changes improves compositional generalization over other approaches.
arXiv Detail & Related papers (2023-10-28T18:30:30Z) - Provable Compositional Generalization for Object-Centric Learning [55.658215686626484]
Learning representations that generalize to novel compositions of known concepts is crucial for bridging the gap between human and machine perception.
We show that autoencoders that satisfy structural assumptions on the decoder and enforce encoder-decoder consistency will learn object-centric representations that provably generalize compositionally.
arXiv Detail & Related papers (2023-10-09T01:18:07Z) - A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data.
The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept.
The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z) - Vector-based Representation is the Key: A Study on Disentanglement and Compositional Generalization [77.57425909520167]
We show that it is possible to achieve both good concept recognition and novel concept composition.
We propose a method to reform the scalar-based disentanglement works to be vector-based to increase both capabilities.
arXiv Detail & Related papers (2023-05-29T13:05:15Z) - A Probabilistic-Logic based Commonsense Representation Framework for Modelling Inferences with Multiple Antecedents and Varying Likelihoods [5.87677276882675]
Commonsense knowledge-graphs (CKGs) are important resources towards building machines that can 'reason' on text or environmental inputs and make inferences beyond perception.
In this work, we study how commonsense knowledge can be better represented by -- (i) utilizing a probabilistic logic representation scheme to model composite inferential knowledge and represent conceptual beliefs with varying likelihoods, and (ii) incorporating a hierarchical conceptual ontology to identify salient concept-relevant relations and organize beliefs at different conceptual levels.
arXiv Detail & Related papers (2022-11-30T08:44:30Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.