Continual Learning of Nonlinear Independent Representations
- URL: http://arxiv.org/abs/2408.05788v1
- Date: Sun, 11 Aug 2024 14:33:37 GMT
- Title: Continual Learning of Nonlinear Independent Representations
- Authors: Boyang Sun, Ignavier Ng, Guangyi Chen, Yifan Shen, Qirong Ho, Kun Zhang
- Abstract summary: We show that model identifiability progresses from a subspace level to a component-wise level as the number of distributions increases.
Our method achieves performance comparable to nonlinear ICA methods trained jointly on multiple offline distributions.
- Score: 17.65617189829692
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying the causal relations between variables of interest plays a pivotal role in representation learning, as it provides deep insights into the dataset. Identifiability, the central theme of this approach, normally hinges on leveraging data from multiple distributions (arising from interventions, distribution shifts, time series, etc.). Despite exciting developments in this field, a practical but often overlooked question remains: what if those distribution shifts happen sequentially? By contrast, natural intelligence can abstract and refine learned knowledge sequentially, a capacity known as lifelong learning. In this paper, focusing on the nonlinear independent component analysis (ICA) framework, we take a step toward enabling models to learn meaningful (identifiable) representations in a sequential manner, a setting we term continual causal representation learning. We theoretically demonstrate that model identifiability progresses from the subspace level to the component-wise level as the number of distributions increases. Empirically, we show that our method achieves performance comparable to nonlinear ICA methods trained jointly on multiple offline distributions and, surprisingly, that an incoming new distribution does not necessarily benefit the identification of all latent variables.
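To make the setup concrete, below is a minimal sketch (not the authors' implementation) of distribution-conditioned nonlinear ICA in the iVAE style, trained sequentially as new distributions arrive. All module names, architectures, and hyperparameters are illustrative assumptions, and the sketch deliberately omits the continual-learning machinery (e.g., replay or parameter regularization) that a real sequential method would need to avoid forgetting.

```python
# Minimal sketch: nonlinear ICA with per-distribution priors (iVAE-style),
# trained one distribution at a time. Illustrative only; not the paper's code.
import torch
import torch.nn as nn

class CondNonlinearICA(nn.Module):
    def __init__(self, x_dim, z_dim, n_dists, hidden=64):
        super().__init__()
        # Encoder maps observations to latent mean and log-variance.
        self.enc = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * z_dim))
        # Decoder plays the role of the (unknown) nonlinear mixing function.
        self.dec = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, x_dim))
        # Per-distribution prior parameters: variation of the prior across
        # distributions is what drives identifiability in this setting.
        self.prior_mu = nn.Parameter(torch.zeros(n_dists, z_dim))
        self.prior_logvar = nn.Parameter(torch.zeros(n_dists, z_dim))

    def forward(self, x, u):
        # u: long tensor of distribution indices, shape (batch,).
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        x_hat = self.dec(z)
        # KL divergence between the posterior and the u-th Gaussian prior.
        p_mu, p_lv = self.prior_mu[u], self.prior_logvar[u]
        kl = 0.5 * ((logvar - p_lv).exp()
                    + (mu - p_mu) ** 2 / p_lv.exp()
                    - 1 + p_lv - logvar).sum(-1)
        recon = ((x - x_hat) ** 2).sum(-1)
        return (recon + kl).mean()

# Sequential (continual) training: distributions arrive one at a time.
model = CondNonlinearICA(x_dim=8, z_dim=4, n_dists=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for u in range(3):                    # each u is a newly arriving distribution
    x = torch.randn(256, 8) + u      # placeholder data for the sketch
    for _ in range(100):
        opt.zero_grad()
        loss = model(x, torch.full((256,), u, dtype=torch.long))
        loss.backward()
        opt.step()
```

In this sketch, the per-distribution prior parameters stand in for the auxiliary variable whose change across distributions makes the latents recoverable; a continual variant would additionally constrain how the encoder drifts when moving from one distribution to the next.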
Related papers
- Self-supervised contrastive learning performs non-linear system identification [2.393499494583001]
We show that self-supervised learning can perform system identification in latent space.
We propose DynCL, a framework to uncover linear, switching linear and non-linear dynamics under a non-linear observation model.
arXiv Detail & Related papers (2024-10-18T17:59:25Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Multi-Domain Causal Representation Learning via Weak Distributional Invariances [27.72497122405241]
Causal representation learning has emerged as the center of action in causal machine learning research.
We show that autoencoders that incorporate such invariances can provably identify the stable set of latents from the rest across different settings.
arXiv Detail & Related papers (2023-10-04T14:41:41Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z)
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real world distribution shift benchmarks, and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- Temporally Disentangled Representation Learning [14.762231867144065]
It is unknown whether the underlying latent variables and their causal relations are identifiable if they have arbitrary, nonparametric causal influences in between.
We propose TDRL, a principled framework to recover time-delayed latent causal variables.
Our approach considerably outperforms existing baselines that do not correctly exploit this modular representation of changes.
arXiv Detail & Related papers (2022-10-24T23:02:49Z)
- Generalized Representations Learning for Time Series Classification [28.230863650758447]
We argue that temporal complexity is attributable to unknown latent distributions within time series classification.
We present experiments on gesture recognition, speech commands recognition, wearable stress and affect detection, and sensor-based human activity recognition.
arXiv Detail & Related papers (2022-09-15T03:36:31Z)
- From latent dynamics to meaningful representations [0.5728954513076778]
We propose a purely dynamics-constrained representation learning framework.
We show this is a more natural constraint for representation learning in dynamical systems.
We validate our framework for different systems including a real-world fluorescent DNA movie dataset.
arXiv Detail & Related papers (2022-09-02T09:27:37Z)
- Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the proposed strategy enables the agents to learn consistently under this highly heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)