Identifiable Causal Representation Learning: Unsupervised, Multi-View, and Multi-Environment
- URL: http://arxiv.org/abs/2406.13371v1
- Date: Wed, 19 Jun 2024 09:14:40 GMT
- Title: Identifiable Causal Representation Learning: Unsupervised, Multi-View, and Multi-Environment
- Authors: Julius von Kügelgen
- Abstract summary: Causal representation learning aims to combine the core strengths of machine learning and causality.
This thesis investigates what is possible for CRL without direct supervision, and thus contributes to its theoretical foundations.
- Score: 10.814585613336778
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Causal models provide rich descriptions of complex systems as sets of mechanisms by which each variable is influenced by its direct causes. They support reasoning about manipulating parts of the system and thus hold promise for addressing some of the open challenges of artificial intelligence (AI), such as planning, transferring knowledge in changing environments, or robustness to distribution shifts. However, a key obstacle to more widespread use of causal models in AI is the requirement that the relevant variables be specified a priori, which is typically not the case for the high-dimensional, unstructured data processed by modern AI systems. At the same time, machine learning (ML) has proven quite successful at automatically extracting useful and compact representations of such complex data. Causal representation learning (CRL) aims to combine the core strengths of ML and causality by learning representations in the form of latent variables endowed with causal model semantics. In this thesis, we study and present new results for different CRL settings. A central theme is the question of identifiability: Given infinite data, when are representations satisfying the same learning objective guaranteed to be equivalent? This is an important prerequisite for CRL, as it formally characterises if and when a learning task is, at least in principle, feasible. Since learning causal models, even without a representation learning component, is notoriously difficult, we require additional assumptions on the model class or rich data beyond the classical i.i.d. setting. By partially characterising identifiability for different settings, this thesis investigates what is possible for CRL without direct supervision, and thus contributes to its theoretical foundations. Ideally, the developed insights can help inform data collection practices or inspire the design of new practical estimation methods.
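To make the identifiability question concrete, here is a minimal numpy sketch (a hypothetical toy example, not taken from the thesis): with Gaussian latent sources, a linear mixing is observationally indistinguishable from any rotated version of itself, so the representation is not identifiable from the data distribution alone; non-Gaussian sources break this symmetry, which is the classical assumption behind linear ICA.
```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

s_gauss = rng.normal(size=(n, 2))     # Gaussian latent sources
s_laplace = rng.laplace(size=(n, 2))  # non-Gaussian latent sources

theta = np.pi / 4                     # an arbitrary rotation of the latent space
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Gaussian case: the rotated latents have the same distribution (same mean and
# covariance, and Gaussians are determined by these), so no learning objective
# on observations can tell the two representations apart.
print(np.allclose(np.cov(s_gauss.T), np.cov((s_gauss @ R.T).T), atol=1e-2))  # True

# Non-Gaussian case: rotation changes higher-order statistics (excess kurtosis
# moves toward the Gaussian value 0), so the mixing leaves a detectable trace.
kurt = lambda s: np.mean(s**4, axis=0) / np.mean(s**2, axis=0) ** 2 - 3
print(kurt(s_laplace), kurt(s_laplace @ R.T))  # ~[3, 3] vs. ~[1.5, 1.5]
```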
Related papers
- A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection [0.0]
We study the use of overly complex and opaque ML models, unaccounted-for data imbalances and correlated features, influential features that are inconsistent across different explanation methods, and the implausible utility of explanations.
Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees.
We find that feature-based model explanations are most often inconsistent across different settings.
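As a toy illustration of the interpretability argument (hypothetical flow features, assuming scikit-learn is available), a shallow decision tree exposes its entire decision logic as readable threshold rules, which an opaque deep network does not:
```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Hypothetical flow features: [duration, bytes_sent]; label 1 = "suspicious".
X = rng.uniform(0, 100, size=(500, 2))
y = ((X[:, 0] > 60) & (X[:, 1] > 50)).astype(int)

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# The entire fitted model is a handful of human-readable rules.
print(export_text(tree, feature_names=["duration", "bytes_sent"]))
```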
arXiv Detail & Related papers (2024-07-04T15:35:42Z) - A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
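For reference, the contrastive family mentioned above is commonly summarised by the InfoNCE objective; a minimal numpy sketch of that standard loss (an illustration, not the paper's generative model) follows:
```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE: row i of z1 is paired with row i of z2 (positive pair);
    all other rows in the batch act as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).mean()                     # cross-entropy, labels on diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(128, 64))
print(info_nce(z + 0.1 * rng.normal(size=z.shape), z))   # aligned views -> low loss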
arXiv Detail & Related papers (2024-02-02T13:31:17Z) - Targeted Reduction of Causal Models [55.11778726095353]
Causal Representation Learning offers a promising avenue to uncover interpretable causal patterns in simulations.
We introduce Targeted Causal Reduction (TCR), a method for condensing complex intervenable models into a concise set of causal factors.
Its ability to generate interpretable high-level explanations from complex models is demonstrated on toy and mechanical systems.
arXiv Detail & Related papers (2023-11-30T15:46:22Z) - Beyond Convergence: Identifiability of Machine Learning and Deep Learning Models [0.0]
We investigate the notion of model parameter identifiability through a case study focused on parameter estimation from motion sensor data.
We employ a deep neural network to estimate subject-wise parameters, including mass, stiffness, and equilibrium leg length.
The results show that while certain parameters can be identified from the observation data, others remain unidentifiable.
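A classic minimal example of why some parameters remain unidentifiable (a hypothetical toy system, not the paper's biomechanical model): for a frictionless spring-mass system m·x'' = −k·x observed only through its trajectory x(t), the data depend on k and m solely via the ratio k/m, so the two parameters are not separately identifiable.
```python
import numpy as np

def trajectory(k, m, x0=1.0, t=np.linspace(0, 10, 200)):
    # Solution of m*x'' = -k*x with x(0) = x0, x'(0) = 0.
    omega = np.sqrt(k / m)
    return x0 * np.cos(omega * t)

x_a = trajectory(k=4.0, m=1.0)  # k/m = 4
x_b = trajectory(k=8.0, m=2.0)  # different (k, m), same ratio
print(np.allclose(x_a, x_b))    # True: observationally indistinguishable
```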
arXiv Detail & Related papers (2023-07-21T03:40:53Z) - INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
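As background on the empowerment term (a hypothetical gridworld sketch, not the paper's variational estimator): for deterministic dynamics, k-step empowerment, the channel capacity max_p(a) I(A; S'), reduces to the log of the number of distinct states reachable in k steps.
```python
import itertools, math

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, action, size=5):
    # Deterministic gridworld transition, clamped to the board.
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), size - 1), min(max(y + dy, 0), size - 1))

def empowerment(state, k=2):
    # For deterministic dynamics, k-step empowerment = log(#reachable states).
    reachable = set()
    for seq in itertools.product(ACTIONS, repeat=k):
        s = state
        for a in seq:
            s = step(s, a)
        reachable.add(s)
    return math.log(len(reachable))

print(empowerment((2, 2)), empowerment((0, 0)))  # center > corner: more influence
```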
arXiv Detail & Related papers (2022-04-18T23:09:23Z) - Systematic Evaluation of Causal Discovery in Visual Model Based
Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure.
Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs.
In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z) - The Causal Neural Connection: Expressiveness, Learnability, and
Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
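To fix ideas, a minimal SCM sketch (a hypothetical two-variable example, not the paper's NCM): each variable is assigned by a mechanism of its parents plus exogenous noise, and a do()-intervention replaces a mechanism with a constant, changing the induced distribution.
```python
import numpy as np

rng = np.random.default_rng(0)

def sample_scm(n, do_x=None):
    # Exogenous noise variables.
    u_x, u_y = rng.normal(size=n), rng.normal(size=n)
    # Mechanisms: X := U_x (or do(X = x)), Y := 2X + U_y.
    x = u_x if do_x is None else np.full(n, do_x)
    y = 2.0 * x + u_y
    return x, y

x_obs, y_obs = sample_scm(100_000)
x_do, y_do = sample_scm(100_000, do_x=1.0)
print(y_obs.mean(), y_do.mean())  # E[Y] ~ 0 vs. E[Y | do(X=1)] ~ 2
```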
arXiv Detail & Related papers (2021-07-02T01:55:18Z) - High-dimensional separability for one- and few-shot learning [58.8599521537]
This work is driven by a practical question: the correction of Artificial Intelligence (AI) errors.
Special external devices, called correctors, are developed. They should provide a quick, non-iterative fix without modifying the legacy AI system.
New multi-correctors of AI systems are presented and illustrated with examples of predicting errors and learning new classes of objects by a deep convolutional neural network.
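A toy sketch of the separability idea behind such correctors (illustrative only, not the paper's construction): in high dimensions, a single erroneous sample can typically be separated from a large background set by a simple linear discriminant, so a non-iterative corrector can be wrapped around a frozen legacy model.
```python
import numpy as np

rng = np.random.default_rng(0)
d = 512
background = rng.normal(size=(10_000, d))  # features of correctly handled inputs
error_point = rng.normal(size=d)           # feature vector of one erroneous input

# Simplest linear (centroid) discriminant pointing at the error sample.
w = error_point - background.mean(axis=0)
threshold = 0.5 * (w @ error_point + (background @ w).max())
flags = (background @ w) > threshold
print(w @ error_point > threshold, flags.mean())  # True, 0.0: error isolated, no false alarms
```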
arXiv Detail & Related papers (2021-06-28T14:58:14Z) - From Undecidability of Non-Triviality and Finiteness to Undecidability of Learnability [0.0]
We show that there is no general-purpose procedure for rigorously evaluating whether newly proposed models indeed successfully learn from data.
For PAC binary classification, uniform and universal online learning, and exact learning through teacher-learner interactions, learnability is in general undecidable.
There is no one-size-fits-all algorithm for deciding whether a machine learning model can be successful.
arXiv Detail & Related papers (2021-06-02T18:00:04Z) - Nonlinear Invariant Risk Minimization: A Causal Approach [5.63479133344366]
We propose a learning paradigm that enables out-of-distribution generalization in the nonlinear setting.
We show identifiability of the data representation up to very simple transformations.
Extensive experiments on both synthetic and real-world datasets show that our approach significantly outperforms a variety of baseline methods.
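As background, the generic invariance penalty underlying Invariant Risk Minimization (IRMv1) can be sketched in a few lines of numpy (toy data and a closed-form gradient for squared loss; the paper's causal, nonlinear variant differs): it penalizes the squared gradient of each environment's risk with respect to a dummy classifier scale w at w = 1, so a predictor is preferred only if it is simultaneously optimal in every environment.
```python
import numpy as np

def irm_penalty(y_hat, y):
    # Gradient of the per-environment risk E[(w*y_hat - y)^2] with respect to
    # a dummy scale w, evaluated at w = 1 (closed form for squared loss).
    return np.mean(2.0 * (y_hat - y) * y_hat) ** 2

rng = np.random.default_rng(0)
envs = []
for scale in (0.5, 2.0):  # spurious feature whose relation to y varies by environment
    x = rng.normal(size=2000)               # invariant (causal) feature
    y = x + 0.1 * rng.normal(size=2000)
    spurious = scale * y + rng.normal(size=2000)
    envs.append((x, spurious, y))

invariant = sum(irm_penalty(x, y) for x, s, y in envs)   # predict from x
shortcut = sum(irm_penalty(s, y) for x, s, y in envs)    # predict from spurious s
print(invariant, shortcut)  # ~0 for the invariant predictor, large for the shortcut
```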
arXiv Detail & Related papers (2021-02-24T15:38:41Z)