Causal Abstraction Learning based on the Semantic Embedding Principle
- URL: http://arxiv.org/abs/2502.00407v1
- Date: Sat, 01 Feb 2025 11:54:44 GMT
- Title: Causal Abstraction Learning based on the Semantic Embedding Principle
- Authors: Gabriele D'Acunto, Fabio Massimo Zennaro, Yorgos Felekis, Paolo Di Lorenzo,
- Abstract summary: Structural causal models (SCM) allow us to investigate complex systems at multiple levels of resolution.
We present a category-theoretic approach to SCMs that enables the learning of a CA.
We demonstrate that the proposed methods succeed on both synthetic and real-world brain data with different degrees of prior information about the structure of CA.
- Score: 8.867171632530908
- License:
- Abstract: Structural causal models (SCMs) allow us to investigate complex systems at multiple levels of resolution. The causal abstraction (CA) framework formalizes the mapping between high- and low-level SCMs. We address CA learning in a challenging and realistic setting, where SCMs are inaccessible, interventional data is unavailable, and sample data is misaligned. A key principle of our framework is $\textit{semantic embedding}$, formalized as the high-level distribution lying on a subspace of the low-level one. This principle naturally links linear CA to the geometry of the $\textit{Stiefel manifold}$. We present a category-theoretic approach to SCMs that enables the learning of a CA by finding a morphism between the low- and high-level probability measures, adhering to the semantic embedding principle. Consequently, we formulate a general CA learning problem. As an application, we solve the latter problem for linear CA; considering Gaussian measures and the Kullback-Leibler divergence as an objective. Given the nonconvexity of the learning task, we develop three algorithms building upon existing paradigms for Riemannian optimization. We demonstrate that the proposed methods succeed on both synthetic and real-world brain data with different degrees of prior information about the structure of CA.
Related papers
- AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis [38.16978432272716]
The integration of vision-language models such as CLIP and Concept Bottleneck Models (CBMs) offers a promising approach to explaining deep neural network (DNN) decisions.
While CLIP provides both explainability and zero-shot classification capability, its pre-training on generic image and text data may limit its classification accuracy and applicability to medical image diagnostic tasks.
This paper takes an unconventional approach by re-examining the CBM framework through the lens of its geometrical representation as a simple linear classification system.
arXiv Detail & Related papers (2024-08-04T11:59:09Z) - Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS)
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z) - Causal Optimal Transport of Abstractions [8.642152250082368]
Causal abstraction (CA) theory establishes formal criteria for relating multiple structural causal models (SCMs) at different levels of granularity.
We propose COTA, the first method to learn abstraction maps from observational and interventional data without assuming complete knowledge of the underlying SCMs.
We extensively evaluate COTA on synthetic and real world problems, and showcase its advantages over non-causal, independent and aggregated COTA formulations.
arXiv Detail & Related papers (2023-12-13T12:54:34Z) - Exploiting Temporal Structures of Cyclostationary Signals for
Data-Driven Single-Channel Source Separation [98.95383921866096]
We study the problem of single-channel source separation (SCSS)
We focus on cyclostationary signals, which are particularly suitable in a variety of application domains.
We propose a deep learning approach using a U-Net architecture, which is competitive with the minimum MSE estimator.
arXiv Detail & Related papers (2022-08-22T14:04:56Z) - Switchable Representation Learning Framework with Self-compatibility [50.48336074436792]
We propose a Switchable representation learning Framework with Self-Compatibility (SFSC)
SFSC generates a series of compatible sub-models with different capacities through one training process.
SFSC achieves state-of-the-art performance on the evaluated datasets.
arXiv Detail & Related papers (2022-06-16T16:46:32Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Automated Imbalanced Classification via Layered Learning [0.734084539365505]
Applying resampling strategies to balance the class distribution of training instances is a common approach to tackle these problems.
Many state-of-the-art methods find instances of interest close to the decision boundary to drive the resampling process.
Over-sampling may increase the chance of overfitting by propagating the information contained in instances from the minority class.
arXiv Detail & Related papers (2022-05-05T10:32:24Z) - Class-Incremental Learning with Strong Pre-trained Models [97.84755144148535]
Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes)
We explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes.
Our proposed method is robust and generalizes to all analyzed CIL settings.
arXiv Detail & Related papers (2022-04-07T17:58:07Z) - Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a
Finite Horizon [3.867363075280544]
We explore reinforcement learning methods for finding the optimal policy in the linear quadratic regulator (LQR) problem.
We produce a global linear convergence guarantee for the setting of finite time horizon and state dynamics under weak assumptions.
We show results for the case where we assume a model for the underlying dynamics and where we apply the method to the data directly.
arXiv Detail & Related papers (2020-11-20T09:51:49Z) - Provable Hierarchical Imitation Learning via EM [2.864550757598007]
We consider learning an options-type hierarchical policy from expert demonstrations.
We characterize the EM approach proposed by Daniel et al.
We prove that the proposed algorithm converges with high probability to a norm ball around the true parameter.
arXiv Detail & Related papers (2020-10-07T03:21:57Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.