Related papers: Identifying Causal Direction via Variational Bayesian Compression

Identifying Causal Direction via Variational Bayesian Compression

URL: http://arxiv.org/abs/2505.07503v4
Date: Thu, 14 Aug 2025 01:13:43 GMT
Title: Identifying Causal Direction via Variational Bayesian Compression
Authors: Quang-Duy Tran, Bao Duong, Phuoc Nguyen, Thin Nguyen,
Abstract summary: A key principle is the algorithmic Markov condition, which postulates that the joint distribution, when factorized according to the causal direction, yields a more succinct codelength.<n>Previous approaches approximate these codelengths by relying on simple functions or Gaussian processes (GPs) with easily evaluable complexity.<n>We propose leveraging the variational Bayesian learning of neural networks as an interpretation of the codelengths.
Score: 6.928582707713723
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Telling apart the cause and effect between two random variables with purely observational data is a challenging problem that finds applications in various scientific disciplines. A key principle utilized in this task is the algorithmic Markov condition, which postulates that the joint distribution, when factorized according to the causal direction, yields a more succinct codelength compared to the anti-causal direction. Previous approaches approximate these codelengths by relying on simple functions or Gaussian processes (GPs) with easily evaluable complexity, compromising between model fitness and computational complexity. To address these limitations, we propose leveraging the variational Bayesian learning of neural networks as an interpretation of the codelengths. This allows the improvement of model fitness, while maintaining the succinctness of the codelengths, and the avoidance of the significant computational complexity of the GP-based approaches. Extensive experiments on both synthetic and real-world benchmarks in cause-effect identification demonstrate the effectiveness of our proposed method, showing promising performance enhancements on several datasets in comparison to most related methods.

Related papers

Enhanced Digitized Adiabatic Quantum Factorization Algorithm Using Null-Space Encoding [0.0]
We propose a modified QAOA-based factorization protocol that simplifies the interacting Hamiltonian to include only two-body terms.<n>We show that this method achieves comparable or higher fidelities than the standard protocol.<n>We also report on simulations with alternative cost-function definitions that frequently yielded improved performance.
arXiv Detail & Related papers (2025-11-13T16:24:07Z)
Quantum circuit evolutionary framework applied on set partitioning problem [0.0]
We present a framework based on circuits with variable topology with two approaches.<n>The efficiency of the proposed framework was tested on several instances of the set partitioning problem.<n> Notably, the strategy employing a pseudo-counterdiabatic evolutionary term exhibited remarkable performance.
arXiv Detail & Related papers (2025-07-28T12:42:13Z)
Binarized Neural Networks Converge Toward Algorithmic Simplicity: Empirical Support for the Learning-as-Compression Hypothesis [36.24954635616374]
We propose a shift toward algorithmic information theory, using Binarized Neural Networks (BNNs) as a first proxy.<n>We apply the Block Decomposition Method (BDM) and demonstrate it more closely tracks structural changes during training than entropy.<n>These results support the view of training as a process of algorithmic compression, where learning corresponds to the progressive internalization of structured regularities.
arXiv Detail & Related papers (2025-05-27T02:51:36Z)
PHEATPRUNER: Interpretable Data-centric Feature Selection for Multivariate Time Series Classification through Persistent Homology [0.0937465283958018]
PHeatPruner is a method to balance performance and interpretability in time series classification.<n> Persistent homology facilitates the pruning of up to 45% of the applied variables.<n>Sheaf theory contributes explanatory vectors that provide deeper insights into the data's structural nuances.
arXiv Detail & Related papers (2025-04-25T13:14:11Z)
Quantum-enhanced causal discovery for a small number of samples [0.6665442733069773]
This study proposes a novel quantum Peter-Clark (qPC) algorithm for causal discovery that does not assume any underlying model structures.<n>We conducted systematic experiments on fundamental graph parts of causal structures, demonstrating that the qPC algorithm exhibits a significantly better performance.<n>Our theoretical and experimental results demonstrate that the proposed quantum algorithm can empower classical algorithms for robust and accurate inference.
arXiv Detail & Related papers (2025-01-09T07:05:22Z)
Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization [15.898378661128334]
Reinforcement Learning (RL) algorithms are known to suffer from the curse of dimensionality. We propose overcoming the curse of dimensionality by approximately factorizing the original Markov decision processes (MDPs) into smaller, independently evolving MDPs. We provide improved sample complexity guarantees for both proposed algorithms.
arXiv Detail & Related papers (2024-11-12T07:08:00Z)
A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is emphSpeculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate. This paper tackles this gap by conceptualizing the decoding problem via markov chain abstraction and studying the key properties, emphoutput quality and inference acceleration, from a theoretical perspective.
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
Large-Scale Targeted Cause Discovery via Learning from Simulated Data [66.51307552703685]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.<n>We train a neural network using supervised learning on simulated data to infer causality.<n> Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning [50.84938730450622]
We propose a trajectory-based method TV score, which uses trajectory volatility for OOD detection in mathematical reasoning. Our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios. Our method can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
arXiv Detail & Related papers (2024-05-22T22:22:25Z)
From Causal Pairs to Causal Graphs [1.5469452301122175]
Causal structure learning from observational data remains a non-trivial task. Motivated by the Cause-Effect Pair' NIPS 2013 Workshop on Causality Challenge, we take a different approach and generate a probability distribution over all possible graphs. The goal of the paper is to propose new methods based on this probabilistic information and compare their performance with traditional and state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-08T15:28:55Z)
Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features. We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features. Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound. Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
Harnessing Heterogeneity: Learning from Decomposed Feedback in Bayesian Modeling [68.69431580852535]
We introduce a novel GP regression to incorporate the subgroup feedback. Our modified regression has provably lower variance -- and thus a more accurate posterior -- compared to previous approaches. We execute our algorithm on two disparate social problems.
arXiv Detail & Related papers (2021-07-07T03:57:22Z)
Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded on the complexity' of the fractal structure that underlies its generalization measure. We further specialize our results to specific problems (e.g., linear/logistic regression, one hidden/layered neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals. We propose a general graph estimator based on a novel structured fusion regularization. We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
Parsimonious Inference [0.0]
Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures. Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation.
arXiv Detail & Related papers (2021-03-03T04:13:14Z)
Efficient Consensus Model based on Proximal Gradient Method applied to Convolutional Sparse Problems [2.335152769484957]
We derive and detail a theoretical analysis of an efficient consensus algorithm based on gradient proximal (PG) approach. The proposed algorithm is also applied to another particular convolutional problem for the anomaly detection task.
arXiv Detail & Related papers (2020-11-19T20:52:48Z)
Scalable Gaussian Process Variational Autoencoders [17.345687261000045]
We propose a new scalable GP-VAE model that outperforms existing approaches in terms of runtime and memory footprint, is easy to implement, and allows for joint end-to-end optimization of all components.
arXiv Detail & Related papers (2020-10-26T10:26:02Z)
Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data. We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)
Total Deep Variation: A Stable Regularizer for Inverse Problems [71.90933869570914]
We introduce the data-driven general-purpose total deep variation regularizer. In its core, a convolutional neural network extracts local features on multiple scales and in successive blocks. We achieve state-of-the-art results for numerous imaging tasks.
arXiv Detail & Related papers (2020-06-15T21:54:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.