Related papers: Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery

Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery

URL: http://arxiv.org/abs/2508.12650v2
Date: Mon, 27 Oct 2025 05:20:40 GMT
Title: Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery
Authors: Jiyeon Kang, Songseong Kim, Chanhui Lee, Doyeong Hwang, Joanie Hayoun Chung, Yunkyung Ko, Sumin Lee, Sungwoong Kim, Sungbin Lim,
Abstract summary: We propose Score-informed Neural Operator (SciNO) to approximate the Hessian diagonal of the log-densities.<n>SciNO reduces order divergence by 42.7% on synthetic graphs and by 31.5% on real-world datasets.<n>We also propose a probabilistic control algorithm for causal reasoning with autoregressive models.
Score: 12.33811209316863
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Ordering-based approaches to causal discovery identify topological orders of causal graphs, providing scalable alternatives to combinatorial search methods. Under the Additive Noise Model (ANM) assumption, recent causal ordering methods based on score matching require an accurate estimation of the Hessian diagonal of the log-densities. In this paper, we aim to improve the approximation of the Hessian diagonal of the log-densities, thereby enhancing the performance of ordering-based causal discovery algorithms. Existing approaches that rely on Stein gradient estimators are computationally expensive and memory-intensive, while diffusion-model-based methods remain unstable due to the second-order derivatives of score models. To alleviate these problems, we propose Score-informed Neural Operator (SciNO), a probabilistic generative model in smooth function spaces designed to stably approximate the Hessian diagonal and to preserve structural information during the score modeling. Empirical results show that SciNO reduces order divergence by 42.7% on synthetic graphs and by 31.5% on real-world datasets on average compared to DiffAN, while maintaining memory efficiency and scalability. Furthermore, we propose a probabilistic control algorithm for causal reasoning with autoregressive models that integrates SciNO's probability estimates with autoregressive model priors, enabling reliable data-driven causal ordering informed by semantic information. Consequently, the proposed method enhances causal reasoning abilities of LLMs without additional fine-tuning or prompt engineering.

Related papers

Sparse Additive Model Pruning for Order-Based Causal Structure Learning [4.55744151522751]
Causal structure learning aims to estimate causal relationships between variables as a form of a causal directed acyclic graph (DAG) from observational data.<n>One of the major frameworks is the order-based approach that first estimates a topological order of the underlying DAG and then prunes spurious edges from the fully-connected DAG induced by the estimated topological order.<n>We introduce a new pruning method based on sparse additive models, which enables direct pruning of redundant edges without relying on hypothesis testing.
arXiv Detail & Related papers (2026-02-17T02:06:42Z)
Adaptive Bayesian Optimization for Robust Identification of Stochastic Dynamical Systems [4.0148499400442095]
This paper deals with the identification of linear derivation systems, where the unknowns include system coefficients and noise variances.<n>A sample-efficient global optimization method based on Bayesian optimization is proposed.<n>Experiments show that the EGP-based BO consistently outperforms MLE via steady-state filtering and expectation-maximization.
arXiv Detail & Related papers (2025-03-09T01:38:21Z)
Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables. We learn the causal Bayesian network and its confounding latent variables directly from the observational data. We show that this emphmodel completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z)
Operator-Informed Score Matching for Markov Diffusion Models [9.680266522150495]
Diffusion models are typically trained using score matching, a learning objective to the underlying noising process that guides the model.<n>This paper argues that Markov noising processes enjoy an advantage over alternatives, as the Markov operators that govern the noising process are well-understood.
arXiv Detail & Related papers (2024-06-13T13:07:52Z)
Efficient Prior Calibration From Indirect Data [5.588334720483076]
This paper is concerned with learning the prior model from data, in particular, learning the prior from multiple realizations of indirect data obtained through the noisy observation process.<n>An efficient residual-based neural operator approximation of the forward model is proposed and it is shown that this may be learned concurrently with the pushforward map.
arXiv Detail & Related papers (2024-05-28T08:34:41Z)
Noise in the reverse process improves the approximation capabilities of diffusion models [27.65800389807353]
In Score based Generative Modeling (SGMs), the state-of-the-art in generative modeling, reverse processes are known to perform better than their deterministic counterparts. This paper delves into the heart of this phenomenon, comparing neural ordinary differential equations (ODEs) and neural dimension equations (SDEs) as reverse processes. We analyze the ability of neural SDEs to approximate trajectories of the Fokker-Planck equation, revealing the advantages of neurality.
arXiv Detail & Related papers (2023-12-13T02:39:10Z)
Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation(NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise. In this paper, we study it a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z)
Score-based Diffusion Models in Function Space [137.70916238028306]
Diffusion models have recently emerged as a powerful framework for generative modeling.<n>This work introduces a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.<n>We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
Score matching enables causal discovery of nonlinear additive noise models [63.93669924730725]
We show how to design a new generation of scalable causal discovery methods. We propose a new efficient method for approximating the score's Jacobian, enabling to recover the causal graph.
arXiv Detail & Related papers (2022-03-08T21:34:46Z)
Diffusion Causal Models for Counterfactual Estimation [18.438307666925425]
We consider the task of counterfactual estimation from observational imaging data given a known causal structure. We propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models. We find that Diff-SCM produces more realistic and minimal counterfactuals than baselines on MNIST data and can also be applied to ImageNet data.
arXiv Detail & Related papers (2022-02-21T12:23:01Z)
Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible. The implementation employs a single neural network driven by an orthonormal input to a single white noise source adapted to learn an infinite Gaussian mixture model. Preliminary results show that SGM significantly improves MINE estimation in terms of data efficiency and variance, conventional and variational Gaussian mixture models, as well as for training adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z)
Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models. One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores) For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training. We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms. We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
arXiv Detail & Related papers (2020-08-19T19:11:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.