Learning Causal States Under Partial Observability and Perturbation
- URL: http://arxiv.org/abs/2512.00357v1
- Date: Sat, 29 Nov 2025 06:56:03 GMT
- Title: Learning Causal States Under Partial Observability and Perturbation
- Authors: Na Li, Hangguan Shan, Wei Ni, Wenjie Zhang, Xinyu Li, Yamin Wang,
- Abstract summary: Existing methods fail to mitigate perturbations while addressing partial observability. We propose Causal State Representation under Asynchronous Diffusion Model (CaDiff). CaDiff is the first framework that approximates causal states using diffusion models with both theoretical rigor and practicality.
- Score: 29.533770208192845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A critical challenge for reinforcement learning (RL) is making decisions based on incomplete and noisy observations, especially in perturbed and partially observable Markov decision processes (P$^2$OMDPs). Existing methods fail to mitigate perturbations while addressing partial observability. We propose \textit{Causal State Representation under Asynchronous Diffusion Model (CaDiff)}, a framework that enhances any RL algorithm by uncovering the underlying causal structure of P$^2$OMDPs. This is achieved by incorporating a novel asynchronous diffusion model (ADM) and a new bisimulation metric. ADM enables forward and reverse processes with different numbers of steps, thus interpreting the perturbation of P$^2$OMDP as part of the noise suppressed through diffusion. The bisimulation metric quantifies the similarity between partially observable environments and their causal counterparts. Moreover, we establish the theoretical guarantee of CaDiff by deriving an upper bound for the value function approximation errors between perturbed observations and denoised causal states, reflecting a principled trade-off between approximation errors of reward and transition-model. Experiments on Roboschool tasks show that CaDiff enhances returns by at least 14.18\% compared to baselines. CaDiff is the first framework that approximates causal states using diffusion models with both theoretical rigor and practicality.
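The abstract's idea of running forward and reverse diffusion for different numbers of steps, so that the perturbation counts as noise already injected, can be illustrated with a toy example. This is a minimal sketch under strong assumptions and is not the paper's ADM: the causal state is one-dimensional with a known Gaussian prior, so the denoiser is closed-form rather than learned, the perturbation is additive Gaussian with known standard deviation, and all function names (`make_alpha_bar`, `equivalent_step`, `forward_noise`, `reverse_denoise`, `denoise_observation`) are hypothetical.

```python
import numpy as np

def make_alpha_bar(num_steps=500, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal level: index 0 is the clean state, index t is after t noising steps."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])

def equivalent_step(alpha_bar, perturb_std):
    """Step k whose noise level best matches the perturbation (an assumption of this sketch).

    obs = state + perturb_std * n matches x_k / sqrt(alpha_bar[k])
    when alpha_bar[k] = 1 / (1 + perturb_std**2).
    """
    target = 1.0 / (1.0 + perturb_std ** 2)
    return int(np.argmin(np.abs(alpha_bar - target)))

def forward_noise(x, alpha_bar, start, stop, rng):
    """Variance-preserving forward diffusion from step `start` to step `stop` (start <= stop)."""
    ratio = alpha_bar[stop] / alpha_bar[start]
    return np.sqrt(ratio) * x + np.sqrt(1.0 - ratio) * rng.standard_normal(x.shape)

def reverse_denoise(x, alpha_bar, start, prior_mean, prior_var):
    """Deterministic DDIM-style reverse pass from step `start` down to 0.

    The posterior mean E[x_0 | x_t] is analytic for a Gaussian prior; a learned
    network would take its place in a realistic setting.
    """
    for t in range(start, 0, -1):
        a_t, a_prev = alpha_bar[t], alpha_bar[t - 1]
        gain = np.sqrt(a_t) * prior_var / (a_t * prior_var + 1.0 - a_t)
        x0_hat = prior_mean + gain * (x - np.sqrt(a_t) * prior_mean)
        eps_hat = (x - np.sqrt(a_t) * x0_hat) / np.sqrt(1.0 - a_t)
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
    return x

def denoise_observation(obs, perturb_std, forward_steps, prior_mean, prior_var, rng):
    """Forward runs `forward_steps` steps; reverse runs k + forward_steps steps."""
    alpha_bar = make_alpha_bar()
    k = equivalent_step(alpha_bar, perturb_std)
    top = min(k + forward_steps, len(alpha_bar) - 1)
    x = np.sqrt(alpha_bar[k]) * obs               # treat the observation as a sample at step k
    x = forward_noise(x, alpha_bar, k, top, rng)  # short forward process
    return reverse_denoise(x, alpha_bar, top, prior_mean, prior_var)  # longer reverse process

rng = np.random.default_rng(0)
prior_mean, prior_var, perturb_std = 0.0, 0.25, 1.0
state = prior_mean + np.sqrt(prior_var) * rng.standard_normal(20000)
obs = state + perturb_std * rng.standard_normal(state.shape)
est = denoise_observation(obs, perturb_std, 10, prior_mean, prior_var, rng)
mse_raw = float(np.mean((obs - state) ** 2))
mse_est = float(np.mean((est - state) ** 2))
print(f"MSE raw: {mse_raw:.3f}  MSE denoised: {mse_est:.3f}")
```

Because the reverse pass starts from step `k + forward_steps` while the forward pass only adds `forward_steps` steps, the extra `k` reverse steps are what remove the perturbation in this toy setting.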
Related papers
- Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage [65.51149575007149]
We present Fun-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for both forward and inverse modeling.
Fun-DDPS produces physically consistent realizations free from the high-frequency artifacts observed in joint-state baselines.
arXiv Detail & Related papers (2026-02-12T18:58:12Z)
- Geometry of Uncertainty: Learning Metric Spaces for Multimodal State Estimation in RL [13.394337316697241]
Estimating the state of an environment from high-dimensional, multimodal, and noisy observations is a fundamental challenge in reinforcement learning (RL).
Traditional approaches rely on probabilistic models to account for the uncertainty, but often require explicit noise assumptions.
We present a novel method to learn a structured latent representation, in which distances between states correlate with the minimum number of actions required to transition between them.
arXiv Detail & Related papers (2026-02-12T15:41:20Z)
- FlexCausal: Flexible Causal Disentanglement via Structural Flow Priors and Manifold-Aware Interventions [1.7114074082429929]
Causal Disentangled Representation Learning (CDRL) aims to learn and disentangle low-dimensional representations from observations.
We propose FlexCausal, a novel CDRL framework based on a block-diagonal covariance VAE.
Our framework ensures a precise structural correspondence between the learned latent subspaces and the ground-truth causal relations.
arXiv Detail & Related papers (2026-01-29T11:30:53Z)
- Score-based Membership Inference on Diffusion Models [3.742113529511043]
Membership inference attacks (MIAs) against diffusion models have emerged as a pressing privacy concern.
We present a theoretical and empirical study of score-based MIAs, focusing on the predicted noise vectors that diffusion models learn to approximate.
We show that the expected denoiser output points toward a kernel-weighted local mean of nearby training samples, such that its norm encodes proximity to the training set and thereby reveals membership.
arXiv Detail & Related papers (2025-09-29T16:28:55Z)
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning [53.25336975467293]
We present the first theoretical error decomposition analysis of methods such as perplexity and self-consistency.
Our analysis reveals a fundamental trade-off: perplexity methods suffer from substantial model error due to the absence of a proper consistency function.
We propose Reasoning-Pruning Perplexity Consistency (RPC), which integrates perplexity with self-consistency, and Reasoning Pruning, which eliminates low-probability reasoning paths.
arXiv Detail & Related papers (2025-02-01T18:09:49Z)
- Rectified Diffusion Guidance for Conditional Generation [94.83538269086613]
We revisit the theory behind CFG and rigorously confirm that improper combination coefficients bring about an expectation shift of the generative distribution.
We show that our approach enjoys a closed-form solution given the strength.
Empirical evidence on real-world data demonstrates the compatibility of our design with existing state-of-the-art diffusion models.
arXiv Detail & Related papers (2024-10-24T13:41:32Z)
- On Diffusion Models for Multi-Agent Partial Observability: Shared Attractors, Error Bounds, and Composite Flow [37.433470342139685]
We investigate reconstructing global states from local action-observation histories in Dec-POMDPs using diffusion models.
We find that, with deep learning approximation errors, fixed points can deviate from true states, and the deviation is negatively correlated with the Jacobian rank.
arXiv Detail & Related papers (2024-10-17T18:23:33Z)
- Generative Fractional Diffusion Models [53.36835573822926]
We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics.
Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID.
arXiv Detail & Related papers (2023-10-26T17:53:24Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- Diffusion Causal Models for Counterfactual Estimation [18.438307666925425]
We consider the task of counterfactual estimation from observational imaging data given a known causal structure.
We propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models.
We find that Diff-SCM produces more realistic and minimal counterfactuals than baselines on MNIST data and can also be applied to ImageNet data.
arXiv Detail & Related papers (2022-02-21T12:23:01Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)