Solving Bayesian inverse problems with diffusion priors and off-policy RL
- URL: http://arxiv.org/abs/2503.09746v1
- Date: Wed, 12 Mar 2025 18:45:22 GMT
- Title: Solving Bayesian inverse problems with diffusion priors and off-policy RL
- Authors: Luca Scimeca, Siddarth Venkatraman, Moksh Jain, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yashar Hezaveh, Laurence Perreault-Levasseur, Yoshua Bengio, Glen Berseth, Nikolay Malkin
- Abstract summary: Relative Trajectory Balance (RTB) is an off-policy reinforcement learning objective that can asymptotically solve Bayesian inverse problems optimally. We extend the original work by using RTB to train conditional diffusion model posteriors from pretrained unconditional priors for challenging linear and non-linear inverse problems in vision and science.
- Score: 86.65351676007721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a practical application of Relative Trajectory Balance (RTB), a recently introduced off-policy reinforcement learning (RL) objective that can asymptotically solve Bayesian inverse problems optimally. We extend the original work by using RTB to train conditional diffusion model posteriors from pretrained unconditional priors for challenging linear and non-linear inverse problems in vision and science. We use the objective alongside techniques such as off-policy backtracking exploration to improve training. Importantly, our results show that existing training-free diffusion posterior methods struggle to perform effective posterior inference in latent space due to inherent biases.
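For intuition, RTB fine-tunes a conditional posterior diffusion model $p^{\rm post}_\phi$ against a frozen unconditional prior $p^{\rm prior}$ so that, along a denoising trajectory $\tau = (x_T, \dots, x_0)$, a learned constant $Z_\phi$ balances the two models' trajectory probabilities against the measurement likelihood: the squared log-ratio $\big(\log Z_\phi + \log p^{\rm post}_\phi(\tau \mid y) - \log p(y \mid x_0) - \log p^{\rm prior}(\tau)\big)^2$ is driven to zero, and the trajectories may come from an off-policy sampler (e.g., with backtracking exploration). Below is a minimal PyTorch-style sketch of such a loss; the interfaces (`posterior.log_prob`, `prior.log_prob`, `log_likelihood`, the learned scalar `log_Z`) are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of a Relative Trajectory Balance (RTB) loss for training a
# conditional diffusion posterior against a frozen unconditional prior.
# `posterior`, `prior`, `log_likelihood`, and the learned scalar `log_Z` are
# hypothetical placeholder objects returning tensors, not the authors' API.
def rtb_loss(traj, y, posterior, prior, log_Z, log_likelihood):
    """traj: latents [x_T, ..., x_0] from any (possibly off-policy) sampler."""
    x0 = traj[-1]
    # log p_post(tau | y): sum of conditional transition log-probs.
    log_p_post = sum(posterior.log_prob(x_prev, x_curr, y)
                     for x_curr, x_prev in zip(traj[:-1], traj[1:]))
    # log p_prior(tau): same sum under the frozen unconditional prior.
    log_p_prior = sum(prior.log_prob(x_prev, x_curr)
                      for x_curr, x_prev in zip(traj[:-1], traj[1:]))
    # log p(y | x_0): measurement log-likelihood ("reward") of the final sample.
    log_r = log_likelihood(y, x0)
    # RTB drives this squared log-ratio to zero for every trajectory.
    delta = log_Z + log_p_post - log_r - log_p_prior
    return delta.pow(2).mean()
```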
Related papers
- Learn Hard Problems During RL with Reference Guided Fine-tuning [56.56461712665904]
Reinforcement learning (RL) for mathematical reasoning can suffer from reward sparsity. We introduce Reference-Guided Fine-Tuning (ReGFT) to synthesize positive trajectories on hard problems and train on them before RL. Our results show that ReGFT effectively overcomes reward sparsity and unlocks stronger RL-based mathematical reasoning.
arXiv Detail & Related papers (2026-03-01T18:41:28Z) - Reuse your FLOPs: Scaling RL on Hard Problems by Conditioning on Very Off-Policy Prefixes [22.721425502443253]
We introduce PrefixRL, where we condition on the prefix of successful off-policy traces and run on-policy RL to complete them. PrefixRL boosts the learning signal on hard problems by modulating the difficulty of the problem through the off-policy prefix length. We prove that the PrefixRL objective is not only consistent with the standard RL objective but also more efficient.
arXiv Detail & Related papers (2026-01-26T18:57:00Z) - FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers [51.56484100374058]
FlowLPS solves inverse problems with pretrained flow models via a Langevin Proximal Sampling (LPS) strategy. Our method integrates Langevin dynamics for manifold-consistent exploration with proximal optimization for precise mode seeking, achieving a superior balance between reconstruction fidelity and perceptual quality.
arXiv Detail & Related papers (2025-12-08T04:18:13Z) - Solving Diffusion Inverse Problems with Restart Posterior Sampling [2.9527010146189556]
Inverse problems are fundamental to science and engineering, where the goal is to infer an underlying signal or state from noisy measurements. Recent approaches employ diffusion models as powerful implicit priors for such problems, owing to their ability to capture complex data distributions. We propose Restart for Posterior Sampling (RePS), a general and efficient framework for solving both linear and non-linear inverse problems.
arXiv Detail & Related papers (2025-11-24T20:42:33Z) - Latent Refinement via Flow Matching for Training-free Linear Inverse Problem Solving [18.226350407462643]
We propose LFlow, a training-free framework for solving linear inverse problems via pretrained latent flow priors. Our proposed method outperforms state-of-the-art latent diffusion solvers in reconstruction quality across most tasks.
arXiv Detail & Related papers (2025-11-08T21:20:59Z) - PRISM: Probabilistic and Robust Inverse Solver with Measurement-Conditioned Diffusion Prior for Blind Inverse Problems [9.499981934098722]
We introduce a novel probabilistic and robust inverse solver with measurement-conditioned diffusion prior (PRISM). PRISM offers a technical advancement over current methods by incorporating a powerful measurement-conditioned diffusion model into a theoretically principled posterior sampling scheme. Experiments on blind image deblurring validate the effectiveness of the proposed method, demonstrating the superior performance of PRISM over state-of-the-art baselines in both image and blur kernel recovery.
arXiv Detail & Related papers (2025-09-19T15:49:03Z) - Projection-Based Correction for Enhancing Deep Inverse Networks [3.5534933448684134]
We introduce a projection-based correction method to enhance the inference of deep inverse networks. We theoretically demonstrate that if the recovery model is a well-trained deep inverse network, the solution can be decomposed into range-space and null-space components.
arXiv Detail & Related papers (2025-05-21T17:28:14Z) - Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning [54.07840818762834]
Conditional decision generation with diffusion models has shown powerful competitiveness in reinforcement learning (RL). Recent studies reveal the relation between energy-function-guidance diffusion models and constrained RL problems. The main challenge lies in estimating the intermediate energy, which is intractable due to the log-expectation formulation during the generation process.
arXiv Detail & Related papers (2025-05-03T14:00:25Z) - Boosting Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation [27.898573891403075]
Diffusion-based Neural Combinatorial Optimization (NCO) has demonstrated effectiveness in solving NP-complete (NPC) problems by learning discrete diffusion models for solution generation, eliminating the need for hand-crafted domain knowledge. Existing NCO methods face challenges in cross-scale and cross-problem generalization, as well as high training costs compared to traditional solvers. We propose a training-free inference time adaptation framework (DIFU-Ada) that enables both zero-shot cross-problem transfer and cross-scale generalization in diffusion-based NCO solvers without requiring additional training.
arXiv Detail & Related papers (2025-02-15T08:04:00Z) - Exploratory Diffusion Policy for Unsupervised Reinforcement Learning [28.413426177336703]
Unsupervised reinforcement learning aims to pre-train agents by exploring states or skills in reward-free environments.
Existing methods often overlook the fitting ability of pre-trained policies and struggle to handle the heterogeneous pre-training data.
We propose Exploratory Diffusion Policy (EDP), which leverages the strong expressive ability of diffusion models to fit the explored data.
arXiv Detail & Related papers (2025-02-11T05:48:51Z) - DDRM-PR: Fourier Phase Retrieval using Denoising Diffusion Restoration Models [0.0]
This paper exploits the efficient and unsupervised posterior sampling framework of Denoising Diffusion Restoration Models (DDRM). The approach combines model-based alternating-projection methods with DDRM to utilize pretrained unconditional diffusion priors for phase retrieval.
arXiv Detail & Related papers (2025-01-06T14:18:23Z) - A Survey on Diffusion Models for Inverse Problems [110.6628926886398]
We provide an overview of methods that utilize pre-trained diffusion models to solve inverse problems without requiring further training.
We discuss specific challenges and potential solutions associated with using latent diffusion models for inverse problems.
arXiv Detail & Related papers (2024-09-30T17:34:01Z) - Amortized Posterior Sampling with Diffusion Prior Distillation [55.03585818289934]
Amortized Posterior Sampling is a novel variational inference approach for efficient posterior sampling in inverse problems. Our method trains a conditional flow model to minimize the divergence between the variational distribution and the posterior distribution implicitly defined by the diffusion model. Unlike existing methods, our approach is unsupervised, requires no paired training data, and is applicable to both Euclidean and non-Euclidean domains.
arXiv Detail & Related papers (2024-07-25T09:53:12Z) - Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems [12.482127049881026]
We propose a novel approach to solve inverse problems with a diffusion prior from an amortized variational inference perspective.
Our amortized inference learns a function that directly maps measurements to the implicit posterior distributions of corresponding clean data, enabling single-step posterior sampling even for unseen measurements.
arXiv Detail & Related papers (2024-07-23T02:14:18Z) - Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x} \sim p^{\rm post}(\mathbf{x}) \propto p(\mathbf{x})\,r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior.
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Learning Diffusion Priors from Observations by Expectation Maximization [6.224769485481242]
We present a novel method based on the expectation-maximization algorithm for training diffusion models from incomplete and noisy observations only.
As part of our method, we propose and motivate an improved posterior sampling scheme for unconditional diffusion models.
arXiv Detail & Related papers (2024-05-22T15:04:06Z) - Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance [52.093434664236014]
Recent diffusion models provide a promising zero-shot solution to noisy linear inverse problems without retraining for specific inverse problems.
Building on this line of work, we propose to improve recent methods by using a more principled covariance determined by maximum likelihood estimation.
arXiv Detail & Related papers (2024-02-03T13:35:39Z) - Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
We employ a shift control module that works on h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing them, the model will spontaneously discover disentangled and interpretable directions.
arXiv Detail & Related papers (2023-10-15T18:44:30Z) - Unmasking Bias in Diffusion Model Training [40.90066994983719]
Denoising diffusion models have emerged as a dominant approach for image generation.
They still suffer from slow convergence in training and color shift issues in sampling.
In this paper, we identify that these obstacles can be largely attributed to bias and suboptimality inherent in the default training paradigm.
arXiv Detail & Related papers (2023-10-12T16:04:41Z) - Training-free Linear Image Inverses via Flows [17.291903204982326]
We propose a training-free method for solving linear inverse problems by using pretrained flow models.
Our approach requires no problem-specific tuning across an extensive suite of noisy linear inverse problems on high-dimensional datasets.
arXiv Detail & Related papers (2023-09-25T22:13:16Z) - Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [70.20191211010847]
Offline reinforcement learning (RL) aims to learn an optimal policy using a previously collected static dataset.
We introduce Diffusion Q-learning (Diffusion-QL) that utilizes a conditional diffusion model to represent the policy.
We show that our method can achieve state-of-the-art performance on the majority of the D4RL benchmark tasks.
arXiv Detail & Related papers (2022-08-12T09:54:11Z)
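For context on the last entry above, a rough sketch of the Diffusion-QL idea as summarized there: the policy is a conditional diffusion model over actions, trained with the standard denoising (behavior-cloning) loss on dataset actions plus a term that steers sampled actions toward high critic values. The interfaces below (`policy.denoising_loss`, `policy.sample`, `critic`) are assumed placeholders, not the paper's code.

```python
# Illustrative sketch of a Diffusion-QL-style policy update: diffusion
# behavior-cloning loss on dataset actions plus Q-value guidance on actions
# sampled from the diffusion policy. All interfaces are assumed placeholders.
def diffusion_ql_policy_loss(policy, critic, states, actions, alpha=1.0):
    # Denoising loss: reconstruct dataset actions conditioned on states
    # (the behavior-cloning / regularization term).
    bc_loss = policy.denoising_loss(actions, cond=states)
    # Sample actions from the current diffusion policy and push them
    # toward high Q-values under the critic.
    sampled_actions = policy.sample(cond=states)
    q_loss = -critic(states, sampled_actions).mean()
    return bc_loss + alpha * q_loss
```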
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.