MAP Estimation with Denoisers: Convergence Rates and Guarantees
- URL: http://arxiv.org/abs/2507.15397v1
- Date: Mon, 21 Jul 2025 08:59:33 GMT
- Title: MAP Estimation with Denoisers: Convergence Rates and Guarantees
- Authors: Scott Pesme, Giacomo Meanti, Michael Arbel, Julien Mairal
- Abstract summary: We show that a simple algorithm converges to the proximal operator under a log-concavity assumption on the prior $p$. We show that this algorithm can be interpreted as a gradient descent on smoothed proximal objectives.
- Score: 37.88502562012743
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Denoiser models have become powerful tools for inverse problems, enabling the use of pretrained networks to approximate the score of a smoothed prior distribution. These models are often used in heuristic iterative schemes aimed at solving Maximum a Posteriori (MAP) optimisation problems, where the proximal operator of the negative log-prior plays a central role. In practice, this operator is intractable, and practitioners plug in a pretrained denoiser as a surrogate, despite the lack of general theoretical justification for this substitution. In this work, we show that a simple algorithm, closely related to several used in practice, provably converges to the proximal operator under a log-concavity assumption on the prior $p$. We show that this algorithm can be interpreted as a gradient descent on smoothed proximal objectives. Our analysis thus provides a theoretical foundation for a class of empirically successful but previously heuristic methods.
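Below is a minimal sketch of the plug-and-play pattern the abstract describes: a gradient step on the data-fidelity term followed by a pretrained denoiser standing in for the intractable proximal operator of the negative log-prior. The callables `forward_op`, `adjoint_op`, and `denoiser` are placeholders, and the loop is the generic PnP proximal-gradient recipe, not the specific algorithm analysed in the paper.

```python
def pnp_proximal_gradient(y, forward_op, adjoint_op, denoiser, step=1e-2, n_iters=200):
    """Generic plug-and-play proximal gradient sketch.

    Alternates a gradient step on the quadratic data-fidelity term
    0.5 * ||A x - y||^2 (A given by `forward_op`, A^T by `adjoint_op`)
    with a pretrained `denoiser` used as a surrogate for the proximal
    operator of the negative log-prior.
    """
    x = adjoint_op(y)                          # crude initialisation from the measurements
    for _ in range(n_iters):
        grad = adjoint_op(forward_op(x) - y)   # gradient of the data-fidelity term
        x = denoiser(x - step * grad)          # denoiser in place of the intractable prox
    return x
```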
Related papers
- Why Heuristic Weighting Works: A Theoretical Analysis of Denoising Score Matching [7.800608181419997]
A weighting function has been used for the denoising score matching loss without formal justification. In this work, we demonstrate that heteroscedasticity is an inherent property of the denoising score matching objective. This insight leads to a principled derivation of optimal weighting functions for generalized, arbitrary-order denoising score matching losses.
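As a point of reference, here is a hedged sketch of a weighted denoising score matching loss in PyTorch; `weight_fn` is an arbitrary per-noise-level weighting, whereas the paper derives a specific optimal choice that this snippet does not reproduce. Inputs are assumed to be a batch `x` of shape (batch, d) and per-sample noise levels `sigmas` of shape (batch, 1).

```python
import torch

def weighted_dsm_loss(score_model, x, sigmas, weight_fn):
    """Weighted denoising score matching: perturb clean samples with Gaussian
    noise at levels `sigmas`, regress the model output onto the score of the
    perturbation kernel, and weight each sample by weight_fn(sigma)."""
    noise = torch.randn_like(x)
    x_noisy = x + sigmas * noise
    target = -noise / sigmas                        # score of the Gaussian perturbation kernel
    pred = score_model(x_noisy, sigmas)
    per_sample = ((pred - target) ** 2).sum(dim=1)  # squared error per sample
    return (weight_fn(sigmas).squeeze(-1) * per_sample).mean()
```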
arXiv Detail & Related papers (2025-08-03T05:35:20Z) - Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving model performance. This paper addresses the question of how to optimally combine the model's predictions and the provided labels. Our main contribution is the derivation of the Bayes-optimal aggregator function to combine the current model's predictions and the given labels.
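For intuition, here is an illustrative self-training loop that blends the given labels with the model's current predictions using a fixed mixing weight; the constant `alpha` is a placeholder, whereas the paper derives the Bayes-optimal aggregation rule. A scikit-learn-style binary classifier with `fit` and `predict_proba` is assumed.

```python
import numpy as np

def retrain_with_blended_labels(model, X, noisy_labels, alpha=0.5, rounds=3):
    """Toy self-training loop: refit on pseudo-labels, then blend the provided
    (possibly noisy) labels with the model's predicted probabilities.
    The fixed `alpha` stands in for the paper's Bayes-optimal aggregator."""
    targets = np.asarray(noisy_labels, dtype=float)
    for _ in range(rounds):
        model.fit(X, (targets >= 0.5).astype(int))              # refit on current hard pseudo-labels
        probs = model.predict_proba(X)[:, 1]                    # belief about the positive class
        targets = alpha * noisy_labels + (1.0 - alpha) * probs  # blend labels and predictions
    return model
```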
arXiv Detail & Related papers (2025-05-21T07:16:44Z) - Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
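The following single-worker caricature shows what a normalized error feedback update looks like; the compressor `compress` (e.g. top-k or sign) is a placeholder, and the distributed setting and the $(L_0,L_1)$-smoothness analysis from the paper are not reflected here.

```python
import numpy as np

def normalized_ef_step(x, grad, error, compress, lr):
    """One error-feedback step with a normalized update: compress the
    error-corrected gradient, carry the dropped part forward, and scale the
    transmitted direction to unit norm before applying it."""
    corrected = grad + error                       # add back what the compressor previously dropped
    compressed = compress(corrected)               # lossy compression of the update
    new_error = corrected - compressed             # memory of the compression error
    direction = compressed / (np.linalg.norm(compressed) + 1e-12)
    return x - lr * direction, new_error           # normalization permits larger stepsizes
```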
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
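A minimal sketch of the interpolation pattern, assuming parameters stored as a NumPy array and an arbitrary `inner_step` update: run a few inner steps, then move the anchor only part of the way toward the result, which is the averaging mechanism the paper analyses via nonexpansive operators.

```python
def stabilized_by_interpolation(params, inner_step, alpha=0.5, k=5, n_outer=100):
    """Lookahead-style linear interpolation: k possibly unstable inner updates,
    then a convex combination of the anchor and the inner iterate."""
    anchor = params
    for _ in range(n_outer):
        fast = anchor
        for _ in range(k):
            fast = inner_step(fast)                   # ordinary optimizer steps
        anchor = anchor + alpha * (fast - anchor)     # interpolate back toward the anchor
    return anchor
```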
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Conjugated Discrete Distributions for Distributional Reinforcement Learning [0.0]
We show that one of the most successful methods may not yield an optimal policy when the underlying process is non-deterministic.
We argue that distributional reinforcement learning lends itself to remedy this situation completely.
arXiv Detail & Related papers (2021-12-14T14:14:49Z) - Beyond variance reduction: Understanding the true impact of baselines on policy optimization [24.09670734037029]
We show that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates.
We present theoretical results showing that, at least for bandit problems, curvature and noise are not sufficient to explain the learning dynamics.
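To make the object under discussion concrete, here is a textbook policy-gradient estimate with a baseline; subtracting the baseline leaves the estimate unbiased but changes its variance, and the paper's point is that variance alone does not determine the resulting learning dynamics.

```python
import numpy as np

def reinforce_with_baseline(log_prob_grads, returns, baseline):
    """Average the per-trajectory gradients of log-probabilities weighted by
    baseline-shifted returns (standard REINFORCE-with-baseline estimator)."""
    advantages = np.asarray(returns) - baseline
    return np.mean([a * g for a, g in zip(advantages, log_prob_grads)], axis=0)
```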
arXiv Detail & Related papers (2020-08-31T17:52:09Z) - Wasserstein-based Projections with Applications to Inverse Problems [25.517805029337254]
Inverse problems consist of recovering a signal from a collection of noisy measurements.
Recent Plug-and-Play works propose replacing the operator for analytic regularization in optimization methods with a data-driven denoiser.
We present a new algorithm that takes samples from the manifold of true data as input and outputs an approximation of the projection operator onto this manifold.
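As a rough illustration of learning such a projection from samples, the snippet below trains a network to map perturbed samples back onto the data manifold; this is a generic denoising-style surrogate sketched with standard PyTorch tooling, not the paper's Wasserstein-based construction.

```python
import torch

def train_approximate_projector(net, data_loader, noise_std=0.1, epochs=5, lr=1e-3):
    """Train `net` to send points pushed off the data manifold back to the
    clean samples, yielding an approximate projection operator that can be
    plugged into iterative inverse-problem solvers."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        for x in data_loader:                             # batches of true-data samples
            x_off = x + noise_std * torch.randn_like(x)   # perturb off the manifold
            loss = ((net(x_off) - x) ** 2).mean()         # pull back to the clean sample
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net
```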
arXiv Detail & Related papers (2020-08-05T15:58:55Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
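A simple thresholding scheme of this flavour, sketched below under the assumption that `A` is a symmetric covariance-like matrix: keep the k largest-magnitude coordinates of the leading eigenvector and renormalise. The paper's guarantees concern an algorithm of this type, not necessarily this exact variant.

```python
import numpy as np

def thresholded_spca_direction(A, k):
    """Sparse principal direction by thresholding the leading eigenvector of A."""
    _, eigvecs = np.linalg.eigh(A)
    v = eigvecs[:, -1]                        # leading (dense) principal direction
    keep = np.argsort(np.abs(v))[-k:]         # indices of the k largest-magnitude entries
    sparse_v = np.zeros_like(v)
    sparse_v[keep] = v[keep]
    return sparse_v / np.linalg.norm(sparse_v)
```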
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - On the Convergence Rate of Projected Gradient Descent for a Back-Projection based Objective [58.33065918353532]
We consider a back-projection (BP) based fidelity term as an alternative to the common least squares (LS) term.
We show that using the BP term, rather than the LS term, requires fewer iterations of optimization algorithms.
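For comparison, here are the gradients of the two fidelity terms being contrasted, written as a hedged NumPy sketch; the projected-gradient wrapper and the paper's convergence rates are omitted.

```python
import numpy as np

def ls_grad(x, A, y):
    """Gradient of the least-squares fidelity term 0.5 * ||A x - y||^2."""
    return A.T @ (A @ x - y)

def bp_grad(x, A, y, A_pinv=None):
    """Gradient of the back-projection fidelity term 0.5 * ||A^+ (A x - y)||^2,
    where A^+ is the pseudo-inverse; the residual is preconditioned, which is
    the intuition behind needing fewer iterations than with the LS term."""
    if A_pinv is None:
        A_pinv = np.linalg.pinv(A)
    return A.T @ (A_pinv.T @ (A_pinv @ (A @ x - y)))
```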
arXiv Detail & Related papers (2020-05-03T00:58:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.