An Energy-Based Prior for Generative Saliency
- URL: http://arxiv.org/abs/2204.08803v3
- Date: Tue, 27 Jun 2023 06:51:25 GMT
- Title: An Energy-Based Prior for Generative Saliency
- Authors: Jing Zhang, Jianwen Xie, Nick Barnes, Ping Li
- Abstract summary: We propose a novel generative saliency prediction framework that adopts an informative energy-based model as a prior distribution.
With the generative saliency model, we can obtain a pixel-wise uncertainty map from an image, indicating model confidence in the saliency prediction.
Experimental results show that our generative saliency model with an energy-based prior can achieve not only accurate saliency predictions but also reliable uncertainty maps consistent with human perception.
- Score: 62.79775297611203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel generative saliency prediction framework that adopts an
informative energy-based model as a prior distribution. The energy-based prior
model is defined on the latent space of a saliency generator network that
generates the saliency map from continuous latent variables and an observed
image. The parameters of the saliency generator and the energy-based prior are
jointly trained via Markov chain Monte Carlo-based maximum likelihood
estimation, in which sampling from the intractable posterior and prior
distributions of the latent variables is performed by Langevin dynamics. With
the generative saliency model, we can obtain a pixel-wise uncertainty map from
an image, indicating model confidence in the saliency prediction. Unlike
existing generative models, which define the prior distribution of the latent
variables as a simple isotropic Gaussian, our model uses an informative
energy-based prior that can be more expressive in capturing the latent space
of the data. Relaxing the Gaussian assumption in this way yields a more
representative latent distribution and, in turn, more reliable uncertainty
estimation. We apply the proposed frameworks to both RGB and RGB-D
salient object detection tasks with both transformer and convolutional neural
network backbones. We further propose an adversarial learning algorithm and a
variational inference algorithm as alternatives to train the proposed
generative framework. Experimental results show that our generative saliency
model with an energy-based prior can achieve not only accurate saliency
predictions but also reliable uncertainty maps that are consistent with human
perception. Results and code are available at
https://github.com/JingZhang617/EBMGSOD.
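To make the Langevin-based training step concrete, the sketch below shows short-run Langevin sampling from an exponentially tilted Gaussian prior on the latent code and from the corresponding posterior, which is the core operation of the MCMC-based maximum likelihood procedure described above. The MLP tilting network, step sizes, noise level sigma, and the `generator(z, image)` signature are illustrative assumptions, not the released EBMGSOD implementation.

```python
import torch
import torch.nn as nn

class LatentEBM(nn.Module):
    """Tilting function f_alpha(z); the prior is exp(f_alpha(z)) * N(z; 0, I)
    up to normalization (a hypothetical stand-in for the paper's prior model)."""
    def __init__(self, z_dim=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z):
        return self.net(z).squeeze(-1)          # f_alpha(z), one scalar per sample


def prior_energy(ebm, z):
    # U_prior(z) = -f_alpha(z) + ||z||^2 / 2  (tilted standard Gaussian)
    return -ebm(z) + 0.5 * (z ** 2).sum(dim=-1)


def posterior_energy(ebm, generator, z, image, gt_saliency, sigma=0.3):
    # U_post(z) = U_prior(z) + ||y - g_theta(z, x)||^2 / (2 sigma^2);
    # `generator` is any saliency decoder g_theta(z, image) (hypothetical signature).
    recon = ((generator(z, image) - gt_saliency) ** 2).flatten(1).sum(dim=-1)
    return prior_energy(ebm, z) + recon / (2.0 * sigma ** 2)


def langevin_sample(z_init, energy_fn, n_steps=60, step_size=0.1):
    """Short-run Langevin dynamics: z <- z - (s^2 / 2) * dU/dz + s * noise."""
    z = z_init.clone().detach().requires_grad_(True)
    for _ in range(n_steps):
        grad = torch.autograd.grad(energy_fn(z).sum(), z)[0]
        z = z - 0.5 * step_size ** 2 * grad + step_size * torch.randn_like(z)
        z = z.detach().requires_grad_(True)
    return z.detach()
```

In this sketch, latents sampled from the posterior would drive the generator's reconstruction loss, while the gap between f_alpha evaluated at posterior and at prior samples would supply the maximum-likelihood gradient for the energy-based prior.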
Related papers
- Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data.
We train the model using maximum likelihood estimation with Markov chain Monte Carlo.
Experiments on oscillating systems, videos and real-world state sequences (MuJoCo) illustrate that ODEs with the learnable energy-based prior outperform existing counterparts.
arXiv Detail & Related papers (2024-09-05T18:14:22Z)
- Correntropy-Based Improper Likelihood Model for Robust Electrophysiological Source Imaging [18.298620404141047]
Existing source imaging algorithms utilize the Gaussian assumption for the observation noise to build the likelihood function for Bayesian inference.
The electromagnetic measurements of brain activity are usually affected by miscellaneous artifacts, leading to a potentially non-Gaussian distribution for the observation noise.
We propose a new likelihood model which is robust with respect to non-Gaussian noises.
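As a rough illustration only (not the paper's actual formulation), a correntropy-style loss replaces the squared error implied by a Gaussian likelihood with a kernel that saturates for large residuals, so outliers caused by non-Gaussian noise are down-weighted:

```python
import torch

def correntropy_loss(residual, sigma=1.0):
    """Correntropy-induced loss with a Gaussian kernel: behaves like the squared
    error for small residuals but saturates for large ones, limiting the influence
    of non-Gaussian artifacts (illustrative sketch, not the paper's model)."""
    return (1.0 - torch.exp(-residual ** 2 / (2.0 * sigma ** 2))).mean()
```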
arXiv Detail & Related papers (2024-08-27T07:54:15Z)
- A Non-negative VAE: the Generalized Gamma Belief Network [49.970917207211556]
The gamma belief network (GBN) has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data.
We introduce the generalized gamma belief network (Generalized GBN) in this paper, which extends the original linear generative model to a more expressive non-linear generative model.
We also propose an upward-downward Weibull inference network to approximate the posterior distribution of the latent variables.
arXiv Detail & Related papers (2024-08-06T18:18:37Z)
- Variational Potential Flow: A Novel Probabilistic Framework for Energy-Based Generative Modelling [10.926841288976684]
We present a novel energy-based generative framework, Variational Potential Flow (VAPO).
VAPO aims to learn a potential energy function whose gradient (flow) guides the prior samples, so that their density evolution closely follows an approximate data likelihood homotopy.
Images can be generated after training the potential energy, by initializing the samples from Gaussian prior and solving the ODE governing the potential flow on a fixed time interval.
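A minimal sketch of this generation step, assuming a learned time-conditioned potential callable `potential(x, t)` and a plain fixed-step Euler solver; the exact flow field, time parameterization, and solver used by VAPO may differ:

```python
import torch

def generate_by_potential_flow(potential, n_samples, shape, n_steps=100, t_end=1.0):
    """Draw Gaussian prior samples and Euler-integrate dx/dt = -grad_x Phi(x, t)
    over the fixed interval [0, t_end] (illustrative sketch)."""
    dt = t_end / n_steps
    x = torch.randn(n_samples, *shape)              # samples from the Gaussian prior
    for k in range(n_steps):
        t = torch.full((n_samples, 1), k * dt)
        x = x.detach().requires_grad_(True)
        phi = potential(x, t).sum()                 # scalar potential energy
        flow = -torch.autograd.grad(phi, x)[0]      # flow field = negative gradient
        x = (x + dt * flow).detach()                # explicit Euler step
    return x
```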
arXiv Detail & Related papers (2024-07-21T18:08:12Z)
- Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Images generated by recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z)
- Accurate generation of stochastic dynamics based on multi-model Generative Adversarial Networks [0.0]
Generative Adversarial Networks (GANs) have shown immense potential in fields such as text and image generation.
Here we quantitatively test this approach by applying it to a prototypical process on a lattice.
Importantly, the discreteness of the model is retained despite the noise.
arXiv Detail & Related papers (2023-05-25T10:41:02Z)
- Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction [51.80191416661064]
We propose a novel vision transformer with latent variables following an informative energy-based prior for salient object detection.
Both the vision transformer network and the energy-based prior model are jointly trained via Markov chain Monte Carlo-based maximum likelihood estimation.
With the generative vision transformer, we can easily obtain a pixel-wise uncertainty map from an image, which indicates the model confidence in predicting saliency from the image.
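In both that work and the paper above, the uncertainty map can be read as the per-pixel spread of saliency maps produced under different latent samples; a minimal sketch (with a hypothetical `generator(z, image)` returning logits, and plain Gaussian latents standing in for samples from the learned prior) is:

```python
import torch

@torch.no_grad()
def saliency_with_uncertainty(generator, image, z_dim=32, n_samples=10):
    """Return the mean saliency prediction and a per-pixel variance map."""
    preds = []
    for _ in range(n_samples):
        z = torch.randn(image.size(0), z_dim)       # Gaussian stand-in for prior samples
        preds.append(torch.sigmoid(generator(z, image)))
    preds = torch.stack(preds, dim=0)               # [n_samples, B, 1, H, W]
    return preds.mean(dim=0), preds.var(dim=0)      # prediction, uncertainty map
```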
arXiv Detail & Related papers (2021-12-27T06:04:33Z)
- Energy-Based Generative Cooperative Saliency Prediction [44.85865238229076]
We study the saliency prediction problem from the perspective of generative models.
We propose a generative cooperative saliency prediction framework based on the generative cooperative networks.
Experimental results show that our generative model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-06-25T02:11:50Z)
- Uncertainty Inspired RGB-D Saliency Detection [70.50583438784571]
We propose the first framework to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by this labeling process, we design a generative architecture for probabilistic RGB-D saliency detection.
Results on six challenging RGB-D benchmark datasets show our approach's superior performance in learning the distribution of saliency maps.
arXiv Detail & Related papers (2020-09-07T13:01:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.