Estimating Model Uncertainty of Neural Networks in Sparse Information Form
- URL: http://arxiv.org/abs/2006.11631v1
- Date: Sat, 20 Jun 2020 18:09:59 GMT
- Title: Estimating Model Uncertainty of Neural Networks in Sparse Information Form
- Authors: Jongseok Lee, Matthias Humt, Jianxiang Feng, Rudolph Triebel
- Abstract summary: We present a sparse representation of model uncertainty for Deep Neural Networks (DNNs).
The key insight of our work is that the information matrix tends to be sparse in its spectrum.
We show that the information form can be scalably applied to represent model uncertainty in DNNs.
- Score: 39.553268191681376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a sparse representation of model uncertainty for Deep Neural
Networks (DNNs) where the parameter posterior is approximated with an inverse
formulation of the Multivariate Normal Distribution (MND), also known as the
information form. The key insight of our work is that the information matrix,
i.e., the inverse of the covariance matrix, tends to be sparse in its spectrum.
Therefore, dimensionality reduction techniques such as low rank approximations
(LRA) can be effectively exploited. To achieve this, we develop a novel
sparsification algorithm and derive a cost-effective analytical sampler. As a
result, we show that the information form can be scalably applied to represent
model uncertainty in DNNs. Our exhaustive theoretical analysis and empirical
evaluations on various benchmarks show the competitiveness of our approach over
the current methods.
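The core recipe the abstract describes, a low-rank approximation (LRA) of the information matrix plus an analytical sampler, can be sketched in a few lines of NumPy. The toy matrix, the retained rank k, and the constant tail value delta below are illustrative assumptions, not the authors' sparsification algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 500, 10                        # parameter dimension, retained rank (assumed)

# Toy information matrix (inverse covariance) with a sparse spectrum:
# a few dominant eigenvalues on top of a small diagonal term.
B = rng.normal(size=(d, k))
Lam = B @ B.T + 1e-2 * np.eye(d)

# Low-rank approximation: keep only the top-k eigenpairs.
w, V = np.linalg.eigh(Lam)            # eigenvalues in ascending order
U, s = V[:, -k:], w[-k:]              # top-k eigenvectors / eigenvalues
delta = w[:-k].mean()                 # replace the tail spectrum by a constant

# With Lam ~= U diag(s) U^T + delta * (I - U U^T), we get
# Lam^{-1/2} = U diag(s^{-1/2}) U^T + delta^{-1/2} (I - U U^T),
# so samples from N(mu, Lam^{-1}) need no explicit inversion of Lam.
mu = np.zeros(d)                      # posterior mean (e.g. the trained weights)
eps = rng.normal(size=d)
z = U.T @ eps
sample = mu + eps / np.sqrt(delta) + U @ ((1.0 / np.sqrt(s) - 1.0 / np.sqrt(delta)) * z)
```

Because each draw only needs matrix-vector products with the d-by-k basis U, sampling costs O(dk) rather than the O(d^3) of a dense factorization, which is the kind of saving that makes the information form scalable.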
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct a generalization error analysis to reveal the limitations of the current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
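Since the entry above centers on the InfoNCE-based contrastive loss, here is a minimal NumPy sketch of that objective; the temperature value and the batch construction are illustrative assumptions:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Minimal InfoNCE: each anchor's positive is the matching row of
    `positives`; all other rows in the batch act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_softmax.diagonal().mean()          # positives sit on the diagonal

# Toy usage: positives are noisy copies of the anchor embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
loss = info_nce(x, x + 0.1 * rng.normal(size=x.shape))
```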
- NeurAM: nonlinear dimensionality reduction for uncertainty quantification through neural active manifolds [0.6990493129893112]
We leverage autoencoders to discover a one-dimensional neural active manifold (NeurAM) capturing the model output variability.
We show how NeurAM can be used to obtain multifidelity sampling estimators with reduced variance.
arXiv Detail & Related papers (2024-08-07T04:27:58Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Reinforcing POD-based model reduction techniques in reaction-diffusion complex networks using stochastic filtering and pattern recognition [0.09324035015689712]
Complex networks are used to model many real-world systems.
Dimensionality reduction techniques like POD can be used in such cases.
We propose an algorithmic framework that combines techniques from pattern recognition and filtering theory.
arXiv Detail & Related papers (2023-07-19T05:45:05Z)
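For reference, the POD step that the entry above builds on reduces to an SVD of a snapshot matrix. The sketch below uses toy data and an assumed rank r; it is not the paper's filtering-augmented framework:

```python
import numpy as np

rng = np.random.default_rng(0)
# Snapshot matrix: each column is the system state at one time step (toy data).
n_states, n_snapshots, r = 200, 50, 5
snapshots = rng.normal(size=(n_states, r)) @ rng.normal(size=(r, n_snapshots))

# POD: the leading left singular vectors of the snapshot matrix form an
# energy-optimal low-dimensional basis for the dynamics.
U, sval, _ = np.linalg.svd(snapshots, full_matrices=False)
basis = U[:, :r]                       # reduced basis (n_states x r)
reduced = basis.T @ snapshots          # project snapshots onto the basis
reconstruction = basis @ reduced
print(np.linalg.norm(snapshots - reconstruction))   # ~0 for exactly rank-r data
```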
- Proximal Symmetric Non-negative Latent Factor Analysis: A Novel Approach to Highly-Accurate Representation of Undirected Weighted Networks [2.1797442801107056]
An Undirected Weighted Network (UWN) is commonly found in big-data-related applications.
Existing models fail either to model its intrinsic symmetry or to handle its low data density.
A Proximal Symmetric Nonnegative Latent-factor-analysis model is proposed.
arXiv Detail & Related papers (2023-06-06T13:03:24Z)
- Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for sparse recovery [87.28082715343896]
We consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications.
We design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem.
The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems.
arXiv Detail & Related papers (2021-10-20T06:15:45Z)
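Algorithm unfolding, the technique the REST entry above relies on, turns the iterations of a sparse-recovery solver into a fixed-depth computation. A minimal sketch with plain (non-learned) ISTA iterations follows; the operator, step size, and threshold are illustrative assumptions, not the REST architecture:

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unrolled_ista(y, A, n_layers=10, step=None, thresh=0.1):
    """Unroll ISTA for min_x 0.5*||y - A x||^2 + thresh*||x||_1 into a fixed
    number of 'layers'; in learned variants (e.g. LISTA-style networks),
    the per-layer step sizes and thresholds become trainable."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L ensures convergence
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(x + step * A.T @ (y - A @ x), step * thresh)
    return x

# Toy usage: recover a sparse vector from compressive measurements.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = 1.0
x_hat = unrolled_ista(A @ x_true, A, n_layers=200)
```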
- Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network driven by an orthonormal input to a single white noise source adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves MINE estimation in terms of data efficiency and variance, and that it compares favorably with conventional and variational Gaussian mixture models as well as in training adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- Bayesian Imaging With Data-Driven Priors Encoded by Neural Networks: Theory, Methods, and Algorithms [2.266704469122763]
This paper proposes a new methodology for performing Bayesian inference in imaging inverse problems where the prior knowledge is available in the form of training data.
We establish the existence and well-posedness of the associated posterior moments under easily verifiable conditions.
A model accuracy analysis suggests that the Bayesian probabilities reported by the data-driven models are also remarkably accurate under a frequentist definition.
arXiv Detail & Related papers (2021-03-18T11:34:08Z)
- GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.