Related papers: A Framework for Improving the Reliability of Black-box Variational Inference

A Framework for Improving the Reliability of Black-box Variational Inference

URL: http://arxiv.org/abs/2203.15945v2
Date: Thu, 16 May 2024 16:06:36 GMT
Title: A Framework for Improving the Reliability of Black-box Variational Inference
Authors: Manushi Welandawe, Michael Riis Andersen, Aki Vehtari, Jonathan H. Huggins,
Abstract summary: Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics. We propose Robust and Automated Black-box VI (RABVI), a framework for improving the reliability of BBVI optimization. RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation.
Score: 9.621959865172549
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics as a fast yet flexible alternative to Markov chain Monte Carlo methods for approximate Bayesian inference. However, stochastic optimization methods for BBVI remain unreliable and require substantial expertise and hand-tuning to apply effectively. In this paper, we propose Robust and Automated Black-box VI (RABVI), a framework for improving the reliability of BBVI optimization. RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation. RABVI adaptively decreases the learning rate by detecting convergence of the fixed--learning-rate iterates, then estimates the symmetrized Kullback--Leibler (KL) divergence between the current variational approximation and the optimal one. It also employs a novel optimization termination criterion that enables the user to balance desired accuracy against computational cost by comparing (i) the predicted relative decrease in the symmetrized KL divergence if a smaller learning were used and (ii) the predicted computation required to converge with the smaller learning rate. We validate the robustness and accuracy of RABVI through carefully designed simulation studies and on a diverse set of real-world model and data examples.

Related papers

Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [60.414548453838506]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function.<n>GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency.<n>We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $mathcalO(1)$ time and space complexities per round.
arXiv Detail & Related papers (2025-07-16T02:24:21Z)
Sample Margin-Aware Recalibration of Temperature Scaling [20.87493013833571]
Recent advances in deep learning have significantly improved predictive accuracy.<n>Modern neural networks remain systematically overconfident, posing risks for deployment in safety-critical scenarios.<n>We propose a lightweight, data-efficient recalibration method that precisely scales logits based on the margin between the top two logits.
arXiv Detail & Related papers (2025-06-30T03:35:05Z)
Exploring the Potential of Bilevel Optimization for Calibrating Neural Networks [0.7499722271664147]
Modern neural networks are poorly calibrated, resulting in predicted confidence scores that are difficult to use. This article explores improving confidence estimation and calibration through the application of bilevel optimization. A self-calibrating bilevel neural-network training approach is introduced to improve a model's predicted confidence scores.
arXiv Detail & Related papers (2025-03-17T12:34:55Z)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning. These problems are often formalized as Bi-Level optimizations (BLO) We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
A constrained optimization approach to improve robustness of neural networks [1.2338729811609355]
We present a novel nonlinear programming-based approach to fine-tune pre-trained neural networks to improve robustness against adversarial attacks while maintaining accuracy on clean data.
arXiv Detail & Related papers (2024-09-18T18:37:14Z)
Gradient-free variational learning with conditional mixture networks [39.827869318925494]
Conditional mixture networks (CMNs) are suitable for fast, gradient-free inference and can solve complex classification tasks. We validate this approach by training two-layer CMNs on standard benchmarks from the UCI repository. Our method, CAVI-CMN, achieves competitive and often superior predictive accuracy compared to maximum likelihood estimation (MLE) with backpropagation.
arXiv Detail & Related papers (2024-08-29T10:43:55Z)
Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model. By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation. It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models. We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix. Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z)
Scalable Stochastic Parametric Verification with Stochastic Variational Smoothed Model Checking [1.5293427903448025]
Smoothed model checking (smMC) aims at inferring the satisfaction function over the entire parameter space from a limited set of observations. In this paper, we exploit recent advances in probabilistic machine learning to push this limitation forward. We compare the performances of smMC against those of SV-smMC by looking at the scalability, the computational efficiency and the accuracy of the reconstructed satisfaction function.
arXiv Detail & Related papers (2022-05-11T10:43:23Z)
A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics [41.99023989695363]
We explore the potential for semiparametric theory to be used to improve neural networks and machine learning algorithms. We propose a new neural network method MultiNet, which seeks the flexibility and diversity of an ensemble using a single architecture.
arXiv Detail & Related papers (2022-02-18T09:35:51Z)
Learning to Estimate Without Bias [57.82628598276623]
Gauss theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimation (MVUE) in linear models. In this paper, we take a first step towards extending this result to non linear settings via deep learning with bias constraints. A second motivation to BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference. Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
Learning Prediction Intervals for Regression: Generalization and Calibration [12.576284277353606]
We study the generation of prediction intervals in regression for uncertainty quantification. We use a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes. We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
arXiv Detail & Related papers (2021-02-26T17:55:30Z)
Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way. Corpor is based on non-parametric isotonic regression and implemented via the Pool-adjacent-violators (PAV) algorithm.
arXiv Detail & Related papers (2020-08-07T08:22:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.