Convergence of uncertainty estimates in Ensemble and Bayesian sparse
model discovery
- URL: http://arxiv.org/abs/2301.12649v2
- Date: Wed, 26 Apr 2023 19:30:22 GMT
- Title: Convergence of uncertainty estimates in Ensemble and Bayesian sparse
model discovery
- Authors: L. Mars Gao, Urban Fasel, Steven L. Brunton, J. Nathan Kutz
- Abstract summary: We perform a theoretical study of ensemble sparse model discovery, which shows empirical success in terms of accuracy and robustness to noise, and in particular analyse the bootstrapping-based sequential thresholding least-squares estimator.
We show that this bootstrapping-based ensembling technique can perform a provably correct variable selection procedure with an exponential convergence rate of the error rate.
- Score: 4.446017969073817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse model identification enables nonlinear dynamical system discovery from
data. However, the control of false discoveries for sparse model identification
is challenging, especially in the low-data and high-noise limit. In this paper,
we perform a theoretical study on ensemble sparse model discovery, which shows
empirical success in terms of accuracy and robustness to noise. In particular,
we analyse the bootstrapping-based sequential thresholding least-squares
estimator. We show that this bootstrapping-based ensembling technique can
perform a provably correct variable selection procedure with an exponential
convergence rate of the error rate. In addition, we show that the ensemble
sparse model discovery method can perform computationally efficient uncertainty
estimation, compared to expensive Bayesian uncertainty quantification methods
via MCMC. We demonstrate the convergence properties and connection to
uncertainty quantification in various numerical studies on synthetic sparse
linear regression and sparse model discovery. The experiments on sparse linear
regression support that the bootstrapping-based sequential thresholding
least-squares method has better performance for sparse variable selection
compared to LASSO, thresholding least-squares, and bootstrapping-based LASSO.
In the sparse model discovery experiment, we show that the bootstrapping-based
sequential thresholding least-squares method can provide valid uncertainty
quantification, converging to a delta measure centered around the true value
with increased sample sizes. Finally, we highlight the improved robustness to
hyperparameter selection under shifting noise and sparsity levels of the
bootstrapping-based sequential thresholding least-squares method compared to
other sparse regression methods.
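The estimator analysed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold, the number of bootstrap replicates, and the inclusion-probability cutoff are all assumptions chosen for illustration. The sketch fits sequentially thresholded least squares on bootstrap resamples of a synthetic sparse linear regression problem, keeps the terms that are selected in a large enough fraction of replicates, and reports the spread of the bootstrap coefficients as a rough uncertainty estimate.

```python
import numpy as np


def stlsq(X, y, threshold=0.5, max_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        active = np.abs(coef) >= threshold
        coef[~active] = 0.0
        if not active.any():
            break
        # Refit using only the columns that survived thresholding.
        coef[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        if np.all(np.abs(coef[active]) >= threshold):
            break  # support is stable, stop iterating
    return coef


def bootstrap_stlsq(X, y, n_boot=200, threshold=0.5, cutoff=0.6, seed=0):
    """Run STLSQ on bootstrap resamples (cutoff is an assumed hyperparameter)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    coefs = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        coefs[b] = stlsq(X[idx], y[idx], threshold)
    inclusion = (coefs != 0).mean(axis=0)  # selection frequency per term
    support = inclusion >= cutoff
    mean = np.where(support, coefs.mean(axis=0), 0.0)
    std = coefs.std(axis=0)  # bootstrap spread as an uncertainty proxy
    return mean, std, inclusion


# Synthetic sparse linear regression problem (illustrative values only).
rng = np.random.default_rng(1)
true_coef = np.zeros(10)
true_coef[[0, 2]] = [2.0, -1.5]
X = rng.normal(size=(200, 10))
y = X @ true_coef + 0.1 * rng.normal(size=200)

mean, std, inclusion = bootstrap_stlsq(X, y)
print("inclusion probabilities:", np.round(inclusion, 2))
print("ensemble coefficients:  ", np.round(mean, 3))
```

Thresholding on the inclusion probability rather than on a single fit is what makes the selection step less sensitive to any one noisy sample; the cutoff of 0.6 used here is arbitrary and would need tuning in practice.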
Related papers
- Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization [29.24821214671497]
Training machine learning and statistical models often involves optimizing a data-driven risk criterion.
We propose a novel robust criterion by combining insights from Bayesian nonparametric (i.e., Dirichlet process) theory and a recent decision-theoretic model of smooth ambiguity-averse preferences.
For practical implementation, we propose and study tractable approximations of the criterion based on well-known Dirichlet process representations.
arXiv Detail & Related papers (2024-01-28T21:19:15Z) - Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Bayesian Pseudo-Coresets via Contrastive Divergence [5.479797073162603]
We introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence.
It eliminates the need for approximations in the pseudo-coreset construction process.
We conduct extensive experiments on multiple datasets, demonstrating its superiority over existing BPC techniques.
arXiv Detail & Related papers (2023-03-20T17:13:50Z) - The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z) - Theoretical characterization of uncertainty in high-dimensional linear
classification [24.073221004661427]
We show that uncertainty for learning from a limited number of samples of high-dimensional input data and labels can be obtained by the approximate message passing algorithm.
We discuss how over-confidence can be mitigated by appropriately regularising, and show that cross-validating with respect to the loss leads to better calibration than with the 0/1 error.
arXiv Detail & Related papers (2022-02-07T15:32:07Z) - Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - Efficient Ensemble Model Generation for Uncertainty Estimation with
Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z) - Bayesian System ID: Optimal management of parameter, model, and
measurement uncertainty [0.0]
We evaluate the robustness of a probabilistic formulation of system identification (ID) to sparse, noisy, and indirect data.
We show that the log posterior has improved geometric properties compared with the objective function surfaces of traditional methods.
arXiv Detail & Related papers (2020-03-04T22:48:30Z)