Convergence of uncertainty estimates in Ensemble and Bayesian sparse
model discovery
- URL: http://arxiv.org/abs/2301.12649v2
- Date: Wed, 26 Apr 2023 19:30:22 GMT
- Title: Convergence of uncertainty estimates in Ensemble and Bayesian sparse
model discovery
- Authors: L. Mars Gao, Urban Fasel, Steven L. Brunton, J. Nathan Kutz
- Abstract summary: We show empirical success in terms of accuracy and robustness to noise with the bootstrapping-based sequential thresholding least-squares estimator.
We show that this bootstrapping-based ensembling technique can perform a provably correct variable selection procedure with an exponential convergence rate of the error rate.
- Score: 4.446017969073817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse model identification enables nonlinear dynamical system discovery from
data. However, the control of false discoveries for sparse model identification
is challenging, especially in the low-data and high-noise limit. In this paper,
we perform a theoretical study on ensemble sparse model discovery, which shows
empirical success in terms of accuracy and robustness to noise. In particular,
we analyse the bootstrapping-based sequential thresholding least-squares
estimator. We show that this bootstrapping-based ensembling technique can
perform a provably correct variable selection procedure with an exponential
convergence rate of the error rate. In addition, we show that the ensemble
sparse model discovery method can perform computationally efficient uncertainty
estimation, compared to expensive Bayesian uncertainty quantification methods
via MCMC. We demonstrate the convergence properties and connection to
uncertainty quantification in various numerical studies on synthetic sparse
linear regression and sparse model discovery. The experiments on sparse linear
regression show that the bootstrapping-based sequential thresholding
least-squares method outperforms LASSO, thresholding least-squares, and
bootstrapping-based LASSO at sparse variable selection.
In the sparse model discovery experiment, we show that the bootstrapping-based
sequential thresholding least-squares method can provide valid uncertainty
quantification, converging to a delta measure centered around the true value
with increased sample sizes. Finally, we highlight that, under shifting noise
and sparsity levels, the bootstrapping-based sequential thresholding
least-squares method is more robust to hyperparameter selection than other
sparse regression methods.
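The estimator analysed in the abstract, sequential thresholding least squares bagged over bootstrap resamples, can be sketched in a few lines. This is a minimal illustration with hypothetical helper names (`stls`, `bootstrap_stls`) and placeholder hyperparameters, not the authors' exact implementation:

```python
import numpy as np

def stls(X, y, threshold=0.1, max_iter=10):
    """Sequential thresholding least squares: alternate a least-squares fit
    with hard thresholding of small coefficients."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        big = ~small
        if not big.any():
            break
        coef[big] = np.linalg.lstsq(X[:, big], y, rcond=None)[0]
    return coef

def bootstrap_stls(X, y, n_boot=200, threshold=0.1, rng=None):
    """Bagged STLS: refit on bootstrap resamples of the rows. The ensemble of
    coefficient vectors yields selection frequencies and an empirical
    coefficient distribution for uncertainty estimation."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    coefs = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        coefs[b] = stls(X[idx], y[idx], threshold)
    inclusion = (coefs != 0).mean(axis=0)  # bootstrap selection frequency
    return coefs, inclusion
```

The selection frequencies act as inclusion probabilities, and as the sample size grows the bootstrap coefficient distribution concentrates around the true values, mirroring the delta-measure convergence described in the abstract.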
Related papers
- Embedded Nonlocal Operator Regression (ENOR): Quantifying model error in learning nonlocal operators [8.585650361148558]
We propose a new framework to learn a nonlocal homogenized surrogate model and its structural model error.
This framework provides discrepancy-adaptive uncertainty quantification for homogenized material response predictions in long-term simulations.
arXiv Detail & Related papers (2024-10-27T04:17:27Z)
- A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction.
It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation.
Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
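The recipe in this entry (a pinball-loss pseudo-likelihood, a scaled Student-t prior, and Langevin Monte Carlo) can be sketched with an unadjusted Langevin sampler. This is a rough illustration under assumed defaults; the step size, degrees of freedom `nu`, and prior `scale` are placeholders, not values from the paper:

```python
import numpy as np

def langevin_quantile(X, y, tau=0.5, step=1e-4, n_iter=5000,
                      nu=3.0, scale=1.0, rng=None):
    """Unadjusted Langevin sampler for a pseudo-posterior built from the
    pinball (quantile) loss and independent scaled Student-t priors."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    theta = np.zeros(p)
    samples = []
    for t in range(n_iter):
        r = y - X @ theta
        # (sub)gradient of the summed pinball loss rho_tau(r) = r*(tau - 1{r<0})
        grad_loss = -X.T @ (tau - (r < 0).astype(float))
        # gradient of the scaled Student-t log prior on each coefficient
        grad_prior = -(nu + 1.0) * theta / (nu * scale**2 + theta**2)
        grad_logpost = -grad_loss + grad_prior
        theta = theta + step * grad_logpost \
                + np.sqrt(2.0 * step) * rng.normal(size=p)
        if t >= n_iter // 2:        # keep the second half as posterior draws
            samples.append(theta.copy())
    return np.array(samples)
```

With `tau=0.5` this is pseudo-Bayesian median regression; the retained draws provide point estimates and credible intervals for the coefficients.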
arXiv Detail & Related papers (2024-09-03T08:01:01Z)
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
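The unifying observation, that Shapley estimators amount to linear transformations of values sampled from feature subsets, is easiest to see in the classic permutation-sampling estimator sketched below. This is the standard Monte Carlo baseline, not SimSHAP itself; `value` is an assumed callable mapping a frozenset of feature indices to a set-function value:

```python
import numpy as np

def shapley_permutation(value, n_features, n_perm=2000, rng=None):
    """Monte Carlo Shapley values by permutation sampling: each feature is
    credited with its average marginal contribution v(S + {i}) - v(S) over
    random orderings of the features."""
    rng = np.random.default_rng(rng)
    phi = np.zeros(n_features)
    for _ in range(n_perm):
        subset = set()
        prev = value(frozenset())
        for i in rng.permutation(n_features):
            subset.add(int(i))
            cur = value(frozenset(subset))
            phi[i] += cur - prev   # marginal contribution of feature i
            prev = cur
    return phi / n_perm
```

Each permutation contributes one sampled value per nested subset, and the final estimate is a fixed linear combination of those samples, which is the structure the paper exploits.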
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed procedure accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Theoretical characterization of uncertainty in high-dimensional linear classification [24.073221004661427]
We show that uncertainty for learning from a limited number of samples of high-dimensional input data and labels can be obtained by the approximate message passing algorithm.
We discuss how over-confidence can be mitigated by appropriately regularising, and show that cross-validating with respect to the loss leads to better calibration than with the 0/1 error.
arXiv Detail & Related papers (2022-02-07T15:32:07Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Efficient Ensemble Model Generation for Uncertainty Estimation with Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z)
- Bayesian System ID: Optimal management of parameter, model, and measurement uncertainty [0.0]
We evaluate the robustness of a probabilistic formulation of system identification (ID) to sparse, noisy, and indirect data.
We show that the log posterior has improved geometric properties compared with the objective function surfaces of traditional methods.
arXiv Detail & Related papers (2020-03-04T22:48:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.