Worst-Case Convergence Time of ML Algorithms via Extreme Value Theory
- URL: http://arxiv.org/abs/2404.07170v1
- Date: Wed, 10 Apr 2024 17:05:12 GMT
- Title: Worst-Case Convergence Time of ML Algorithms via Extreme Value Theory
- Authors: Saeid Tizpaz-Niari, Sriram Sankaranarayanan
- Abstract summary: This paper leverages the statistics of extreme values to predict the worst-case convergence times of machine learning algorithms.
Timing is a critical non-functional property of ML systems, and providing the worst-case convergence times is essential to guarantee the availability of ML and its services.
- Score: 8.540426791244533
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper leverages the statistics of extreme values to predict the worst-case convergence times of machine learning algorithms. Timing is a critical non-functional property of ML systems, and providing the worst-case convergence times is essential to guarantee the availability of ML and its services. However, timing properties such as worst-case convergence times (WCCT) are difficult to verify since (1) they are not encoded in the syntax or semantics of the underlying programming languages of AI, (2) their evaluations depend on both algorithmic implementations and underlying systems, and (3) their measurements involve uncertainty and noise. Therefore, prevalent formal methods and statistical models fail to provide rich information on the amounts and likelihood of WCCT. Our key observation is that the timing information we seek represents the extreme tail of execution times. Therefore, extreme value theory (EVT), a statistical discipline that focuses on understanding and predicting the distribution of extreme values in the tail of outcomes, provides an ideal framework to model and analyze WCCT in the training and inference phases of the ML paradigm. Building upon the mathematical tools from EVT, we propose a practical framework to predict the worst-case timing properties of ML. Over a set of linear ML training algorithms, we show that EVT achieves better accuracy in predicting WCCTs than relevant statistical methods such as the Bayes factor. On a set of larger machine learning training algorithms and deep neural network inference tasks, we show the feasibility and usefulness of EVT models for accurately predicting WCCTs, their expected return periods, and their likelihood.
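As a concrete illustration of the peaks-over-threshold idea behind EVT-based WCCT prediction, the sketch below fits a Generalized Pareto Distribution to the tail of measured convergence times and extrapolates a worst-case return level. This is not code from the paper: the synthetic timing data, the 95% threshold choice, and the return_level helper are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Hypothetical measured convergence times (seconds), e.g. from repeated
# training runs to a fixed loss threshold. Replace with real measurements.
rng = np.random.default_rng(0)
times = rng.lognormal(mean=1.0, sigma=0.4, size=5000)

# Peaks-over-threshold: keep exceedances over a high empirical quantile.
threshold = np.quantile(times, 0.95)
excesses = times[times > threshold] - threshold

# Fit a Generalized Pareto Distribution to the excesses (location fixed at 0).
shape, loc, scale = stats.genpareto.fit(excesses, floc=0.0)

def return_level(period: float) -> float:
    """Convergence time expected to be exceeded once every `period` runs,
    i.e. the (1 - 1/period) quantile of the fitted tail model."""
    p_exceed = excesses.size / times.size       # empirical P(T > threshold)
    q = 1.0 - 1.0 / (period * p_exceed)         # quantile within the tail
    return threshold + stats.genpareto.ppf(q, shape, loc=0.0, scale=scale)

print(f"WCCT estimate at a 10^4-run return period: {return_level(1e4):.2f} s")
```

The return level generalizes the empirical maximum: it extrapolates how extreme a convergence time to expect once per a given number of runs, which mirrors the return-period analysis described in the abstract.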
Related papers
- Probabilistic Iterative Hard Thresholding for Sparse Learning [2.5782973781085383]
We present an approach to solving expectation-objective optimization problems with cardinality constraints.
We prove convergence of the underlying process and demonstrate its performance on two machine learning problems.
arXiv Detail & Related papers (2024-09-02T18:14:45Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Dynamic Model Agnostic Reliability Evaluation of Machine-Learning Methods Integrated in Instrumentation & Control Systems [1.8978726202765634]
Trustworthiness of data-driven, neural-network-based machine learning algorithms is not adequately assessed.
Recent reports by the National Institute of Standards and Technology identify trustworthiness in ML as a critical barrier to adoption.
We demonstrate a real-time model-agnostic method to evaluate the relative reliability of ML predictions by incorporating out-of-distribution detection on the training dataset.
arXiv Detail & Related papers (2023-08-08T18:25:42Z)
- Fairness Uncertainty Quantification: How certain are you that the model is fair? [13.209748908186606]
In modern machine learning, Stochastic Gradient Descent (SGD)-type algorithms are almost always used for training, implying that the learned model, and consequently its fairness properties, are random.
In this work we provide Confidence Intervals (CIs) for test unfairness when a group-fairness-aware linear binary classifier, specifically one aware of Disparate Impact (DI) or Disparate Mistreatment (DM), is trained using online SGD-type algorithms.
arXiv Detail & Related papers (2023-04-27T04:07:58Z)
- Self-learning locally-optimal hypertuning using maximum entropy, and comparison of machine learning approaches for estimating fatigue life in composite materials [0.0]
We develop a nearest-neighbors-like ML algorithm based on the principle of maximum entropy to predict fatigue damage.
The predictions achieve a good level of accuracy, similar to that of other ML algorithms.
arXiv Detail & Related papers (2022-10-19T12:20:07Z)
- Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators [100.58924375509659]
The straight-through (ST) estimator gained popularity due to its simplicity and efficiency.
Several techniques have been proposed to improve on ST while keeping the same low computational complexity.
We conduct a theoretical analysis of the bias and variance of these methods in order to understand the tradeoffs and verify the originally claimed properties (a minimal ST sketch follows this entry).
arXiv Detail & Related papers (2021-10-07T15:16:07Z)
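For readers unfamiliar with the estimator analyzed in the entry above, here is a minimal sketch of the classic straight-through trick in PyTorch. It is an assumed illustration, not code from the paper or one of its improved variants: the forward pass binarizes, while the backward pass uses the identity as a surrogate gradient.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Binarize in the forward pass; pass gradients straight through
    in the backward pass (the classic ST estimator)."""

    @staticmethod
    def forward(ctx, x):
        # Map to {-1, +1}; the true derivative of this step is zero a.e.
        return (x > 0).float() * 2 - 1

    @staticmethod
    def backward(ctx, grad_output):
        # Identity surrogate gradient: treat d(sign)/dx as 1.
        return grad_output

x = torch.randn(4, requires_grad=True)
y = BinarizeSTE.apply(x)
y.sum().backward()
print(x.grad)  # all ones: the gradient passed straight through
```

The bias the entry refers to comes precisely from this mismatch between the zero almost-everywhere true derivative and the identity surrogate.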
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be used as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- A comparison of Monte Carlo dropout and bootstrap aggregation on the performance and uncertainty estimation in radiation therapy dose prediction with deep learning neural networks [0.46180371154032895]
We propose to use Monte Carlo dropout (MCDO) and the bootstrap aggregation (bagging) technique on deep learning models to produce uncertainty estimates for radiation therapy dose prediction.
Performance-wise, bagging provides a statistically significant reduction in loss value and errors on most of the metrics investigated (an MC dropout sketch follows this entry).
arXiv Detail & Related papers (2020-11-01T00:24:43Z)
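The following is a minimal sketch of the Monte Carlo dropout half of the comparison above, with an assumed toy regressor standing in for the paper's dose-prediction network: dropout stays active at inference, and the spread of repeated stochastic forward passes serves as the uncertainty estimate.

```python
import torch
import torch.nn as nn

# Toy regressor standing in for a dose-prediction network (illustrative only).
model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Dropout(p=0.2),
                      nn.Linear(64, 1))

def mc_dropout_predict(model, x, n_samples=50):
    """Monte Carlo dropout: keep dropout active at inference and average
    stochastic forward passes; their spread estimates predictive uncertainty."""
    model.train()  # enables dropout layers (no weight updates happen here)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

x = torch.randn(16, 8)
mean, std = mc_dropout_predict(model, x)
print(mean.shape, std.shape)  # per-input prediction and uncertainty
```

Bagging, the other technique compared, would instead train several independent models on bootstrap resamples and aggregate their deterministic predictions.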
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators of the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage (a cross-fitting sketch follows this entry).
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
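As an illustration of the doubly-robust cross-fit idea in the entry above, the sketch below estimates the ACE with AIPW influence values computed on held-out folds. The synthetic data, the crossfit_aipw helper, and the random-forest nuisance models are assumptions for this sketch, not the paper's simulation setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

def crossfit_aipw(X, a, y, n_splits=2, seed=0):
    """Doubly-robust (AIPW) estimate of the average causal effect with
    cross-fitting: nuisance models are fit on one fold and evaluated on
    the other, reducing overfitting bias from flexible ML learners."""
    psi = np.zeros(len(y))
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        # Propensity model P(A=1 | X) and outcome models E[Y | A=a, X].
        ps = RandomForestClassifier(random_state=seed).fit(X[train], a[train])
        out1 = RandomForestRegressor(random_state=seed).fit(
            X[train][a[train] == 1], y[train][a[train] == 1])
        out0 = RandomForestRegressor(random_state=seed).fit(
            X[train][a[train] == 0], y[train][a[train] == 0])
        e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
        m1, m0 = out1.predict(X[test]), out0.predict(X[test])
        # AIPW influence-function values on the held-out fold.
        psi[test] = (m1 - m0
                     + a[test] * (y[test] - m1) / e
                     - (1 - a[test]) * (y[test] - m0) / (1 - e))
    return psi.mean()

# Synthetic example with a true effect of 2.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = 2.0 * a + X[:, 1] + rng.normal(size=2000)
print(f"ACE estimate: {crossfit_aipw(X, a, y):.2f}")
```

Double robustness means the estimate stays consistent if either the propensity model or the outcome models are correct; cross-fitting is what lets flexible learners like random forests be used without biasing the estimate.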
- Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisance parameters.
We propose localized debiased machine learning (LDML), which avoids this burdensome nuisance-estimation step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.