Is Algorithmic Stability Testable? A Unified Framework under Computational Constraints
- URL: http://arxiv.org/abs/2405.15107v1
- Date: Thu, 23 May 2024 23:51:25 GMT
- Title: Is Algorithmic Stability Testable? A Unified Framework under Computational Constraints
- Authors: Yuetian Luo, Rina Foygel Barber
- Abstract summary: We show that testing the stability of a black-box algorithm is impossible in settings where the data lies in an uncountably infinite space.
We develop a unified framework for quantifying the hardness of testing algorithmic stability, which establishes that across all settings, if the available data is limited then exhaustive search is essentially the only universally valid mechanism for certifying algorithmic stability.
- Score: 5.68594452139461
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Algorithmic stability is a central notion in learning theory that quantifies the sensitivity of an algorithm to small changes in the training data. If a learning algorithm satisfies certain stability properties, this leads to many important downstream implications, such as generalization, robustness, and reliable predictive inference. Verifying that stability holds for a particular algorithm is therefore an important and practical question. However, recent results establish that testing the stability of a black-box algorithm is impossible, given limited data from an unknown distribution, in settings where the data lies in an uncountably infinite space (such as real-valued data). In this work, we extend this question to examine a far broader range of settings, where the data may lie in any space -- for example, categorical data. We develop a unified framework for quantifying the hardness of testing algorithmic stability, which establishes that across all settings, if the available data is limited then exhaustive search is essentially the only universally valid mechanism for certifying algorithmic stability. Since in practice, any test of stability would naturally be subject to computational constraints, exhaustive search is impossible and so this implies fundamental limits on our ability to test the stability property for a black-box algorithm.
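To make the object being tested concrete: one common formalization says an algorithm is stable if dropping a single training point barely moves its predictions. Below is a deliberately naive sketch of a leave-one-out stability probe for a black-box algorithm; the `fit(X, y) -> predictor` interface, the ridge-regression stand-in, and the tolerance `eps` are assumptions of this illustration, not the paper's construction.

```python
import numpy as np

def leave_one_out_stability(fit, X, y, X_test, eps):
    """Empirically probe the stability of a black-box algorithm `fit`:
    refit after dropping each training point and record how far the
    test predictions move. Returns (passes, worst_change).

    Note: this inspects only the n perturbed datasets it actually
    tries, so it can expose instability but never certify stability.
    """
    base = fit(X, y)(X_test)
    n = len(y)
    worst = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # drop training point i
        pred = fit(X[keep], y[keep])(X_test)
        worst = max(worst, float(np.max(np.abs(pred - base))))
    return worst <= eps, worst

# Example black box: ridge regression returning a prediction function.
def ridge(X, y, lam=1.0):
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return lambda Xt: Xt @ w

rng = np.random.default_rng(0)
X, y = rng.standard_normal((50, 3)), rng.standard_normal(50)
X_test = rng.standard_normal((10, 3))
print(leave_one_out_stability(ridge, X, y, X_test, eps=0.1))
```

Such a probe illustrates the gap the paper's hardness framework formalizes: with limited data, a finite set of perturbations cannot certify stability for a black box, and only exhaustive search would suffice.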
Related papers
- Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization [26.4468964378511]
A replicable algorithm gives the same output with high probability when run on two independent samples from the same distribution with its internal randomness fixed.
Using replicable algorithms for data analysis can facilitate the verification of published results.
We establish new connections and separations between replicability and standard notions of algorithmic stability.
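A minimal sketch of how one might empirically probe this definition; the `train(data, rng)` and `sample(n, rng)` interfaces and a comparable (e.g. discrete) output are assumptions of this illustration, not from the paper.

```python
import numpy as np

def replicability_probe(train, sample, n, trials=200, seed=0):
    """Monte Carlo estimate of the probability that `train` returns
    the same output on two independent datasets when its internal
    randomness is shared -- the defining event of replicability."""
    meta = np.random.default_rng(seed)
    agree = 0
    for _ in range(trials):
        shared = int(meta.integers(2**31))   # shared internal coin flips
        d1 = sample(n, np.random.default_rng(int(meta.integers(2**31))))
        d2 = sample(n, np.random.default_rng(int(meta.integers(2**31))))
        if train(d1, np.random.default_rng(shared)) == \
           train(d2, np.random.default_rng(shared)):
            agree += 1
    return agree / trials
```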
arXiv Detail & Related papers (2023-03-22T21:35:50Z)
- Bagging Provides Assumption-free Stability [11.456416081243654]
Bagging is an important technique for stabilizing machine learning models.
In this paper, we derive a finite-sample guarantee on the stability of bagging for any model.
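For intuition, a minimal sketch of the bagging construction the guarantee applies to; the `fit(X, y) -> predictor` interface is an assumption, and the code only builds the bagged predictor, it does not prove anything about its stability.

```python
import numpy as np

def bagged_predictor(fit, X, y, X_test, B=200, rng=None):
    """Bagging: average base-model predictions over B bootstrap
    resamples of the training data. The paper shows this averaging
    yields finite-sample stability guarantees for any base model."""
    rng = rng or np.random.default_rng(0)
    n = len(y)
    preds = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)   # bootstrap sample, with replacement
        preds.append(fit(X[idx], y[idx])(X_test))
    return np.mean(preds, axis=0)
```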
arXiv Detail & Related papers (2023-01-30T01:18:05Z)
- Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems [62.83783246648714]
We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z) - Provably Auditing Ordinary Least Squares in Low Dimensions [17.655785504931913]
Most existing metrics measure only the local stability of conclusions derived from Ordinary Least Squares linear regression.
Recent work proposes a simple, global, finite-sample stability metric: the minimum number of samples that need to be removed so that rerunning the analysis overturns the conclusion.
We show that in the low-dimensional regime, where the number of covariates is a constant but the number of samples is large, there are efficient algorithms for provably estimating this metric.
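To make the metric concrete, here is a naive brute-force version, exponential in the number of removals and so usable only for tiny budgets; the paper's contribution is an efficient estimator with provable guarantees, which this sketch is not.

```python
import numpy as np
from itertools import combinations

def min_removals_to_flip(X, y, coord=0, max_k=3):
    """Smallest number of samples whose removal flips the sign of one
    OLS coefficient (the stability metric described above), found by
    exhaustive search over all removal sets of size <= max_k."""
    def ols_sign(rows):
        beta, *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
        return np.sign(beta[coord])
    n = len(y)
    base = ols_sign(np.arange(n))
    for k in range(1, max_k + 1):
        for drop in combinations(range(n), k):
            keep = np.setdiff1d(np.arange(n), drop)
            if ols_sign(keep) != base:
                return k
    return None   # conclusion not overturned by removing <= max_k samples
```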
arXiv Detail & Related papers (2022-05-28T00:45:10Z) - Black box tests for algorithmic stability [10.495496415022064]
Knowing an algorithm's stability properties is useful for many downstream applications.
Many modern algorithms currently used in practice are too complex for a theoretical analysis of their stability properties.
arXiv Detail & Related papers (2021-11-30T16:36:58Z)
- Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that, for any $k$, our algorithm achieves an almost optimal consistency-robustness tradeoff.
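For context, a sketch of the classical Double Coverage algorithm on the line that the title refers to; the learning-augmented variant studied in the paper, which additionally consults predictions, is not reproduced here.

```python
def double_coverage(servers, requests):
    """Classical Double Coverage for the k-server problem on the line:
    a request between two servers pulls both adjacent servers toward
    it at equal speed until one arrives; an outside request is served
    by the nearest server. Returns the total movement cost."""
    pos = sorted(float(s) for s in servers)
    cost = 0.0
    for r in requests:
        lefts = [i for i, p in enumerate(pos) if p <= r]
        rights = [i for i, p in enumerate(pos) if p >= r]
        if lefts and rights:
            i, j = lefts[-1], rights[0]
            d = min(r - pos[i], pos[j] - r)   # 0 if a server already sits on r
            pos[i] += d
            pos[j] -= d
            cost += 2 * d
        elif rights:                          # all servers right of the request
            cost += pos[rights[0]] - r
            pos[rights[0]] = r
        else:                                 # all servers left of the request
            cost += r - pos[lefts[-1]]
            pos[lefts[-1]] = r
        pos.sort()
    return cost
```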
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
- Privacy Preserving Recalibration under Domain Shift [119.21243107946555]
We introduce a framework that abstracts out the properties of recalibration problems under differential privacy constraints.
We also design a novel recalibration algorithm, accuracy temperature scaling, that outperforms prior work on private datasets.
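As background, a sketch of standard (non-private) temperature scaling, the recalibration primitive this line of work builds on; the paper's accuracy temperature scaling adds differential-privacy machinery that is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(logits, labels):
    """Standard temperature scaling: choose T > 0 minimizing the NLL
    of softmax(logits / T) on a held-out recalibration set.
    `labels` is assumed to be an integer class-index array."""
    def nll(T):
        z = logits / T
        z = z - z.max(axis=1, keepdims=True)   # for numerical stability
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(labels)), labels].mean()
    res = minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded")
    return res.x
```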
arXiv Detail & Related papers (2020-08-21T18:43:37Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little overhead.
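An illustrative sketch in the spirit of this entry: each gradient step is taken at a worst-case perturbed input. This FGSM-style stand-in is an assumption of the sketch, not the paper's distributionally robust formulation, which perturbs the data distribution itself.

```python
import numpy as np

def robust_logreg(X, y, eps=0.1, lr=0.1, epochs=200):
    """Adversarial training for logistic regression: take gradient
    steps at worst-case l_inf-perturbed inputs (closed form for a
    linear model). Labels y are assumed to be in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # worst-case perturbation that most decreases each margin y * <w, x>
        X_adv = X - eps * y[:, None] * np.sign(w)[None, :]
        m = y * (X_adv @ w)
        # gradient of mean logistic loss log(1 + exp(-m)) w.r.t. w
        grad = -((y / (1.0 + np.exp(m)))[:, None] * X_adv).mean(axis=0)
        w -= lr * grad
    return w
```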
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first known stability and generalization bounds for SGD with even non-differentiable loss functions.
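To make the notion concrete, a small sketch that measures the parameter gap between SGD runs on neighboring datasets (shared sampling order, one sample replaced); the paper's on-average model stability averages such gaps over the replaced index and the data, whereas this measures a single realization.

```python
import numpy as np

def sgd_stability_gap(X, y, i_swap, lr=0.01, epochs=5, seed=0):
    """Train least-squares SGD on a dataset and on a neighbor with
    sample `i_swap` replaced by a fresh draw, using the same sampling
    order, and return the final parameter distance ||w - w'||."""
    rng = np.random.default_rng(seed)
    Xp, yp = X.copy(), y.copy()
    Xp[i_swap] = rng.standard_normal(X.shape[1])   # replaced sample
    yp[i_swap] = rng.standard_normal()
    order = np.random.default_rng(seed + 1)        # shared sampling order
    w = np.zeros(X.shape[1])
    wp = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in order.permutation(len(y)):
            w -= lr * (X[i] @ w - y[i]) * X[i]
            wp -= lr * (Xp[i] @ wp - yp[i]) * Xp[i]
    return np.linalg.norm(w - wp)
```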
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that characterizes statistical accuracy through the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Interpretable Random Forests via Rule Extraction [0.0]
We introduce SIRUS, a stable rule learning algorithm whose output takes the form of a short and simple list of rules.
Our R/C++ software implementation sirus is available from CRAN.
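A toy Python approximation of the core idea (the real implementation is the R/C++ package sirus mentioned above): fit a forest and keep the split rules that recur across many trees. SIRUS additionally discretizes features at quantiles and extracts deeper paths, which this sketch omits.

```python
from collections import Counter
from sklearn.ensemble import RandomForestClassifier

def frequent_root_rules(X, y, n_trees=500, p0=0.05):
    """Fit a random forest, count how often each (feature, threshold)
    split appears at the root of a tree, and keep splits whose
    frequency across trees exceeds p0 -- a depth-one caricature of
    SIRUS-style stable rule extraction."""
    forest = RandomForestClassifier(n_estimators=n_trees, max_depth=2,
                                    random_state=0).fit(X, y)
    counts = Counter()
    for est in forest.estimators_:
        t = est.tree_
        if t.node_count > 1:   # root is a split node, not a lone leaf
            counts[(int(t.feature[0]), float(t.threshold[0]))] += 1
    return [(f, thr, c / n_trees) for (f, thr), c in counts.items()
            if c / n_trees >= p0]
```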
arXiv Detail & Related papers (2020-04-29T08:13:35Z)