Stability Evaluation via Distributional Perturbation Analysis
- URL: http://arxiv.org/abs/2405.03198v1
- Date: Mon, 6 May 2024 06:47:14 GMT
- Title: Stability Evaluation via Distributional Perturbation Analysis
- Authors: Jose Blanchet, Peng Cui, Jiajin Li, Jiashuo Liu
- Abstract summary: We propose a stability evaluation criterion based on distributional perturbations.
Our stability evaluation criterion can address both data corruptions and sub-population shifts.
Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications.
- Score: 28.379994938809133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of learning models often deteriorates when deployed in out-of-sample environments. To ensure reliable deployment, we propose a stability evaluation criterion based on distributional perturbations. Conceptually, our stability evaluation criterion is defined as the minimal perturbation required on our observed dataset to induce a prescribed deterioration in risk evaluation. In this paper, we utilize the optimal transport (OT) discrepancy with moment constraints on the \textit{(sample, density)} space to quantify this perturbation. Therefore, our stability evaluation criterion can address both \emph{data corruptions} and \emph{sub-population shifts} -- the two most common types of distribution shifts in real-world scenarios. To further realize practical benefits, we present a series of tractable convex formulations and computational methods tailored to different classes of loss functions. The key technical tool to achieve this is the strong duality theorem provided in this paper. Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications. These empirical studies showcase the criterion's ability not only to compare the stability of different learning models and features but also to provide valuable guidelines and strategies to further improve models.
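As a rough illustration of the conceptual definition above (the notation here is ours, not taken from the paper), the criterion can be read as the smallest perturbation cost, measured by an OT discrepancy, needed to push the evaluated risk above a prescribed level r:
\[
\mathcal{I}(\theta, r) \;=\; \inf_{Q} \Big\{\, \mathcal{W}_c\big(Q, \hat{\mathbb{P}}_n\big) \;:\; \mathbb{E}_{Z \sim Q}\big[\ell(\theta; Z)\big] \ge r \,\Big\},
\]
where \ell is the loss, \theta the model under evaluation, \hat{\mathbb{P}}_n the observed empirical distribution, and \mathcal{W}_c the OT discrepancy (with moment constraints on the (sample, density) space in the paper's formulation, so that both sample corruptions and reweighting of sub-populations count as admissible perturbations). A larger value indicates a more stable model.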
Related papers
- Testing Generalizability in Causal Inference [3.547529079746247]
There is no formal procedure for statistically evaluating generalizability in machine learning algorithms.
We propose a systematic and quantitative framework for evaluating model generalizability in causal inference settings.
By basing simulations on real data, our method ensures more realistic evaluations, which are often missing in current work.
arXiv Detail & Related papers (2024-11-05T11:44:00Z) - Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for this purpose.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
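As a minimal sketch of the estimator this summary refers to (illustrative code of our own; the hyperparameters and the inference procedure are not taken from the paper), TD(0) with linear function approximation and Polyak-Ruppert iterate averaging looks roughly like this:
```python
import numpy as np

def td0_polyak_ruppert(transitions, phi, dim, gamma=0.95, alpha=0.05):
    """TD(0) with linear function approximation and Polyak-Ruppert averaging.

    transitions: iterable of (state, reward, next_state) tuples
    phi:         feature map, state -> np.ndarray of shape (dim,)
    Returns the averaged iterate, the quantity statistical inference is built on.
    """
    theta = np.zeros(dim)      # current TD iterate
    theta_bar = np.zeros(dim)  # Polyak-Ruppert average of the iterates
    for t, (s, r, s_next) in enumerate(transitions, start=1):
        x, x_next = phi(s), phi(s_next)
        # semi-gradient TD(0) update for V(s) = theta^T phi(s)
        td_error = r + gamma * theta @ x_next - theta @ x
        theta = theta + alpha * td_error * x
        # running average of the iterates
        theta_bar += (theta - theta_bar) / t
    return theta_bar
```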
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization [29.24821214671497]
Training machine learning and statistical models often involves optimizing a data-driven risk criterion.
We propose a novel robust criterion by combining insights from Bayesian nonparametric (i.e., Dirichlet process) theory and a recent decision-theoretic model of smooth ambiguity-averse preferences.
For practical implementation, we propose and study tractable approximations of the criterion based on well-known Dirichlet process representations.
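To give a flavour of what a Dirichlet-process-based approximation can look like (a hypothetical Bayesian-bootstrap-style sketch of ours; the paper's actual criterion and its smooth ambiguity-averse form differ), one can sample Dirichlet weights over the data and examine the distribution of the reweighted empirical risk:
```python
import numpy as np

def dirichlet_weighted_risks(losses, n_draws=1000, seed=0):
    """Bayesian-bootstrap sketch: distribution of Dirichlet-reweighted empirical risks.

    losses: per-sample losses l(theta; z_i), shape (n,)
    Returns n_draws reweighted risks; a robust criterion could penalize their
    spread rather than rely on the plain empirical mean alone.
    """
    rng = np.random.default_rng(seed)
    losses = np.asarray(losses, dtype=float)
    # Dirichlet(1, ..., 1) weights approximate a Dirichlet-process posterior
    # centred at the empirical distribution.
    weights = rng.dirichlet(np.ones(losses.shape[0]), size=n_draws)
    return weights @ losses
```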
arXiv Detail & Related papers (2024-01-28T21:19:15Z) - The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z) - Towards stable real-world equation discovery with assessing differentiating quality influence [52.2980614912553]
We propose alternatives to the commonly used finite differences-based method.
We evaluate these methods in terms of applicability to problems, similar to the real ones, and their ability to ensure the convergence of equation discovery algorithms.
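To make the differentiation issue concrete (our own illustration; the specific alternatives proposed in the paper may differ), here a noisy signal is differentiated by plain finite differences and by a Savitzky-Golay smoothed derivative, one common alternative:
```python
import numpy as np
from scipy.signal import savgol_filter

# Noisy samples of x(t) = sin(t); the true derivative is cos(t).
t = np.linspace(0.0, 10.0, 501)
dt = t[1] - t[0]
x = np.sin(t) + 0.01 * np.random.default_rng(0).normal(size=t.size)

dx_fd = np.gradient(x, dt)            # finite differences amplify the noise
dx_sg = savgol_filter(x, window_length=31, polyorder=3, deriv=1, delta=dt)

print("max error, finite differences:", np.abs(dx_fd - np.cos(t)).max())
print("max error, Savitzky-Golay:    ", np.abs(dx_sg - np.cos(t)).max())
```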
arXiv Detail & Related papers (2023-11-09T23:32:06Z) - Minimax Optimal Estimation of Stability Under Distribution Shift [8.893526921869137]
We analyze the stability of a system under distribution shift.
The stability measure is defined in terms of a more intuitive quantity: the level of acceptable performance degradation.
Our characterization of the minimax convergence rate shows that evaluating stability against large performance degradation incurs a statistical cost.
arXiv Detail & Related papers (2022-12-13T02:40:30Z) - Continual evaluation for lifelong learning: Identifying the stability gap [35.99653845083381]
We show that a set of common state-of-the-art methods still suffers from substantial forgetting upon starting to learn new tasks.
We refer to this intriguing but potentially problematic phenomenon as the stability gap.
We establish a framework for continual evaluation that uses per-iteration evaluation and we define a new set of metrics to quantify worst-case performance.
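A minimal sketch of per-iteration evaluation with a worst-case metric (our own hypothetical illustration; the paper defines its own metric set):
```python
def worst_case_accuracy(accuracy_log):
    """Worst-case accuracy each previously learned task reaches during later training.

    accuracy_log: dict mapping task_id -> list of accuracies recorded at every
    training iteration after that task was learned (per-iteration evaluation).
    A large drop between the final and the minimum value is the stability gap.
    """
    return {task: min(accs) for task, accs in accuracy_log.items() if accs}

# Example: task 0 briefly drops to 0.58 while task 1 is being learned.
log = {0: [0.91, 0.58, 0.83, 0.88], 1: [0.87, 0.89]}
print(worst_case_accuracy(log))  # {0: 0.58, 1: 0.87}
```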
arXiv Detail & Related papers (2022-05-26T15:56:08Z) - Versatile and Robust Transient Stability Assessment via Instance Transfer Learning [6.760999627905228]
This paper introduces a new data collection method for a data-driven algorithm that incorporates knowledge of power system dynamics.
We introduce a new concept called Fault-Affected Area, which provides crucial information regarding the unstable region of operation.
The test results on the IEEE 39-bus system verify that this model can accurately predict the stability of previously unseen operational scenarios.
arXiv Detail & Related papers (2021-02-20T09:10:29Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
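Roughly speaking (our paraphrase, not the paper's exact definition), on-average model stability measures the expected change of the learned model when a single training example is replaced, averaged over which example is replaced:
\[
\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\big[\, \big\| A(S) - A(S^{(i)}) \big\|_2 \,\big] \;\le\; \epsilon,
\]
where A(S) is the model produced by SGD on the training set S and S^{(i)} is S with its i-th example replaced by an independent copy.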
arXiv Detail & Related papers (2020-06-15T06:30:19Z) - Stable Adversarial Learning under Distributional Shifts [46.98655899839784]
Machine learning algorithms based on empirical risk minimization are vulnerable to distributional shifts.
We propose Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set.
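In broad strokes (our notation; the paper's contribution lies in how the uncertainty set is built from heterogeneous sources, which is not reproduced here), such methods solve a distributionally robust problem of the form
\[
\min_{\theta} \; \sup_{Q \in \mathcal{U}(\hat{\mathbb{P}}_n)} \; \mathbb{E}_{Z \sim Q}\big[\ell(\theta; Z)\big],
\]
where \mathcal{U}(\hat{\mathbb{P}}_n) is an uncertainty set of distributions around the training data; constructing this set more realistically is what keeps the approach from being overly conservative.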
arXiv Detail & Related papers (2020-06-08T08:42:34Z) - GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation of stationary values can still be achieved in important applications.
Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions.
The resulting algorithm, GenDICE, is straightforward and effective.
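The identity behind such ratio corrections (written in our notation, not the paper's) is that an expectation under the target stationary distribution d^{\pi} can be rewritten as a reweighted expectation under the data distribution d^{D}:
\[
\mathbb{E}_{(s,a) \sim d^{\pi}}\big[f(s,a)\big] \;=\; \mathbb{E}_{(s,a) \sim d^{D}}\big[\tau(s,a)\, f(s,a)\big], \qquad \tau(s,a) \;=\; \frac{d^{\pi}(s,a)}{d^{D}(s,a)},
\]
so estimating the ratio \tau from offline data is enough to correct for the discrepancy between the stationary and empirical distributions.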
arXiv Detail & Related papers (2020-02-21T00:27:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.