Probabilistic Stability Guarantees for Feature Attributions
- URL: http://arxiv.org/abs/2504.13787v1
- Date: Fri, 18 Apr 2025 16:39:08 GMT
- Title: Probabilistic Stability Guarantees for Feature Attributions
- Authors: Helen Jin, Anton Xue, Weiqiu You, Surbhi Goel, Eric Wong
- Abstract summary: We propose a simple, model-agnostic, and sample-efficient stability certification algorithm (SCA) that provides non-trivial and interpretable guarantees for any attribution. We show that mild smoothing enables a graceful tradeoff between accuracy and stability, in contrast to prior certification methods that require a more aggressive compromise.
- Score: 20.58023369482214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stability guarantees are an emerging tool for evaluating feature attributions, but existing certification methods rely on smoothed classifiers and often yield conservative guarantees. To address these limitations, we introduce soft stability and propose a simple, model-agnostic, and sample-efficient stability certification algorithm (SCA) that provides non-trivial and interpretable guarantees for any attribution. Moreover, we show that mild smoothing enables a graceful tradeoff between accuracy and stability, in contrast to prior certification methods that require a more aggressive compromise. Using Boolean function analysis, we give a novel characterization of stability under smoothing. We evaluate SCA on vision and language tasks, and demonstrate the effectiveness of soft stability in measuring the robustness of explanation methods.
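The abstract does not spell out how SCA works, so the following Python sketch is only a rough illustration of what a sample-based stability estimate could look like, not the authors' implementation. It assumes that an attribution is a binary mask over features, that a perturbation switches on at most `radius` additional features, that perturbations are drawn uniformly from that set, and that the sample size comes from a standard Hoeffding bound. The names `required_samples`, `sample_superset`, `apply_mask`, `estimate_stability_rate`, and `toy_classifier` are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence


def required_samples(eps: float, delta: float) -> int:
    """Hoeffding sample size: the empirical rate is within eps of the
    true rate with probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def sample_superset(mask: Sequence[int], radius: int, rng: random.Random) -> List[int]:
    """Draw a mask uniformly from all masks that keep every selected
    feature and switch on at most `radius` extra features (assumed
    perturbation model)."""
    off = [i for i, bit in enumerate(mask) if bit == 0]
    max_k = min(radius, len(off))
    # Picking the number of added features k with weight C(len(off), k),
    # then a uniform k-subset, makes every perturbed mask equally likely.
    weights = [math.comb(len(off), k) for k in range(max_k + 1)]
    k = rng.choices(range(max_k + 1), weights=weights)[0]
    new_mask = list(mask)
    for i in rng.sample(off, k):
        new_mask[i] = 1
    return new_mask


def apply_mask(x: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Zero out features that the mask does not select."""
    return [xi if m else 0.0 for xi, m in zip(x, mask)]


def estimate_stability_rate(
    classify: Callable[[Sequence[float]], int],
    x: Sequence[float],
    mask: Sequence[int],
    radius: int,
    eps: float = 0.1,
    delta: float = 0.05,
    seed: int = 0,
) -> float:
    """Fraction of sampled perturbed masks whose prediction matches the
    prediction on the original masked input."""
    rng = random.Random(seed)
    n = required_samples(eps, delta)
    base = classify(apply_mask(x, mask))
    hits = sum(
        classify(apply_mask(x, sample_superset(mask, radius, rng))) == base
        for _ in range(n)
    )
    return hits / n


def toy_classifier(features: Sequence[float]) -> int:
    """Hypothetical stand-in model: predicts 1 when the feature sum is positive."""
    return int(sum(features) > 0)


if __name__ == "__main__":
    x = [1.0, 1.0, 1.0, -5.0]
    mask = [1, 1, 0, 0]
    rate = estimate_stability_rate(toy_classifier, x, mask, radius=1)
    print(f"estimated stability rate: {rate:.3f}")
```

Under these assumptions the number of model queries depends only on `eps` and `delta`, not on the number of features, which is one way a certification procedure can be both model-agnostic and sample-efficient.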
Related papers
- Confidence-aware Denoised Fine-tuning of Off-the-shelf Models for Certified Robustness [56.2479170374811]
We introduce Fine-Tuning with Confidence-Aware Denoised Image Selection (FT-CADIS).
FT-CADIS is inspired by the observation that the confidence of off-the-shelf classifiers can effectively identify hallucinated images during denoised smoothing.
It has established the state-of-the-art certified robustness among denoised smoothing methods across all $\ell_2$-adversary radii in various benchmarks.
arXiv Detail & Related papers (2024-11-13T09:13:20Z) - Stability Evaluation via Distributional Perturbation Analysis [28.379994938809133]
We propose a stability evaluation criterion based on distributional perturbations.
Our stability evaluation criterion can address both data corruptions and sub-population shifts.
Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications.
arXiv Detail & Related papers (2024-05-06T06:47:14Z) - Distributionally Robust Policy and Lyapunov-Certificate Learning [13.38077406934971]
A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment.
We tackle this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint ensuring a monotonic decrease of the Lyapunov certificate.
We show that, for the resulting closed-loop system, the global stability of its equilibrium can be certified with high confidence, even with Out-of-Distribution uncertainties.
arXiv Detail & Related papers (2024-04-03T18:57:54Z) - Stability Guarantees for Feature Attributions with Multiplicative Smoothing [11.675168649032875]
We analyze stability as a property for reliable feature attribution methods.
We develop a smoothing method called Multiplicative Smoothing (MuS) to achieve such a model.
We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees.
arXiv Detail & Related papers (2023-07-12T04:19:47Z) - Bagging Provides Assumption-free Stability [11.456416081243654]
Bagging is an important technique for stabilizing machine learning models.
In this paper, we derive a finite-sample guarantee on the stability of bagging for any model.
arXiv Detail & Related papers (2023-01-30T01:18:05Z) - A Policy Optimization Method Towards Optimal-time Stability [15.722871779526526]
We propose a policy optimization technique incorporating sampling-based Lyapunov stability.
Our approach enables the system's state to reach an equilibrium point within an optimal time.
arXiv Detail & Related papers (2023-01-02T04:19:56Z) - Minimax Optimal Estimation of Stability Under Distribution Shift [8.893526921869137]
We analyze the stability of a system under distribution shift.
The stability measure is defined in terms of a more intuitive quantity: the level of acceptable performance degradation.
Our characterization of the minimax convergence rate shows that evaluating stability against large performance degradation incurs a statistical cost.
arXiv Detail & Related papers (2022-12-13T02:40:30Z) - KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z) - Joint Differentiable Optimization and Verification for Certified Reinforcement Learning [91.93635157885055]
In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
arXiv Detail & Related papers (2022-01-28T16:53:56Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
The empowerment principle enables unsupervised stabilization of dynamical systems at upright positions.
We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel.
We show that our method has a lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.