Beyond Robustness: Resilience Verification of Tree-Based Classifiers
- URL: http://arxiv.org/abs/2112.02705v1
- Date: Sun, 5 Dec 2021 23:07:22 GMT
- Title: Beyond Robustness: Resilience Verification of Tree-Based Classifiers
- Authors: Stefano Calzavara, Lorenzo Cazzaro, Claudio Lucchese, Federico
Marcuzzi, Salvatore Orlando
- Abstract summary: We introduce a new measure called resilience and we focus on its verification.
We discuss how resilience can be verified by combining a traditional robustness verification technique with a data-independent stability analysis.
Our results show that resilience verification is useful and feasible in practice, yielding a more reliable security assessment of both standard and robust decision tree models.
- Score: 7.574509994822738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we criticize the robustness measure traditionally employed to
assess the performance of machine learning models deployed in adversarial
settings. To mitigate the limitations of robustness, we introduce a new measure
called resilience and we focus on its verification. In particular, we discuss
how resilience can be verified by combining a traditional robustness
verification technique with a data-independent stability analysis, which
identifies a subset of the feature space where the model does not change its
predictions despite adversarial manipulations. We then introduce a formally
sound data-independent stability analysis for decision trees and decision tree
ensembles, which we experimentally assess on public datasets and leverage
for resilience verification. Our results show that resilience verification is
useful and feasible in practice, yielding a more reliable security assessment
of both standard and robust decision tree models.
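As a rough illustration of the stability analysis described above (our own sketch, not the authors' implementation), the Python snippet below conservatively checks whether a single decision tree keeps the same prediction for every point of a hyper-rectangle of the feature space, even under L-infinity adversarial perturbations of radius eps. The tree encoding (`Node`) and all function names are hypothetical.

```python
# Hypothetical sketch of a data-independent stability check for one decision
# tree under L-infinity perturbations of radius eps (names and encoding are
# illustrative, not taken from the paper's implementation).

from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Node:
    feature: Optional[int] = None       # tested feature index (None at leaves)
    threshold: float = 0.0              # split test: x[feature] <= threshold
    left: Optional["Node"] = None       # subtree for x[feature] <= threshold
    right: Optional["Node"] = None      # subtree for x[feature] >  threshold
    label: Optional[int] = None         # predicted class at a leaf


def reachable_labels(node: Node,
                     box: Dict[int, Tuple[float, float]],
                     eps: float) -> Set[int]:
    """Over-approximate the set of leaf labels reachable from any point of the
    hyper-rectangle `box` after an adversarial perturbation of L-inf norm <= eps."""
    if node.label is not None:
        return {node.label}
    lo, hi = box.get(node.feature, (float("-inf"), float("inf")))
    labels: Set[int] = set()
    # A perturbed copy of the tested feature ranges over [lo - eps, hi + eps],
    # so explore every branch that some perturbed point could take.
    if lo - eps <= node.threshold:       # left branch feasible
        labels |= reachable_labels(node.left, box, eps)
    if hi + eps > node.threshold:        # right branch feasible
        labels |= reachable_labels(node.right, box, eps)
    return labels


def is_stable(tree: Node, box: Dict[int, Tuple[float, float]], eps: float) -> bool:
    """Data-independent stability: every point of `box`, under every eps-bounded
    perturbation, receives the same prediction (a sound but conservative test)."""
    return len(reachable_labels(tree, box, eps)) == 1


# Toy usage: a decision stump splitting on feature 0 at threshold 5.0.
stump = Node(feature=0, threshold=5.0, left=Node(label=0), right=Node(label=1))
print(is_stable(stump, {0: (0.0, 3.0)}, eps=1.0))   # True: perturbed values stay <= 5
print(is_stable(stump, {0: (0.0, 4.5)}, eps=1.0))   # False: 4.5 + 1.0 crosses the split
```

Roughly speaking, resilience verification as proposed in the paper combines a check of this kind with a traditional robustness verifier: an instance is deemed resilient when it is verified robust and its neighborhood falls inside a region that the data-independent analysis proves stable.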
Related papers
- Quantifying calibration error in modern neural networks through evidence based theory [0.0]
This paper introduces a novel framework for quantifying the trustworthiness of neural networks by incorporating subjective logic into the evaluation of Expected Calibration Error (ECE).
We demonstrate the effectiveness of this approach through experiments on the MNIST and CIFAR-10 datasets, where post-calibration results indicate improved trustworthiness.
The proposed framework offers a more interpretable and nuanced assessment of AI models, with potential applications in sensitive domains such as healthcare and autonomous systems.
arXiv Detail & Related papers (2024-10-31T23:54:21Z)
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
- Stability Evaluation via Distributional Perturbation Analysis [28.379994938809133]
We propose a stability evaluation criterion based on distributional perturbations.
Our stability evaluation criterion can address both data corruptions and sub-population shifts.
Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications.
arXiv Detail & Related papers (2024-05-06T06:47:14Z)
- Stability-Certified Learning of Control Systems with Quadratic Nonlinearities [9.599029891108229]
This work primarily focuses on an operator inference methodology aimed at constructing low-dimensional dynamical models.
Our main objective is to develop a method that facilitates the inference of quadratic control dynamical systems with inherent stability guarantees.
arXiv Detail & Related papers (2024-03-01T16:26:47Z)
- From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
- From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework [91.94389491920309]
Textual adversarial attacks can discover models' weaknesses by adding semantics-preserving but misleading perturbations to the inputs.
The existing practice of robustness evaluation may exhibit issues of incomplete evaluation, impractical evaluation protocols, and invalid adversarial samples.
We set up a unified automatic robustness evaluation framework, shifting towards model-centric evaluation to exploit the advantages of adversarial attacks.
arXiv Detail & Related papers (2023-05-29T14:55:20Z)
- Approaching Neural Network Uncertainty Realism [53.308409014122816]
Quantifying or at least upper-bounding uncertainties is vital for safety-critical systems such as autonomous vehicles.
We evaluate uncertainty realism -- a strict quality criterion -- with a Mahalanobis distance-based statistical test.
We adapt it to the automotive domain and show that it significantly improves uncertainty realism compared to a plain encoder-decoder model.
arXiv Detail & Related papers (2021-01-08T11:56:12Z)
- Fair Training of Decision Tree Classifiers [6.381149074212897]
We study the problem of formally verifying individual fairness of decision tree ensembles.
In our approach, fairness verification and fairness-aware training both rely on a notion of stability of a classification model.
arXiv Detail & Related papers (2021-01-04T12:04:22Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- RobustBench: a standardized adversarial robustness benchmark [84.50044645539305]
A key challenge in benchmarking robustness is that its evaluation is often error-prone, leading to robustness overestimation.
We evaluate adversarial robustness with AutoAttack, an ensemble of white- and black-box attacks.
We analyze the impact of robustness on the performance on distribution shifts, calibration, out-of-distribution detection, fairness, privacy leakage, smoothness, and transferability.
arXiv Detail & Related papers (2020-10-19T17:06:18Z)
- Variational Encoder-based Reliable Classification [5.161531917413708]
We propose an Epistemic Classifier (EC) that can provide justification of its belief using support from the training dataset as well as the quality of reconstruction.
Our approach is based on modified variational autoencoders that can identify a semantically meaningful low-dimensional space.
Our results demonstrate improved reliability of predictions and robust identification of samples with adversarial attacks.
arXiv Detail & Related papers (2020-02-19T17:05:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.