Specifying and Testing $k$-Safety Properties for Machine-Learning Models
- URL: http://arxiv.org/abs/2206.06054v1
- Date: Mon, 13 Jun 2022 11:35:55 GMT
- Title: Specifying and Testing $k$-Safety Properties for Machine-Learning Models
- Authors: Maria Christakis, Hasan Ferit Eniser, Jörg Hoffmann, Adish Singla,
Valentin Wüstholz
- Abstract summary: We take inspiration from specifications used in formal methods, expressing functional-correctness properties by reasoning about $k$ different executions, so-called $k$-safety properties.
Here, we show the wide applicability of $k$-safety properties for machine-learning models and present the first specification language for expressing them.
Our framework is effective in identifying property violations, and the detected bugs can be used to train better models.
- Score: 20.24045879238586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine-learning models are becoming increasingly prevalent in our lives, for
instance assisting in image-classification or decision-making tasks.
Consequently, the reliability of these models is of critical importance and has
resulted in the development of numerous approaches for validating and verifying
their robustness and fairness. However, beyond such specific properties, it is
challenging to specify, let alone check, general functional-correctness
expectations from models. In this paper, we take inspiration from
specifications used in formal methods, expressing functional-correctness
properties by reasoning about $k$ different executions, so-called $k$-safety
properties. Considering a credit-screening model of a bank, the expected
property that "if a person is denied a loan and their income decreases, they
should still be denied the loan" is a 2-safety property. Here, we show the wide
applicability of $k$-safety properties for machine-learning models and present
the first specification language for expressing them. We also operationalize
the language in a framework for automatically validating such properties using
metamorphic testing. Our experiments show that our framework is effective in
identifying property violations, and that detected bugs could be used to train
better models.
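The 2-safety property quoted above lends itself directly to metamorphic testing: from a given input, generate follow-up inputs and check the relation across the two executions. Below is a minimal sketch of such a harness; the toy dataset, feature layout, and LogisticRegression model are hypothetical stand-ins, not the paper's framework or specification language.

```python
# Minimal metamorphic-testing sketch for the 2-safety property:
# "if a person is denied a loan and their income decreases,
#  they should still be denied the loan".
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy credit-screening data: columns are [income, debt]; label 1 = approved.
X = rng.normal(loc=[50_000.0, 10_000.0], scale=[15_000.0, 5_000.0], size=(500, 2))
y = (X[:, 0] - 1.5 * X[:, 1] > 30_000).astype(int)
model = LogisticRegression(max_iter=1000).fit(X, y)

INCOME = 0  # index of the income feature in this toy layout

def check_2_safety(model, x, n_followups=10):
    """Metamorphic relation: a denial must persist when income decreases."""
    if model.predict(x.reshape(1, -1))[0] == 1:
        return []  # the property only constrains denied applicants
    violations = []
    for _ in range(n_followups):
        x2 = x.copy()
        # Second execution: the same applicant with strictly lower income.
        x2[INCOME] -= rng.uniform(0.0, max(x[INCOME], 0.0))
        if model.predict(x2.reshape(1, -1))[0] == 1:
            violations.append(x2)  # approved despite lower income: a bug
    return violations

bugs = [v for x in X[:100] for v in check_2_safety(model, x)]
# A linear model is monotone in income, so this toy run finds no violations;
# the point is the shape of the harness, not the bug count.
print(f"found {len(bugs)} 2-safety violations among 100 applicants")
```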
Related papers
- Prediction without Preclusion: Recourse Verification with Reachable Sets [16.705988489763868]
We introduce a procedure called recourse verification to test if a model assigns fixed predictions to its decision subjects.
We conduct a comprehensive empirical study on the infeasibility of recourse on datasets from consumer finance.
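As a concrete illustration of the idea (not the paper's method, which reasons over reachable sets rather than enumerating them), a brute-force sketch can check a small, hypothetical set of actionable feature edits and report applicants whose denial no edit can flip:

```python
# Brute-force sketch of recourse verification over an explicitly enumerated
# reachable set; the action sets and model below are hypothetical.
import numpy as np
from itertools import product
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(loc=[50_000.0, 10_000.0], scale=[15_000.0, 5_000.0], size=(500, 2))
y = (X[:, 0] - 1.5 * X[:, 1] > 30_000).astype(int)  # [income, debt] -> approved?
model = LogisticRegression(max_iter=1000).fit(X, y)

def recourse_exists(model, x, actions):
    """Does any reachable combination of feature edits flip the decision?"""
    base = model.predict(x.reshape(1, -1))[0]
    for deltas in product(*actions):  # enumerate the reachable set
        x2 = x + np.asarray(deltas, dtype=float)
        if model.predict(x2.reshape(1, -1))[0] != base:
            return True
    return False  # the prediction is fixed: recourse is precluded

# Hypothetical actionability constraints: income can only rise (up to +20k),
# debt can only fall (down to -5k).
actions = [(0.0, 10_000.0, 20_000.0),   # income deltas
           (0.0, -2_500.0, -5_000.0)]   # debt deltas

denied = X[model.predict(X) == 0][:50]
fixed = [x for x in denied if not recourse_exists(model, x, actions)]
print(f"{len(fixed)} of {len(denied)} denied applicants have no recourse")
```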
arXiv Detail & Related papers (2023-08-24T14:24:04Z)
- Trusting Language Models in Education [1.2578554943276923]
We propose to use an XGBoost model on top of BERT to output corrected probabilities.
Our hypothesis is that the level of uncertainty contained in the flow of attention is related to the quality of the model's response itself.
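A minimal sketch of the second stage only, assuming the uncertainty features have already been extracted; the three feature names and the synthetic labels are hypothetical stand-ins for signals derived from BERT's attention flow, which this sketch does not recompute:

```python
# Sketch: a gradient-boosted corrector mapping per-answer uncertainty
# features to corrected correctness probabilities.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(2)
n = 1000
# Hypothetical features: [max softmax prob, attention entropy, flow variance].
feats = rng.random((n, 3))
# Hypothetical target: 1 if the base model's answer was in fact correct.
correct = (feats[:, 0] - 0.5 * feats[:, 1] + rng.normal(0, 0.1, n) > 0.2).astype(int)

corrector = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
corrector.fit(feats, correct)
calibrated = corrector.predict_proba(feats)[:, 1]  # corrected probabilities
print("mean corrected probability:", round(float(calibrated.mean()), 3))
```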
arXiv Detail & Related papers (2023-08-07T18:27:54Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that there is no single model that works best in all cases.
By choosing an appropriate bias model, we can obtain better robustness results than baselines with more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct ownership verification by checking whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Plex: Towards Reliability using Pretrained Large Model Extensions [69.13326436826227]
We develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively.
Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol.
We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples.
arXiv Detail & Related papers (2022-07-15T11:39:37Z)
- Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning [47.651130958272155]
Underspecification refers to the existence of multiple models that are indistinguishable in their in-domain accuracy.
We formalize the concept of underspecification and propose a method to identify and partially address it.
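A minimal sketch of the phenomenon (not the paper's identification method): models trained with different seeds score almost identically in-domain yet visibly disagree once a weakly-constrained feature is shifted off-domain. The data and the shift are illustrative:

```python
# Underspecification sketch: five seeds, near-identical in-domain accuracy,
# divergent predictions under a covariate shift they were never forced to agree on.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.1 * X[:, 1] > 0).astype(int)  # feature 1 is weakly informative
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

X_shift = X_te.copy()
X_shift[:, 1] += 5.0  # stress the weak feature off-domain

models = [RandomForestClassifier(random_state=s).fit(X_tr, y_tr) for s in range(5)]
in_acc = [m.score(X_te, y_te) for m in models]        # near-identical in-domain
preds = np.stack([m.predict(X_shift) for m in models])
disagree = (preds != preds[0]).any(axis=0).mean()     # seeds diverge off-domain

print("in-domain accuracies:", np.round(in_acc, 3))
print(f"shifted inputs where seeds disagree: {disagree:.1%}")
```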
arXiv Detail & Related papers (2022-07-06T11:20:40Z)
- Defending against Model Stealing via Verifying Embedded External Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z)
- Scaling up Memory-Efficient Formal Verification Tools for Tree Ensembles [2.588973722689844]
We formalise and extend the VoTE algorithm presented earlier as a tool description.
We show how the separation of property checking from the core verification engine enables verification of versatile requirements.
We demonstrate the application of the tool in two case studies, namely digit recognition and aircraft collision avoidance.
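A minimal sketch of that separation for a single scikit-learn decision tree (VoTE itself targets tree ensembles and is far more memory-efficient): the "engine" enumerates the leaf classes reachable from an input box, and the property check merely inspects the resulting set:

```python
# Sketch: property checking separated from a box-based verification engine.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def reachable_leaf_classes(tree, lo, hi):
    """Engine: classes of all leaves reachable from the box [lo, hi]."""
    t = tree.tree_
    out, stack = set(), [(0, lo.copy(), hi.copy())]
    while stack:
        node, l, h = stack.pop()
        if t.children_left[node] == -1:  # leaf node
            out.add(int(np.argmax(t.value[node])))
            continue
        f, thr = t.feature[node], t.threshold[node]
        if l[f] <= thr:                  # left branch (x[f] <= thr) reachable
            hl = h.copy(); hl[f] = min(h[f], thr)
            stack.append((t.children_left[node], l.copy(), hl))
        if h[f] > thr:                   # right branch (x[f] > thr) reachable
            ll = l.copy(); ll[f] = max(l[f], thr)
            stack.append((t.children_right[node], ll, h.copy()))
    return out

rng = np.random.default_rng(4)
X = rng.random((500, 3))
y = (X[:, 0] > 0.5).astype(int)
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Property: robustness on a box means exactly one class is reachable.
classes = reachable_leaf_classes(tree, np.array([0.1, 0.0, 0.0]),
                                 np.array([0.2, 1.0, 1.0]))
print("robust on box:", len(classes) == 1)
```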
arXiv Detail & Related papers (2021-05-06T11:50:22Z)
- DirectDebug: Automated Testing and Debugging of Feature Models [55.41644538483948]
Variability models (e.g., feature models) are a common way to represent the variabilities and commonalities of software artifacts.
Complex and often large-scale feature models can become faulty, i.e., fail to represent the expected variability properties of the underlying software artifact.
arXiv Detail & Related papers (2021-02-11T11:22:20Z)
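A minimal sketch of the setting on a hypothetical four-feature car model (not DirectDebug's algorithm): the feature model is a set of named boolean constraints, and a failing test case points to the faulty constraint:

```python
# Sketch: a tiny, hypothetical feature model with one over-constraining fault.
constraints = {
    "root":        lambda c: c["car"],
    "mandatory":   lambda c: c["car"] == c["engine"],    # a car needs an engine
    "alternative": lambda c: c["engine"] == (c["gas"] ^ c["electric"]),
    "faulty":      lambda c: not c["electric"],          # over-constrains the model
}

def violated(config):
    """Return the names of all constraints the configuration violates."""
    return [name for name, check in constraints.items() if not check(config)]

# Test case: an electric car should be a valid product of this model, so the
# constraints it violates are candidates for the fault.
test = {"car": True, "engine": True, "gas": False, "electric": True}
print("violated constraints:", violated(test))  # -> ['faulty']
```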