Outline of an Independent Systematic Blackbox Test for ML-based Systems
- URL: http://arxiv.org/abs/2401.17062v2
- Date: Wed, 19 Jun 2024 15:16:17 GMT
- Title: Outline of an Independent Systematic Blackbox Test for ML-based Systems
- Authors: Hans-Werner Wiesbrock, Jürgen Großmann
- Abstract summary: This article proposes a test procedure that can be used to test ML models and ML-based systems independently of the actual training process.
In this way, the typical quality statements such as accuracy and precision of these models and systems can be verified independently.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article proposes a test procedure that can be used to test ML models and ML-based systems independently of the actual training process. In this way, the typical quality statements such as accuracy and precision of these models and systems can be verified independently, taking into account their black-box character and the inherent stochastic properties of ML models and their training data. The article presents first results from a set of test experiments and suggests extensions to existing test methods that reflect the stochastic nature of ML models and ML-based systems.
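The abstract stays at the level of the procedure's goals, so the following is only a minimal sketch of what an independent, training-agnostic black-box check of a claimed accuracy could look like: the model is queried purely through its prediction interface on an independently drawn test set, and the claimed value is compared against an exact Clopper-Pearson confidence interval for the observed accuracy, which reflects the stochastic nature of finite test samples. All function names and the toy stand-in model are illustrative assumptions, not taken from the paper.
```python
# Hypothetical sketch: independent black-box verification of a claimed accuracy.
# The model is treated purely as an input->label oracle; no training artifacts are used.
import numpy as np
from scipy.stats import beta


def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    """Exact (Clopper-Pearson) two-sided confidence interval for a binomial proportion."""
    lo = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lo, hi


def check_claimed_accuracy(predict, X_test, y_test, claimed_accuracy, alpha=0.05):
    """Query the black box on an independent test set and compare the claim
    against the confidence interval of the observed accuracy."""
    y_pred = np.asarray([predict(x) for x in X_test])
    correct = int((y_pred == np.asarray(y_test)).sum())
    lo, hi = clopper_pearson(correct, len(y_test), alpha)
    observed = correct / len(y_test)
    verdict = "not refuted" if claimed_accuracy <= hi else "refuted at this confidence level"
    return {"observed": observed, "ci": (lo, hi), "claim": claimed_accuracy, "verdict": verdict}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in black box: a noisy threshold classifier on synthetic data (~90% accurate).
    X = rng.normal(size=1000)
    y = (X > 0).astype(int)
    black_box = lambda x: int(x > 0) if rng.random() > 0.1 else int(x <= 0)
    print(check_claimed_accuracy(black_box, X, y, claimed_accuracy=0.95))
```
In the same spirit, repeating such a check over several independently drawn test sets would surface the run-to-run variation that the article attributes to the stochastic properties of ML models and their training data.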
Related papers
- Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models [49.06068319380296]
We introduce context-aware testing (CAT) which uses context as an inductive bias to guide the search for meaningful model failures.
We instantiate the first CAT system, SMART Testing, which employs large language models to hypothesize relevant and likely failures.
arXiv Detail & Related papers (2024-10-31T15:06:16Z)
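SMART Testing delegates failure hypothesis generation to a large language model; the hedged sketch below leaves that step abstract and only illustrates the evaluation side one might build around it, where hypothesized failures arrive as data-slice predicates and the model is scored per slice. The predicates, the gap threshold, and the function names are illustrative assumptions rather than the paper's implementation.
```python
# Hypothetical sketch: evaluating LLM-hypothesized failure slices.
# The LLM step is abstracted away; hypotheses arrive as (description, predicate) pairs.
import numpy as np


def evaluate_failure_hypotheses(predict, X, y, hypotheses):
    """For each hypothesized failure slice, compare slice accuracy to overall accuracy."""
    y = np.asarray(y)
    y_pred = np.asarray([predict(x) for x in X])
    overall = float((y_pred == y).mean())
    report = []
    for description, predicate in hypotheses:
        mask = np.asarray([predicate(x) for x in X])
        if mask.sum() == 0:
            report.append((description, None, "no matching samples"))
            continue
        slice_acc = float((y_pred[mask] == y[mask]).mean())
        flagged = slice_acc < overall - 0.05  # illustrative gap threshold
        report.append((description, slice_acc, "confirmed failure" if flagged else "not confirmed"))
    return overall, report


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    model = lambda x: int(x[0] > 0)  # deliberately ignores the second feature
    hypotheses = [
        ("fails when feature 1 dominates", lambda x: abs(x[1]) > abs(x[0])),
        ("fails for large feature 0", lambda x: x[0] > 1.5),
    ]
    print(evaluate_failure_hypotheses(model, X, y, hypotheses))
```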
- A General Framework for Data-Use Auditing of ML Models [47.369572284751285]
We propose a general method to audit an ML model for the use of a data-owner's data in training.
We show the effectiveness of our proposed framework by applying it to audit data use in two types of ML models.
arXiv Detail & Related papers (2024-07-21T09:32:34Z)
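The framework above is described only at a high level here, so the sketch below shows a deliberately simple stand-in for a data-use signal: comparing the model's loss on the data owner's records with its loss on comparable records known to be outside the training set (a membership-inference-style gap). This is an illustration of the auditing idea only, not the method proposed in the paper; all names are hypothetical.
```python
# Hypothetical sketch: a loss-gap audit for suspected use of a data owner's records.
# Not the paper's method; a simple membership-inference-style illustration.
import numpy as np


def loss_gap_audit(predict_proba, owner_X, owner_y, reference_X, reference_y, eps=1e-12):
    """Compare average cross-entropy on the owner's data vs. reference data drawn
    from the same distribution but known not to be in the training set."""
    def avg_nll(X, y):
        losses = []
        for x, label in zip(X, y):
            p = np.clip(predict_proba(x)[label], eps, 1.0)
            losses.append(-np.log(p))
        return float(np.mean(losses))

    owner_loss = avg_nll(owner_X, owner_y)
    reference_loss = avg_nll(reference_X, reference_y)
    gap = reference_loss - owner_loss  # a positive gap is weak evidence of data use
    return {"owner_loss": owner_loss, "reference_loss": reference_loss, "gap": gap}


if __name__ == "__main__":
    toy_proba = lambda x: np.array([0.8, 0.2]) if x < 0 else np.array([0.2, 0.8])
    X_owner, y_owner = [-1.0, 2.0, -0.5], [0, 1, 0]
    X_ref, y_ref = [1.5, -2.0, 0.3], [1, 0, 0]
    print(loss_gap_audit(toy_proba, X_owner, y_owner, X_ref, y_ref))
```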
- Using Quality Attribute Scenarios for ML Model Test Case Generation [3.9111051646728527]
Current practice for machine learning (ML) model testing prioritizes testing for model performance.
This paper presents an approach based on quality attribute (QA) scenarios to elicit and define system- and model-relevant test cases.
The QA-based approach has been integrated into MLTE, a process and tool to support ML model test and evaluation.
arXiv Detail & Related papers (2024-06-12T18:26:42Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
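As a hedged illustration of the entry above, the sketch below fits several differently specified Gaussian process models to the same data, draws Monte Carlo samples from each posterior, and pools the samples into a fused predictive mean and spread. The equal-weight pooling and the choice of kernels are assumptions for illustration and need not match the paper's fusion rule.
```python
# Hypothetical sketch: fusing predictions of several GP models via Monte Carlo sampling.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
X_new = np.linspace(-3, 3, 200).reshape(-1, 1)

# Several candidate GP models (different kernels = different modelling assumptions).
kernels = [RBF() + WhiteKernel(), Matern(nu=1.5) + WhiteKernel()]
models = [GaussianProcessRegressor(kernel=k, normalize_y=True).fit(X, y) for k in kernels]

# Monte Carlo fusion: draw posterior samples from each model and pool them.
samples = np.concatenate(
    [m.sample_y(X_new, n_samples=200, random_state=0) for m in models], axis=1
)  # shape: (len(X_new), n_models * 200)

fused_mean = samples.mean(axis=1)
fused_std = samples.std(axis=1)
print(fused_mean[:5], fused_std[:5])
```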
- Machine Learning Data Suitability and Performance Testing Using Fault Injection Testing Framework [0.0]
This paper presents the Fault Injection for Undesirable Learning in input Data (FIUL-Data) testing framework.
Data mutators explore vulnerabilities of ML systems against the effects of different fault injections.
This paper evaluates the framework using data from analytical chemistry, comprising retention time measurements of anti-sense oligonucleotides.
arXiv Detail & Related papers (2023-09-20T12:58:35Z)
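As a hedged illustration of the data-mutator idea in the FIUL-Data entry above, the sketch below defines two simple mutators (additive Gaussian noise and random value dropout) and tracks how a trained regressor's error degrades as the fault intensity grows. The mutators, the toy regression data, and all names are illustrative assumptions rather than components of the actual framework.
```python
# Hypothetical sketch: fault-injection data mutators and performance degradation measurement.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)


def gaussian_noise_mutator(X, intensity):
    """Inject additive Gaussian noise scaled by the fault intensity."""
    return X + rng.normal(scale=intensity, size=X.shape)


def dropout_mutator(X, intensity):
    """Randomly zero out a fraction of feature values (missing-value fault)."""
    mask = rng.random(X.shape) < intensity
    return np.where(mask, 0.0, X)


# Toy stand-in for the regression setting (e.g. retention-time prediction).
X = rng.normal(size=(300, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=300)
model = Ridge().fit(X, y)

for mutator in (gaussian_noise_mutator, dropout_mutator):
    for intensity in (0.0, 0.1, 0.3, 0.5):
        mse = mean_squared_error(y, model.predict(mutator(X, intensity)))
        print(f"{mutator.__name__:>22s}  intensity={intensity:.1f}  mse={mse:.3f}")
```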
- Learning continuous models for continuous physics [94.42705784823997]
We develop a test based on numerical analysis theory to validate machine learning models for science and engineering applications.
Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
arXiv Detail & Related papers (2022-02-17T07:56:46Z)
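A hedged sketch of the style of numerical-analysis check described in the entry above: if a model has genuinely learned a continuous vector field, rolling it out with successively halved time steps should yield trajectories that converge, which the sketch tests by checking that the differences between successive rollouts shrink. The forward-Euler integrator, the stand-in "learned" field, and the pass criterion are assumptions for illustration.
```python
# Hypothetical sketch: convergence test for a learned continuous dynamics model.
import numpy as np


def rollout(f, x0, t_end, dt):
    """Forward-Euler rollout of dx/dt = f(x) with step size dt."""
    steps = int(round(t_end / dt))
    x = float(x0)
    for _ in range(steps):
        x = x + dt * f(x)
    return x


def convergence_test(f, x0=1.0, t_end=1.0, dts=(0.1, 0.05, 0.025, 0.0125)):
    """Check that successive refinements of dt change the rollout less and less."""
    finals = [rollout(f, x0, t_end, dt) for dt in dts]
    diffs = [abs(a - b) for a, b in zip(finals, finals[1:])]
    converging = all(d2 < d1 for d1, d2 in zip(diffs, diffs[1:]))
    return finals, diffs, converging


if __name__ == "__main__":
    learned_field = lambda x: -x  # stand-in for a model that learned dx/dt = -x
    print(convergence_test(learned_field))
```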
- Predictive machine learning for prescriptive applications: a coupled training-validating approach [77.34726150561087]
We propose a new method for training predictive machine learning models for prescriptive applications.
This approach is based on tweaking the validation step in the standard training-validating-testing scheme.
Several experiments with synthetic data demonstrate promising results in reducing the prescription costs in both deterministic and real models.
arXiv Detail & Related papers (2021-10-22T15:03:20Z)
- ML4ML: Automated Invariance Testing for Machine Learning Models [7.017320068977301]
We propose an automatic testing framework that is applicable to a variety of invariance qualities.
We employ machine learning techniques for analysing such 'imagery' testing data automatically, hence facilitating ML4ML.
Our testing results show that the trained ML4ML assessors can perform such analytical tasks with sufficient accuracy.
arXiv Detail & Related papers (2021-09-27T10:23:44Z)
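As a hedged illustration of the ML4ML entry above, the sketch below collects a black-box model's outputs under a family of rotations into a small matrix, the kind of 'imagery' testing data an automated assessor could analyse, and summarises it with a simple variance score. The transformation family, the scoring, and the stand-in model are illustrative assumptions.
```python
# Hypothetical sketch: measuring rotation invariance of a black-box image model.
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(4)


def invariance_matrix(predict, images, angles):
    """Rows: test images, columns: rotation angles, entries: model output.
    This is the kind of 'imagery' testing data an automated assessor could analyse."""
    out = np.zeros((len(images), len(angles)))
    for i, img in enumerate(images):
        for j, angle in enumerate(angles):
            out[i, j] = predict(rotate(img, angle, reshape=False, mode="nearest"))
    return out


def invariance_score(matrix):
    """Lower is better: average per-image spread of outputs across the transformation family."""
    return float(matrix.std(axis=1).mean())


if __name__ == "__main__":
    images = [rng.random((16, 16)) for _ in range(5)]
    black_box = lambda img: float(img[:8, :].mean())  # stand-in model, not rotation invariant
    m = invariance_matrix(black_box, images, angles=(0, 90, 180, 270))
    print(m.round(3), "score:", round(invariance_score(m), 4))
```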
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
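BAR works from input-output responses only, which is why it needs zeroth-order optimization; the hedged sketch below shows just that core ingredient, a two-point zeroth-order gradient estimate used to learn an additive input perturbation against a black-box scorer. The loss, the toy black box, and all names are assumptions for illustration; BAR's multi-label mapping and other details are omitted.
```python
# Hypothetical sketch: zeroth-order gradient estimation for reprogramming a black box.
import numpy as np

rng = np.random.default_rng(5)


def zeroth_order_grad(loss_fn, theta, n_directions=20, mu=1e-2):
    """Two-point gradient estimate using only loss evaluations (no backprop access)."""
    grad = np.zeros_like(theta)
    for _ in range(n_directions):
        u = rng.normal(size=theta.shape)
        grad += (loss_fn(theta + mu * u) - loss_fn(theta - mu * u)) / (2 * mu) * u
    return grad / n_directions

# Black box: fixed 'pretrained' scoring function we can only query.
w_hidden = rng.normal(size=8)
black_box_score = lambda x: float(np.tanh(w_hidden @ x))

# Reprogramming: learn an additive perturbation delta so that black_box_score(x + delta)
# is high for target inputs (a stand-in for repurposing the model to a new task).
targets = rng.normal(size=(16, 8))
loss = lambda delta: float(np.mean([1.0 - black_box_score(x + delta) for x in targets]))

delta = np.zeros(8)
for step in range(200):
    delta -= 0.1 * zeroth_order_grad(loss, delta)
print("final loss:", round(loss(delta), 4))
```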
- Testing Monotonicity of Machine Learning Models [0.5330240017302619]
We propose verification-based testing of monotonicity, i.e., the formal computation of test inputs on a white-box model via verification technology.
On the white-box model, the space of test inputs can be systematically explored by a directed computation of test cases.
The empirical evaluation on 90 black-box models shows verification-based testing can outperform adaptive random testing as well as property-based techniques with respect to effectiveness and efficiency.
arXiv Detail & Related papers (2020-02-27T17:38:06Z)
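The method in the entry above is verification-based and operates on a white-box model; as a simpler black-box stand-in, closer to the property-based baselines it is compared against, the sketch below samples input pairs that differ only by an increase in one feature and flags any pair whose prediction decreases. The names and the example model are illustrative.
```python
# Hypothetical sketch: property-based monotonicity check on a black-box model.
# (The paper's own method is verification-based on a white-box model; this is the
# simpler black-box baseline style of check.)
import numpy as np

rng = np.random.default_rng(6)


def find_monotonicity_violations(predict, n_features, feature, n_pairs=2000, low=0.0, high=1.0):
    """Sample pairs x, x_up identical except x_up[feature] >= x[feature];
    monotonicity requires predict(x_up) >= predict(x)."""
    violations = []
    for _ in range(n_pairs):
        x = rng.uniform(low, high, size=n_features)
        x_up = x.copy()
        x_up[feature] = rng.uniform(x[feature], high)
        if predict(x_up) < predict(x):
            violations.append((x, x_up))
    return violations


if __name__ == "__main__":
    # Example model that is *not* monotone in feature 0 (quadratic dip).
    model = lambda x: (x[0] - 0.5) ** 2 + 0.1 * x[1]
    found = find_monotonicity_violations(model, n_features=2, feature=0)
    print(f"{len(found)} violating pairs found")
```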
- Manifold for Machine Learning Assurance [9.594432031144716]
We propose an analogous approach for machine-learning (ML) systems, using an ML technique that extracts from the high-dimensional training data a manifold implicitly describing the required system.
It is then harnessed for a range of quality assurance tasks such as test adequacy measurement, test input generation, and runtime monitoring of the target ML system.
Preliminary experiments establish that the proposed manifold-based approach, for test adequacy, drives diversity in test data; for test generation, yields fault-revealing yet realistic test cases; and for runtime monitoring, provides an independent means to assess the trustability of the target system's output.
arXiv Detail & Related papers (2020-02-08T11:39:01Z)
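A hedged sketch of one of the uses listed in the last entry, runtime monitoring: a low-dimensional approximation of the training-data manifold (PCA here, as a simple stand-in for whatever manifold-learning technique the paper uses) yields a reconstruction error that serves as an independent trust signal for incoming inputs. The threshold rule and all names are assumptions for illustration; the test-adequacy and test-generation uses are not shown.
```python
# Hypothetical sketch: manifold-based runtime monitoring via reconstruction error.
# PCA is used here as a simple linear stand-in for a learned manifold model.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)

# 'Training data' of the target ML system: points near a low-dimensional structure.
latent = rng.normal(size=(500, 2))
X_train = np.hstack([latent, latent @ rng.normal(size=(2, 8))]) + 0.05 * rng.normal(size=(500, 10))

manifold = PCA(n_components=2).fit(X_train)
train_err = np.linalg.norm(X_train - manifold.inverse_transform(manifold.transform(X_train)), axis=1)
threshold = np.quantile(train_err, 0.99)  # illustrative flagging rule


def trust_signal(x):
    """Return reconstruction error and whether the input should be flagged as off-manifold."""
    err = float(np.linalg.norm(x - manifold.inverse_transform(manifold.transform(x[None]))[0]))
    return err, err > threshold


in_dist = X_train[0]
off_manifold = rng.normal(size=10) * 5.0
print(trust_signal(in_dist), trust_signal(off_manifold))
```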
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.