Data Synthesis for Testing Black-Box Machine Learning Models
- URL: http://arxiv.org/abs/2111.02161v1
- Date: Wed, 3 Nov 2021 12:00:30 GMT
- Title: Data Synthesis for Testing Black-Box Machine Learning Models
- Authors: Diptikalyan Saha, Aniya Aggarwal, Sandeep Hans
- Abstract summary: The increasing usage of machine learning models raises the question of the reliability of these models.
In this paper, we provide a framework for automated test data synthesis to test black-box ML/DL models.
- Score: 2.3800397174740984
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing usage of machine learning models raises the question of the
reliability of these models. The current practice of testing with limited data
is often insufficient. In this paper, we provide a framework for automated test
data synthesis to test black-box ML/DL models. We address an important
challenge of generating realistic user-controllable data with model agnostic
coverage criteria to test a varied set of properties, essentially to increase
trust in machine learning models. We experimentally demonstrate the
effectiveness of our technique.
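As a rough, hedged illustration of the idea in the abstract (synthesizing user-controllable test data against a model-agnostic coverage criterion for a black-box model), the Python sketch below samples candidate tabular rows from user-supplied value bins and keeps only rows that improve a simple pairwise-value coverage measure before querying a black-box predict function. The feature bins, the pairwise coverage notion, and the `predict` callable are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of coverage-guided test data synthesis for a black-box model.
# The feature bins, the pairwise-value coverage criterion, and the `predict`
# callable are illustrative assumptions, not the paper's actual method.
import itertools
import random

def synthesize_tests(feature_bins, predict, n_candidates=1000, seed=0):
    """Generate test rows that increase pairwise (feature, value-bin) coverage."""
    rng = random.Random(seed)
    features = list(feature_bins)
    covered = set()          # ((f1, bin1), (f2, bin2)) pairs seen so far
    selected = []
    for _ in range(n_candidates):
        # Sample one candidate row from the user-controlled value bins.
        row = {f: rng.choice(bins) for f, bins in feature_bins.items()}
        pairs = {
            ((f1, row[f1]), (f2, row[f2]))
            for f1, f2 in itertools.combinations(features, 2)
        }
        new_pairs = pairs - covered
        if new_pairs:        # keep the row only if it adds coverage
            covered |= new_pairs
            selected.append((row, predict(row)))
    return selected, covered

# Example usage with a stand-in black-box model (always predicts "approve").
if __name__ == "__main__":
    bins = {"age": ["<30", "30-60", ">60"], "income": ["low", "mid", "high"]}
    tests, covered = synthesize_tests(bins, predict=lambda row: "approve")
    print(len(tests), "test rows,", len(covered), "covered pairs")
```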
Related papers
- Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models [49.06068319380296]
We introduce context-aware testing (CAT) which uses context as an inductive bias to guide the search for meaningful model failures.
We instantiate the first CAT system, SMART Testing, which employs large language models to hypothesize relevant and likely failures.
arXiv Detail & Related papers (2024-10-31T15:06:16Z) - Can You Rely on Your Model Evaluation? Improving Model Evaluation with Synthetic Test Data [75.20035991513564]
We introduce 3S Testing, a deep generative modeling framework to facilitate model evaluation.
Our experiments demonstrate that 3S Testing outperforms traditional baselines.
These results raise the question of whether we need a paradigm shift away from limited real test data towards synthetic test data.
arXiv Detail & Related papers (2023-10-25T10:18:44Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, we are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - Learning to Increase the Power of Conditional Randomization Tests [8.883733362171032]
The model-X conditional randomization test is a generic framework for conditional independence testing.
We introduce novel model-fitting schemes that are designed to explicitly improve the power of model-X tests.
arXiv Detail & Related papers (2022-07-03T12:29:25Z) - Learning continuous models for continuous physics [94.42705784823997]
We develop a test based on numerical analysis theory to validate machine learning models for science and engineering applications.
Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
arXiv Detail & Related papers (2022-02-17T07:56:46Z) - Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - Testing Framework for Black-box AI Models [1.916485402892365]
In this paper, we present an end-to-end generic framework for testing AI Models.
Our tool has been used for testing industrial AI models and was very effective at uncovering issues.
arXiv Detail & Related papers (2021-02-11T18:15:23Z) - Fairness in the Eyes of the Data: Certifying Machine-Learning Models [38.09830406613629]
We present a framework that allows certifying the fairness degree of a model based on an interactive and privacy-preserving test.
We tackle two scenarios, where either the test data is privately available only to the tester or is publicly known in advance, even to the model creator.
We provide a cryptographic technique to automate fairness testing and certified inference with only black-box access to the model at hand while hiding the participants' sensitive data.
arXiv Detail & Related papers (2020-09-03T09:22:39Z) - Testing Monotonicity of Machine Learning Models [0.5330240017302619]
We propose verification-based testing of monotonicity, i.e., the formal computation of test inputs on a white-box model via verification technology.
On the white-box model, the space of test inputs can be systematically explored by a directed computation of test cases.
The empirical evaluation on 90 black-box models shows that verification-based testing can outperform adaptive random testing as well as property-based techniques with respect to effectiveness and efficiency.
arXiv Detail & Related papers (2020-02-27T17:38:06Z)
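The last entry above mentions adaptive random testing and property-based techniques as black-box baselines for monotonicity testing. A minimal property-based sketch of such a baseline (not the paper's verification-based approach) is given below; the scoring model, feature bounds, and step size are assumptions made for illustration.

```python
# Hypothetical property-based check for monotonicity of a black-box scorer in
# one feature: raising the feature should not decrease the score. The model,
# bounds, and step size here are illustrative assumptions.
import random

def find_monotonicity_violations(score, feature_idx, bounds,
                                 n_trials=500, step=0.1, seed=0):
    """Probe random inputs and report pairs where a feature increase lowers the score."""
    rng = random.Random(seed)
    violations = []
    for _ in range(n_trials):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        x_up = list(x)
        x_up[feature_idx] = min(x[feature_idx] + step, bounds[feature_idx][1])
        if score(x_up) < score(x):   # increasing the feature lowered the score
            violations.append((x, x_up))
    return violations

# Example: a deliberately non-monotone stand-in model on two features.
if __name__ == "__main__":
    model = lambda x: x[0] - 0.5 * x[0] ** 2 + x[1]   # non-monotone in feature 0
    bad = find_monotonicity_violations(model, feature_idx=0,
                                       bounds=[(0.0, 2.0), (0.0, 1.0)])
    print(f"{len(bad)} violations found out of 500 random probes")
```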