Systematic Training and Testing for Machine Learning Using Combinatorial
Interaction Testing
- URL: http://arxiv.org/abs/2201.12428v1
- Date: Fri, 28 Jan 2022 21:33:31 GMT
- Title: Systematic Training and Testing for Machine Learning Using Combinatorial
Interaction Testing
- Authors: Tyler Cody, Erin Lanus, Daniel D. Doyle, Laura Freeman
- Abstract summary: This paper demonstrates the systematic use of coverage for selecting and characterizing test and training sets for machine learning models.
The paper addresses prior criticism of coverage and provides a rebuttal which advocates the use of coverage metrics in machine learning applications.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper demonstrates the systematic use of combinatorial coverage for
selecting and characterizing test and training sets for machine learning
models. The presented work adapts combinatorial interaction testing, which has
been successfully leveraged in identifying faults in software testing, to
characterize data used in machine learning. The MNIST hand-written digits data
is used to demonstrate that combinatorial coverage can be used to select test
sets that stress machine learning model performance, to select training sets
that lead to robust model performance, and to select data for fine-tuning
models to new domains. Thus, the results posit combinatorial coverage as a
holistic approach to training and testing for machine learning. In contrast to
prior work, which has focused on the use of coverage with regard to the internals
of neural networks, this paper considers coverage over simple features derived
from inputs and outputs. Thus, this paper addresses the case where the supplier
of test and training sets for machine learning models does not have
intellectual property rights to the models themselves. Finally, the paper
addresses prior criticism of combinatorial coverage and provides a rebuttal
which advocates the use of coverage metrics in machine learning applications.
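To make the metric concrete, here is a minimal sketch of t-way combinatorial coverage (CC) and a set-difference variant (SDCC) over discrete features, in the spirit of the abstract. This is an illustrative reading, not the authors' implementation; the function names, the tuple encoding of features, and the toy data are assumptions.

```python
from itertools import combinations

def t_way_combinations(samples, t):
    """Collect the t-way (feature indices, values) interactions that
    appear in a list of discrete feature vectors."""
    seen = set()
    for sample in samples:
        for idx in combinations(range(len(sample)), t):
            seen.add((idx, tuple(sample[i] for i in idx)))
    return seen

def combinatorial_coverage(samples, domains, t):
    """Fraction of all possible t-way value combinations that appear
    at least once in samples; domains[i] lists feature i's values."""
    total = 0
    for idx in combinations(range(len(domains)), t):
        size = 1
        for i in idx:
            size *= len(domains[i])
        total += size
    return len(t_way_combinations(samples, t)) / total

def set_difference_coverage(test, train, t):
    """Fraction of t-way interactions in test that never appear in
    train; higher values mean the test set stresses the model more."""
    test_ints = t_way_combinations(test, t)
    return len(test_ints - t_way_combinations(train, t)) / len(test_ints)

# Toy example: three binary features (e.g., thresholded image statistics)
train = [(0, 0, 0), (0, 1, 1), (1, 0, 1)]
test = [(1, 1, 0), (0, 0, 1)]
domains = [(0, 1)] * 3
print(combinatorial_coverage(train, domains, t=2))  # 0.75
print(set_difference_coverage(test, train, t=2))    # 0.5
```

Under this reading, a test set with high set-difference coverage relative to the training set exercises feature interactions absent from training, which is the sense in which coverage can select test sets that stress model performance.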
Related papers
- BanditCAT and AutoIRT: Machine Learning Approaches to Computerized Adaptive Testing and Item Calibration [7.261063083251448]
We present a complete framework for calibrating and administering a robust large-scale computerized adaptive test (CAT) with a small number of responses.
We use AutoIRT, a new method that uses automated machine learning (AutoML) in combination with item response theory (IRT).
We propose the BanditCAT framework, a methodology motivated by casting the problem in the contextual bandit setting and utilizing IRT.
arXiv Detail & Related papers (2024-10-28T13:54:10Z)
- Match me if you can: Semi-Supervised Semantic Correspondence Learning with Unpaired Images [76.47980643420375]
This paper builds on the hypothesis that learning semantic correspondences is inherently data-hungry.
We demonstrate that a simple machine annotator reliably enriches paired keypoints via machine supervision.
Our models surpass current state-of-the-art models on semantic correspondence learning benchmarks like SPair-71k, PF-PASCAL, and PF-WILLOW.
arXiv Detail & Related papers (2023-11-30T13:22:15Z)
- Provable Robustness for Streaming Models with a Sliding Window [51.85182389861261]
In deep learning applications such as online content recommendation and stock market analysis, models use historical data to make predictions.
We derive robustness certificates for models that use a fixed-size sliding window over the input stream.
Our guarantees hold for the average model performance across the entire stream and are independent of stream size, making them suitable for large data streams.
arXiv Detail & Related papers (2023-03-28T21:02:35Z)
- Active Learning with Combinatorial Coverage [0.0]
Active learning is a practical field of machine learning that automates the process of selecting which data to label.
Current methods are effective in reducing the burden of data labeling but are heavily model-reliant.
As a result, data sampled for one model often cannot be transferred to new models, and the sampling process can introduce bias.
We propose active learning methods utilizing coverage to overcome these issues.
arXiv Detail & Related papers (2023-02-28T13:43:23Z)
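Building on the coverage metric sketched earlier, the following is a minimal sketch of what model-agnostic, coverage-driven sampling for active learning could look like: greedily query the unlabeled points that add the most uncovered t-way interactions. The greedy strategy, function names, and toy data are illustrative assumptions, not the authors' algorithm.

```python
from itertools import combinations

def interactions(sample, t=2):
    """All t-way (feature indices, values) interactions in one sample."""
    return {(idx, tuple(sample[i] for i in idx))
            for idx in combinations(range(len(sample)), t)}

def coverage_query(labeled, unlabeled, budget, t=2):
    """Greedily pick up to `budget` unlabeled samples that add the most
    t-way interactions not yet covered by the labeled pool; no trained
    model is consulted, so the selection is not tied to any one model."""
    covered = set().union(*(interactions(s, t) for s in labeled))
    pool = list(unlabeled)
    picked = []
    for _ in range(min(budget, len(pool))):
        best = max(pool, key=lambda s: len(interactions(s, t) - covered))
        pool.remove(best)
        picked.append(best)
        covered |= interactions(best, t)
    return picked

# Toy example: query 2 of 4 unlabeled binary-feature points
labeled = [(0, 0, 0), (1, 1, 1)]
unlabeled = [(0, 1, 1), (1, 0, 1), (0, 0, 1), (1, 1, 0)]
print(coverage_query(labeled, unlabeled, budget=2))  # [(0, 1, 1), (1, 0, 1)]
```

Because no trained model is consulted, the selected points transfer across models, which is one way coverage-based sampling could address the transfer and bias issues noted in the entry above.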
- ALBench: A Framework for Evaluating Active Learning in Object Detection [102.81795062493536]
This paper contributes an active learning benchmark framework named ALBench for evaluating active learning in object detection.
Developed on an automatic deep model training system, this ALBench framework is easy-to-use, compatible with different active learning algorithms, and ensures the same training and testing protocols.
arXiv Detail & Related papers (2022-07-27T07:46:23Z)
- Methodology to Create Analysis-Naive Holdout Records as well as Train and Test Records for Machine Learning Analyses in Healthcare [0.0]
The purpose of the holdout sample is to preserve data for future research studies; the holdout is analysis-naive and randomly selected from the full dataset.
The methodology suggested for creating holdouts is a modification of k-fold cross validation, which takes into account randomization and efficiently allows a three-way split.
arXiv Detail & Related papers (2022-05-09T00:51:08Z)
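As a rough illustration of the holdout methodology above, the sketch below shows one plausible reading of a modified k-fold scheme that yields a three-way split with an analysis-naive holdout: shuffle once, cut k folds, and reserve whole folds for holdout, test, and training. The fold assignment and function name are assumptions, not the paper's exact procedure.

```python
import random

def three_way_split(records, k=5, holdout_folds=1, seed=0):
    """Randomly partition records into k folds, reserve `holdout_folds`
    folds as an analysis-naive holdout, and split the remaining folds
    into test and training partitions."""
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    holdout = [records[i] for f in folds[:holdout_folds] for i in f]
    test = [records[i] for i in folds[holdout_folds]]
    train = [records[i] for f in folds[holdout_folds + 1:] for i in f]
    return train, test, holdout

train, test, holdout = three_way_split(list(range(100)), k=5)
print(len(train), len(test), len(holdout))  # 60 20 20
```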
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- A comprehensive solution to retrieval-based chatbot construction [4.807955518532493]
We present an end-to-end set of solutions to take the reader from unlabelled chat logs to a deployed chatbot.
This set of solutions includes creating a self-supervised dataset and a weakly labelled dataset from chatlogs, as well as a systematic approach to selecting a fixed list of canned responses.
We find that a self-supervised contrastive learning model outperforms binary and multi-class classification models trained on the weakly labelled dataset.
arXiv Detail & Related papers (2021-06-11T02:54:33Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content and is not responsible for any consequences arising from its use.