Using Metamorphic Relations to Verify and Enhance Artcode Classification
- URL: http://arxiv.org/abs/2108.02694v1
- Date: Thu, 5 Aug 2021 15:54:56 GMT
- Title: Using Metamorphic Relations to Verify and Enhance Artcode Classification
- Authors: Liming Xu, Dave Towey, Andrew French, Steve Benford, Zhi Quan Zhou and
Tsong Yueh Chen
- Abstract summary: An example of an area facing the oracle problem is automatic image classification, using machine learning to classify an input image as one of a set of predefined classes.
An approach to software testing that alleviates the oracle problem is metamorphic testing (MT).
This paper examines the problem of classifying images containing visually hidden markers called Artcodes, and applies MT to verify and enhance the trained classifiers.
- Score: 39.36253474867746
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software testing is often hindered where it is impossible or impractical to
determine the correctness of the behaviour or output of the software under test
(SUT), a situation known as the oracle problem. An example of an area facing
the oracle problem is automatic image classification, using machine learning to
classify an input image as one of a set of predefined classes. An approach to
software testing that alleviates the oracle problem is metamorphic testing
(MT). While traditional software testing examines the correctness of individual
test cases, MT instead examines the relations amongst multiple executions of
test cases and their outputs. These relations are called metamorphic relations
(MRs): if an MR is found to be violated, then a fault must exist in the SUT.
This paper examines the problem of classifying images containing visually
hidden markers called Artcodes, and applies MT to verify and enhance the
trained classifiers. This paper further examines two MRs, Separation and
Occlusion, and reports on their capability in verifying the image
classification using one-way analysis of variance (ANOVA) in conjunction with
three other statistical analysis methods: t-test (for unequal variances),
Kruskal-Wallis test, and Dunnett's test. In addition to our previously-studied
classifier, that used Random Forests, we introduce a new classifier that uses a
support vector machine, and present its MR-augmented version. Experimental
evaluations across a number of performance metrics show that the augmented
classifiers can achieve better performance than non-augmented classifiers. This
paper also analyses how the enhanced performance is obtained.
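To make the metamorphic-testing idea concrete, below is a minimal sketch (not the authors' implementation) of how an MR such as Occlusion could be checked against a trained Artcode classifier, comparing source and follow-up outputs with one of the statistical tests named in the abstract. The functions `classify` and `occlude` are hypothetical placeholders; the actual MRs, transformations, and feature pipeline are defined in the paper.

```python
# Minimal sketch, under assumed interfaces, of checking a metamorphic relation
# (here: Occlusion) against a trained Artcode classifier.
import numpy as np
from scipy import stats


def check_mr(classify, occlude, images):
    """Return source scores, follow-up scores, and a Kruskal-Wallis p-value.

    classify(image) -> float in [0, 1]  (assumed score that the image contains an Artcode)
    occlude(image)  -> transformed image (the follow-up test case for the MR)
    """
    source = np.array([classify(img) for img in images])
    followup = np.array([classify(occlude(img)) for img in images])

    # Non-parametric comparison of the two score samples; the paper also reports
    # one-way ANOVA, the t-test for unequal variances, and Dunnett's test.
    _, p_value = stats.kruskal(source, followup)
    return source, followup, p_value
```

In this sketch, a large p-value is consistent with the MR holding, while a significant shift between source and follow-up scores points to a violation and hence a fault or weakness in the classifier; the same source/follow-up outputs could, in principle, also be used to build an MR-augmented classifier as the paper describes.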
Related papers
- Deep anytime-valid hypothesis testing [29.273915933729057]
We propose a general framework for constructing powerful, sequential hypothesis tests for nonparametric testing problems.
We develop a principled approach of leveraging the representation capability of machine learning models within the testing-by-betting framework.
Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines.
arXiv Detail & Related papers (2023-10-30T09:46:19Z) - Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluating deep learning models is to build a labeled test set with attributes of interest and assess how well the model performs on it.
This paper argues that Zero-shot Model Diagnosis (ZOOM) is possible without the need for a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z) - Active Sequential Two-Sample Testing [18.99517340397671]
We consider the two-sample testing problem in a new scenario where sample measurements are inexpensive to access.
We devise the first active two-sample testing framework that not only sequentially but also actively queries sample measurements.
In practice, we introduce an instantiation of our framework and evaluate it using several experiments.
arXiv Detail & Related papers (2023-01-30T02:23:49Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Visualizing Classifier Adjacency Relations: A Case Study in Speaker
Verification and Voice Anti-Spoofing [72.4445825335561]
We propose a simple method to derive a 2D representation from detection scores produced by an arbitrary set of binary classifiers.
Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
arXiv Detail & Related papers (2021-06-11T13:03:33Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z) - Evaluating and Mitigating Bias in Image Classifiers: A Causal
Perspective Using Counterfactuals [27.539001365348906]
We present a method for generating counterfactuals by incorporating a structural causal model (SCM) in an improved variant of Adversarially Learned Inference (ALI).
We show how to explain a pre-trained machine learning classifier, evaluate its bias, and mitigate the bias using a counterfactual regularizer.
arXiv Detail & Related papers (2020-09-17T13:19:31Z) - I Am Going MAD: Maximum Discrepancy Competition for Comparing
Classifiers Adaptively [135.7695909882746]
We introduce the MAximum Discrepancy (MAD) competition for comparing classifiers adaptively.
We adaptively sample a small test set from an arbitrarily large corpus of unlabeled images.
Human labeling on the resulting model-dependent image sets reveals the relative performance of the competing classifiers.
arXiv Detail & Related papers (2020-02-25T03:32:29Z) - Object-based Metamorphic Testing through Image Structuring [0.6445605125467573]
Testing software is often costly due to the need to mass-produce test cases and provide a test oracle for them.
One method that has been proposed in order to alleviate the oracle problem is metamorphic testing.
arXiv Detail & Related papers (2020-02-12T10:32:18Z)