Vision Checklist: Towards Testable Error Analysis of Image Models to
Help System Designers Interrogate Model Capabilities
- URL: http://arxiv.org/abs/2201.11674v3
- Date: Mon, 31 Jan 2022 11:09:19 GMT
- Title: Vision Checklist: Towards Testable Error Analysis of Image Models to
Help System Designers Interrogate Model Capabilities
- Authors: Xin Du, Benedicte Legastelois, Bhargavi Ganesh, Ajitha Rajan, Hana
Chockler, Vaishak Belle, Stuart Anderson, Subramanian Ramamoorthy
- Abstract summary: Vision Checklist is a framework aimed at interrogating the capabilities of a model in order to produce a report that can be used by a system designer for robustness evaluations.
Our framework is evaluated on multiple datasets like Tinyimagenet, CIFAR10, CIFAR100 and Camelyon17 and for models like ViT and Resnet.
- Score: 26.177391265710362
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Using large pre-trained models for image recognition tasks is becoming
increasingly common owing to the well-acknowledged success of recent models
such as vision transformers and CNN-based models like VGG and Resnet. The
high accuracy of these models on benchmark tasks has translated into their
practical use across many domains including safety-critical applications like
autonomous driving and medical diagnostics. Despite their widespread use, image
models have been shown to be fragile to changes in the operating environment,
bringing their robustness into question. There is an urgent need for methods
that systematically characterise and quantify the capabilities of these models
to help designers understand and provide guarantees about their safety and
robustness. In this paper, we propose Vision Checklist, a framework aimed at
interrogating the capabilities of a model in order to produce a report that can
be used by a system designer for robustness evaluations. This framework
proposes a set of perturbation operations that can be applied on the underlying
data to generate test samples of different types. The perturbations reflect
potential changes in operating environments, and interrogate various properties
ranging from the strictly quantitative to more qualitative. Our framework is
evaluated on multiple datasets like Tinyimagenet, CIFAR10, CIFAR100 and
Camelyon17 and for models like ViT and Resnet. Our Vision Checklist proposes a
specific set of evaluations that can be integrated into the previously proposed
concept of a model card. Robustness evaluations like our checklist will be
crucial in future safety evaluations of visual perception modules, and will be
useful for a wide range of stakeholders including designers, deployers, and
regulators involved in the certification of these systems. The source code of
Vision Checklist will be made publicly available.
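The abstract's core idea, applying perturbation operations to the underlying data and reporting how a model behaves on the resulting test samples, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`adjust_brightness`, `add_gaussian_noise`, `rotate_90`, `robustness_report`), the chosen perturbations, and the `toy_predict` stand-in model are all hypothetical, and the data is synthetic.

```python
import numpy as np


def adjust_brightness(images, delta):
    """Shift pixel intensities by delta, keeping values in [0, 1]."""
    return np.clip(images + delta, 0.0, 1.0)


def add_gaussian_noise(images, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, sigma, images.shape)
    return np.clip(noisy, 0.0, 1.0)


def rotate_90(images, k=1):
    """Rotate each image in an (N, H, W) batch by k * 90 degrees."""
    return np.rot90(images, k=k, axes=(1, 2))


def robustness_report(predict, images, labels, perturbations):
    """Return accuracy on clean data and under each named perturbation."""
    report = {"clean": float(np.mean(predict(images) == labels))}
    for name, perturb in perturbations.items():
        report[name] = float(np.mean(predict(perturb(images)) == labels))
    return report


def toy_predict(images):
    """Hypothetical stand-in model: class 1 if mean intensity exceeds 0.5."""
    return (images.mean(axis=(1, 2)) > 0.5).astype(int)


# Synthetic data (an assumption for illustration): four dark, four bright images.
images = np.concatenate([np.full((4, 8, 8), 0.3), np.full((4, 8, 8), 0.7)])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

perturbations = {
    "brightness+0.3": lambda x: adjust_brightness(x, 0.3),
    "gaussian_noise": lambda x: add_gaussian_noise(x, 0.05),
    "rotate_90": rotate_90,
}

report = robustness_report(toy_predict, images, labels, perturbations)
for name, accuracy in report.items():
    print(f"{name}: {accuracy:.2f}")
```

In this toy setting the stand-in model is unaffected by rotation and mild noise but loses accuracy under the brightness shift, which is the kind of per-perturbation breakdown a designer could record alongside a model card.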
Related papers
- BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- Sequential Modeling Enables Scalable Learning for Large Vision Models [120.91839619284431]
We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.
We define a common format, "visual sentences", in which we can represent raw images and videos as well as annotated data sources.
arXiv Detail & Related papers (2023-12-01T18:59:57Z)
- Visual Analytics for Generative Transformer Models [28.251218916955125]
We present a novel visual analytical framework to support the analysis of transformer-based generative networks.
Our framework is one of the first dedicated to supporting the analysis of transformer-based encoder-decoder models.
arXiv Detail & Related papers (2023-11-21T08:15:01Z)
- Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluate deep learning models is to build a labeled test set with attributes of interest and assess how well it performs.
This paper argues the case that Zero-shot Model Diagnosis (ZOOM) is possible without the need for a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z)
- ComplAI: Theory of A Unified Framework for Multi-factor Assessment of
Black-Box Supervised Machine Learning Models [6.279863832853343]
ComplAI is a unique framework to enable, observe, analyze and quantify explainability, robustness, performance, fairness, and model behavior.
It evaluates different supervised Machine Learning models not just on their ability to make correct predictions but also from an overall responsibility perspective.
arXiv Detail & Related papers (2022-12-30T08:48:19Z)
- ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial
Viewpoints [42.64942578228025]
We propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models.
By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints.
arXiv Detail & Related papers (2022-10-08T03:06:49Z)
- Advancing Plain Vision Transformer Towards Remote Sensing Foundation
Model [97.9548609175831]
We resort to plain vision transformers with about 100 million parameters and make the first attempt to propose large vision models customized for remote sensing tasks.
Specifically, to handle the large image size and objects of various orientations in RS images, we propose a new rotated varied-size window attention.
Experiments on detection tasks demonstrate the superiority of our model over all state-of-the-art models, achieving 81.16% mAP on the DOTA-V1.0 dataset.
arXiv Detail & Related papers (2022-08-08T09:08:40Z)
- An Intelligent Hybrid Model for Identity Document Classification [0.0]
Digitization may provide opportunities (e.g., increase in productivity, disaster recovery, and environmentally friendly solutions) and challenges for businesses.
One of the main challenges would be to accurately classify numerous scanned documents uploaded every day by customers.
There are not many studies available to address the challenge as an application of image classification.
The proposed approach has been implemented using Python and experimentally validated on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-06-07T13:08:00Z)
- Incorporating Vision Bias into Click Models for Image-oriented Search
Engine [51.192784793764176]
In this paper, we assume that vision bias exists in an image-oriented search engine as another crucial factor affecting the examination probability aside from position.
We use a regression-based EM algorithm to predict the vision bias given the visual features extracted from candidate documents.
arXiv Detail & Related papers (2021-01-07T10:01:31Z)