A Comparison for Anti-noise Robustness of Deep Learning Classification
Methods on a Tiny Object Image Dataset: from Convolutional Neural Network to
Visual Transformer and Performer
- URL: http://arxiv.org/abs/2106.01927v1
- Date: Thu, 3 Jun 2021 15:28:17 GMT
- Authors: Ao Chen, Chen Li, Haoyuan Chen, Hechen Yang, Peng Zhao, Weiming Hu,
Wanli Liu, Shuojia Zou, and Marcin Grzegorzek
- Abstract summary: We first briefly review the development of Convolutional Neural Network and Visual Transformer in deep learning.
We then use various models of Convolutional Neural Network and Visual Transformer to conduct a series of experiments on the image dataset of tiny objects.
We discuss the problems in the classification of tiny objects and outline prospects for this task in the future.
- Score: 27.023667473278266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image classification has achieved unprecedented advances with the rapid
development of deep learning. However, the classification of tiny object images
is still not well investigated. In this paper, we first briefly review the
development of Convolutional Neural Network and Visual Transformer in deep
learning, and introduce the sources and development of conventional noises and
adversarial attacks. Then we use various models of Convolutional Neural Network
and Visual Transformer to conduct a series of experiments on the image dataset
of tiny objects (sperms and impurities), and compare various evaluation metrics
in the experimental results to obtain a model with stable performance. Finally,
we discuss the problems in the classification of tiny objects and outline
prospects for this task in the future.
Related papers
- Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks? [4.9260675787714]
Image classification models, including convolutional neural networks (CNNs), perform well on a variety of classification tasks but struggle under partial occlusion.
We contribute the Image Recognition Under Occlusion (IRUO) dataset, based on the recently developed Occluded Video Instance Segmentation (OVIS) dataset (arXiv:2102.01558).
We find that modern CNN-based models show improved recognition accuracy on occluded images compared to earlier CNN-based models, and ViT-based models are more accurate than CNN-based models on occluded images.
arXiv Detail & Related papers (2024-09-16T23:21:22Z) - Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Finding Differences Between Transformers and ConvNets Using
Counterfactual Simulation Testing [82.67716657524251]
We present a counterfactual framework that allows us to study the robustness of neural networks with respect to naturalistic variations.
Our method allows for a fair comparison of the robustness of recently released, state-of-the-art Convolutional Neural Networks and Vision Transformers.
arXiv Detail & Related papers (2022-11-29T18:59:23Z) - A Comprehensive Study of Image Classification Model Sensitivity to
Foregrounds, Backgrounds, and Visual Attributes [58.633364000258645]
We introduce a dataset, RIVAL10, consisting of roughly $26k$ instances over $10$ classes.
We evaluate the sensitivity of a broad set of models to noise corruptions in foregrounds, backgrounds and attributes.
In our analysis, we consider diverse state-of-the-art architectures (ResNets, Transformers) and training procedures (CLIP, SimCLR, DeiT, Adversarial Training).
arXiv Detail & Related papers (2022-01-26T06:31:28Z) - Experience feedback using Representation Learning for Few-Shot Object
Detection on Aerial Images [2.8560476609689185]
The performance of our method is assessed on DOTA, a large-scale remote sensing images dataset.
It highlights in particular some intrinsic weaknesses for the few-shot object detection task.
arXiv Detail & Related papers (2021-09-27T13:04:53Z) - A Comparison of Deep Learning Classification Methods on Small-scale
Image Data set: from Converlutional Neural Networks to Visual Transformers [18.58928427116305]
This article explains the application and characteristics of convolutional neural networks and visual transformers.
A series of experiments are carried out on the small datasets by using various models.
The recommended deep learning model is given according to the model application environment.
arXiv Detail & Related papers (2021-07-16T04:13:10Z) - Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - Understanding invariance via feedforward inversion of discriminatively
trained classifiers [30.23199531528357]
Past research has discovered that some extraneous visual detail remains in the output logits.
We develop a feedforward inversion model that produces remarkably high fidelity reconstructions.
Our approach is based on BigGAN, with conditioning on logits instead of one-hot class labels.
arXiv Detail & Related papers (2021-03-15T17:56:06Z) - Comparative evaluation of CNN architectures for Image Caption Generation [1.2183405753834562]
We have evaluated 17 different Convolutional Neural Networks on two popular Image Caption Generation frameworks.
We observe that the model complexity of a Convolutional Neural Network, as measured by its number of parameters, and its accuracy on the Object Recognition task do not necessarily correlate with its efficacy at feature extraction for the Image Caption Generation task.
arXiv Detail & Related papers (2021-02-23T05:43:54Z) - Adversarially-Trained Deep Nets Transfer Better: Illustration on Image
Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.