Improving Performance of Object Detection using the Mechanisms of Visual
Recognition in Humans
- URL: http://arxiv.org/abs/2301.09667v1
- Date: Mon, 23 Jan 2023 19:09:36 GMT
- Title: Improving Performance of Object Detection using the Mechanisms of Visual
Recognition in Humans
- Authors: Amir Ghasemi, Fatemeh Mottaghian, Akram Bayat
- Abstract summary: We first track the performance of the state-of-the-art deep object recognition network, Faster-RCNN, as a function of image resolution.
The results also show that different spatial frequencies convey different information about the objects in the recognition process.
We propose a multi-resolution object recognition framework rather than a single-resolution network.
- Score: 0.4297070083645048
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Object recognition systems are usually trained and evaluated on high
resolution images. However, in real world applications, it is common that the
images have low resolutions or have small sizes. In this study, we first track
the performance of the state-of-the-art deep object recognition network,
Faster-RCNN, as a function of image resolution. The results reveal negative
effects of low-resolution images on recognition performance. They also show
that different spatial frequencies convey different information about the
objects in the recognition process. This means that a multi-resolution
recognition system can provide better insight into the optimal selection of
features, which results in better recognition of objects. This is similar to
the mechanisms of the human visual system, which is able to build a
multi-scale representation of a visual scene simultaneously. Then, we propose
a multi-resolution object
recognition framework rather than a single-resolution network. The proposed
framework is evaluated on the PASCAL VOC2007 database. The experimental results
show that our adapted multi-resolution Faster-RCNN framework outperforms the
single-resolution Faster-RCNN on input images with various resolutions, with an
increase in mean Average Precision (mAP) of 9.14% across all resolutions and
1.2% on the full-spectrum images. Furthermore, the proposed model remains
robust over a wide range of spatial frequencies.
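The abstract describes tracking detector performance as a function of image resolution, but it does not say how the reduced-resolution inputs were produced. The following is a minimal sketch of one plausible way to generate such inputs, assuming block averaging followed by nearest-neighbour upsampling; the function names `reduce_resolution` and `resolution_sweep` and the chosen factors are hypothetical and are not taken from the paper.

```python
import numpy as np


def reduce_resolution(image: np.ndarray, factor: int) -> np.ndarray:
    """Simulate a low-resolution input (an assumed degradation, not the paper's).

    Averages factor x factor pixel blocks, then repeats each averaged pixel so
    the output keeps the (cropped) input size. Works for H x W and H x W x C.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    h, w = image.shape[:2]
    # Crop so that both sides are divisible by the block size.
    h_crop, w_crop = h - h % factor, w - w % factor
    if h_crop == 0 or w_crop == 0:
        raise ValueError("image is smaller than the block size")
    cropped = image[:h_crop, :w_crop].astype(np.float64)
    # Group pixels into factor x factor blocks and average each block.
    blocks = cropped.reshape(h_crop // factor, factor, w_crop // factor, factor, -1)
    coarse = blocks.mean(axis=(1, 3))
    # Nearest-neighbour upsampling back to the cropped size.
    restored = coarse.repeat(factor, axis=0).repeat(factor, axis=1)
    return restored.reshape(cropped.shape)


def resolution_sweep(image: np.ndarray, factors=(1, 2, 4, 8)) -> dict:
    """Return one degraded copy of the image per reduction factor."""
    return {f: reduce_resolution(image, f) for f in factors}
```

Each image returned by `resolution_sweep` could then be passed to a detector and scored with mAP to obtain a performance-versus-resolution curve. Block averaging removes high spatial frequencies, which is why it is used here as a stand-in for lower resolution.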
Related papers
- Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP).
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z)
- ResFormer: Scaling ViTs with Multi-Resolution Training [100.01406895070693]
We introduce ResFormer, a framework for improved performance on a wide spectrum of, mostly unseen, testing resolutions.
In particular, ResFormer operates on replicated images of different resolutions and enforces a scale consistency loss to engage interactive information across different scales.
Moreover, we demonstrate that ResFormer is flexible and can be easily extended to semantic segmentation, object detection and video action recognition.
arXiv Detail & Related papers (2022-12-01T18:57:20Z)
- Super-Resolution and Image Re-projection for Iris Recognition [67.42500312968455]
Convolutional Neural Networks (CNNs) using different deep learning approaches attempt to recover realistic texture and fine-grained details from low-resolution images.
In this work we explore the viability of these approaches for iris Super-Resolution (SR) in an iris recognition environment.
Results show that CNNs and image re-projection can improve the results, especially the accuracy of recognition systems.
arXiv Detail & Related papers (2022-10-20T09:46:23Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Analysis and evaluation of Deep Learning based Super-Resolution algorithms to improve performance in Low-Resolution Face Recognition [0.0]
Super-resolution algorithms may be able to recover the discriminant properties of the subjects involved.
This project aimed at evaluating and adapting different deep neural network architectures for the task of face super-resolution.
Experiments showed that general super-resolution architectures might enhance face verification performance of deep neural networks trained on high-resolution faces.
arXiv Detail & Related papers (2021-01-19T02:41:57Z)
- High Quality Remote Sensing Image Super-Resolution Using Deep Memory Connected Network [21.977093907114217]
Single image super-resolution is crucial for many applications such as target detection and image classification.
We propose a novel method named deep memory connected network (DMCN) based on a convolutional neural network to reconstruct high-quality super-resolution images.
arXiv Detail & Related papers (2020-10-01T15:06:02Z)
- Multi-image Super Resolution of Remotely Sensed Images using Residual Feature Attention Deep Neural Networks [1.3764085113103222]
The presented research proposes a novel residual attention model (RAMS) that efficiently tackles the multi-image super-resolution task.
We introduce the mechanism of visual feature attention with 3D convolutions in order to obtain an aware data fusion and information extraction.
Our representation learning network makes extensive use of nestled residual connections to let redundant low-frequency signals flow through.
arXiv Detail & Related papers (2020-07-06T22:54:02Z)
- Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Faces [7.634398926381845]
Super-resolution methods are often used to enhance low-resolution images, but their performance on the FER task is limited on images of very low resolution.
In this work, inspired by feature super-resolution methods for object detection, we propose a novel generative adversarial network-based super-resolution method for robust facial expression recognition.
arXiv Detail & Related papers (2020-04-05T15:38:47Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
- Cross-Resolution Adversarial Dual Network for Person Re-Identification and Beyond [59.149653740463435]
Person re-identification (re-ID) aims at matching images of the same person across camera views.
Due to varying distances between cameras and persons of interest, resolution mismatch can be expected.
We propose a novel generative adversarial network to address cross-resolution person re-ID.
arXiv Detail & Related papers (2020-02-19T07:21:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.