Related papers: Fast and Accurate Object Detection on Asymmetrical Receptive Field

Fast and Accurate Object Detection on Asymmetrical Receptive Field

URL: http://arxiv.org/abs/2303.08995v1
Date: Wed, 15 Mar 2023 23:59:18 GMT
Title: Fast and Accurate Object Detection on Asymmetrical Receptive Field
Authors: Liguo Zhou, Tianhao Lin, Alois Knoll
Abstract summary: This article proposes methods for improving object detection accuracy from the perspective of changing receptive fields. The structure of the head part of YOLOv5 is modified by adding asymmetrical pooling layers. The performances of the new model in this article are compared with original YOLOv5 model and analyzed from several parameters.
Score: 4.392212820170972
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Object detection has been used in a wide range of industries. For example, in autonomous driving, the task of object detection is to accurately and efficiently identify and locate a large number of predefined classes of object instances (vehicles, pedestrians, traffic signs, etc.) from videos of roads. In robotics, the industry robot needs to recognize specific machine elements. In the security field, the camera should accurately recognize each face of people. With the wide application of deep learning, the accuracy and efficiency of object detection have been greatly improved, but object detection based on deep learning still faces challenges. Different applications of object detection have different requirements, including highly accurate detection, multi-category object detection, real-time detection, robustness to occlusions, etc. To address the above challenges, based on extensive literature research, this paper analyzes methods for improving and optimizing mainstream object detection algorithms from the perspective of evolution of one-stage and two-stage object detection algorithms. Furthermore, this article proposes methods for improving object detection accuracy from the perspective of changing receptive fields. The new model is based on the original YOLOv5 (You Look Only Once) with some modifications. The structure of the head part of YOLOv5 is modified by adding asymmetrical pooling layers. As a result, the accuracy of the algorithm is improved while ensuring the speed. The performances of the new model in this article are compared with original YOLOv5 model and analyzed from several parameters. And the evaluation of the new model is presented in four situations. Moreover, the summary and outlooks are made on the problems to be solved and the research directions in the future.

Related papers

Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning [51.170479006249195]
We introduce a new dataset, benchmark, and a dynamic coarse-to-fine learning scheme in this study. Our proposed dataset, AI-TOD-R, features the smallest object sizes among all oriented object detection datasets. We present a benchmark spanning a broad range of detection paradigms, including both fully-supervised and label-efficient approaches.
arXiv Detail & Related papers (2024-12-16T09:14:32Z)
Accelerating Object Detection with YOLOv4 for Real-Time Applications [0.276240219662896]
Convolutional Neural Network (CNN) have emerged as a powerful tool for recognizing image content and in computer vision approach for most problems. This paper introduces the brief introduction of deep learning and object detection framework like Convolutional Neural Network(CNN)
arXiv Detail & Related papers (2024-10-17T17:44:57Z)
Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head. The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement. This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
SalienDet: A Saliency-based Feature Enhancement Algorithm for Object Detection for Autonomous Driving [160.57870373052577]
We propose a saliency-based OD algorithm (SalienDet) to detect unknown objects. Our SalienDet utilizes a saliency-based algorithm to enhance image features for object proposal generation. We design a dataset relabeling approach to differentiate the unknown objects from all objects in training sample set to achieve Open-World Detection.
arXiv Detail & Related papers (2023-05-11T16:19:44Z)
Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes. Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset. To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture. It reduces computations by separating objects from background using a very lite first-stage. Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
Unsupervised Domain Adaption of Object Detectors: A Survey [87.08473838767235]
Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications. Learning highly accurate models relies on the availability of datasets with a large number of annotated images. Due to this, model performance drops drastically when evaluated on label-scarce datasets having visually distinct images.
arXiv Detail & Related papers (2021-05-27T23:34:06Z)
Improved detection of small objects in road network sequences [0.0]
We propose a new procedure for detecting small-scale objects by applying super-resolution processes based on detections performed by convolutional neural networks. This work has been tested for a set of traffic images containing elements of different scales to test the efficiency according to the detections obtained by the model.
arXiv Detail & Related papers (2021-05-18T10:13:23Z)
Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely textbfslender objects. For a classical object detection method, a drastic drop of $18.9%$ mAP on COCO is observed, if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
Real Time Multi-Class Object Detection and Recognition Using Vision Augmentation Algorithm [0.0]
We introduce a novel real time detection algorithm which employs upsampling and skip connection to extract multiscale features at different convolution levels in a learning task. The detection precision of the model is shown to be higher and faster than that of the state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T01:08:24Z)
Real-Time Object Detection and Recognition on Low-Compute Humanoid Robots using Deep Learning [0.12599533416395764]
We describe a novel architecture that enables multiple low-compute NAO robots to perform real-time detection, recognition and localization of objects in its camera view. The proposed algorithm for object detection and localization is an empirical modification of YOLOv3, based on indoor experiments in multiple scenarios. The architecture also comprises of an effective end-to-end pipeline to feed the real-time frames from the camera feed to the neural net and use its results for guiding the robot.
arXiv Detail & Related papers (2020-01-20T05:24:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.