Multi-Grid Redundant Bounding Box Annotation for Accurate Object
Detection
- URL: http://arxiv.org/abs/2201.01857v1
- Date: Wed, 5 Jan 2022 23:01:55 GMT
- Title: Multi-Grid Redundant Bounding Box Annotation for Accurate Object
Detection
- Authors: Solomon Negussie Tesema, El-Bay Bourennane
- Abstract summary: YOLOv3 is a state-of-the-art one-shot detector that takes in an input image and divides it into an equal-sized grid matrix.
This paper presents a new mathematical approach that assigns multiple grids per object for accurate, tight-fitting bounding box prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern leading object detectors are either two-stage or one-stage networks
repurposed from a deep CNN-based backbone classifier network. YOLOv3 is one
such well-known state-of-the-art one-shot detector that takes in an input
image and divides it into an equal-sized grid matrix. The grid cell having the
center of an object is the one responsible for detecting the particular object.
This paper presents a new mathematical approach that assigns multiple grids per
object for accurate, tight-fitting bounding box prediction. We also propose an
effective offline copy-paste data augmentation for object detection. Our
proposed method significantly outperforms some current state-of-the-art object
detectors, with the prospect of further improvement.
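The grid assignment described in the abstract can be illustrated with a minimal sketch. The first function follows the stated rule that the cell containing the object's center is responsible for it. The second function is only a hypothetical illustration of assigning several cells to one object: the abstract does not give the paper's actual formulation, so the choice of a square neighborhood, the `radius` parameter, and all function names are assumptions.

```python
from typing import List, Tuple


def center_cell(cx: float, cy: float, grid_size: int) -> Tuple[int, int]:
    """Return (row, col) of the grid cell containing a box center.

    cx and cy are the center coordinates normalized to [0, 1].
    """
    # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    return row, col


def neighbor_cells(
    row: int, col: int, grid_size: int, radius: int = 1
) -> List[Tuple[int, int]]:
    """Hypothetical multi-cell assignment (an assumption, not the paper's rule).

    Returns the center cell plus every in-bounds cell within `radius`
    rows and columns of it.
    """
    cells = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if 0 <= r < grid_size and 0 <= c < grid_size:
                cells.append((r, c))
    return cells


if __name__ == "__main__":
    # An object centered in the image on a 13 x 13 grid.
    row, col = center_cell(0.5, 0.5, 13)
    print("center cell:", (row, col))
    print("cells assigned:", len(neighbor_cells(row, col, 13)))
```

With a single responsible cell, each object contributes one positive target per scale. Assigning neighboring cells as well gives the network several chances to predict the same box, which is the general idea the abstract points to.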
Related papers
- Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection [54.041049052843604]
We present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection.
First, to model the inter-object spatial interaction and complex temporal dependencies, we introduce the spatial-temporal graph attention network.
Finally, it poses a challenge for the network to distinguish between the positive query and other highly similar queries that are not the best match.
arXiv Detail & Related papers (2023-07-01T13:53:14Z)
- Linear Object Detection in Document Images using Multiple Object Tracking [58.720142291102135]
Linear objects convey substantial information about document structure.
Many approaches can recover some vector representation, but only one closed-source technique, introduced in 1994, performs accurate instance segmentation of linear objects.
We propose a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking.
arXiv Detail & Related papers (2023-05-26T14:22:03Z)
- Object Detection in Aerial Images with Uncertainty-Aware Graph Network [61.02591506040606]
We propose a novel uncertainty-aware object detection framework with a structured graph, where nodes and edges are denoted by objects.
We refer to our model as Uncertainty-Aware Graph network for object DETection (UAGDet).
arXiv Detail & Related papers (2022-08-23T07:29:03Z)
- Double-Dot Network for Antipodal Grasp Detection [20.21384585441404]
This paper proposes a new deep learning approach to antipodal grasp detection, named Double-Dot Network (DD-Net).
It follows the recent anchor-free object detection framework, which does not depend on empirically pre-set anchors.
An effective CNN architecture is introduced to localize such fingertips, and with the help of auxiliary centers for refinement, it accurately and robustly infers grasp candidates.
arXiv Detail & Related papers (2021-08-03T14:21:17Z)
- Single Object Tracking through a Fast and Effective Single-Multiple Model Convolutional Neural Network [0.0]
Recent state-of-the-art (SOTA) approaches rely on a matching network with a heavy structure to distinguish the target from other objects in the area.
This article proposes a special architecture that, in contrast to previous approaches, makes it possible to identify the object location in a single shot.
The presented tracker performs comparably to the SOTA in challenging situations while running much faster (up to 120 FPS on a 1080 Ti).
arXiv Detail & Related papers (2021-03-28T11:02:14Z)
- Ensembling object detectors for image and video data analysis [98.26061123111647]
We propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data.
We extend it to video data by proposing a two-stage tracking-based scheme for detection refinement.
arXiv Detail & Related papers (2021-02-09T12:38:16Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- IterDet: Iterative Scheme for Object Detection in Crowded Environments [6.2997667081978825]
Deep learning-based detectors usually produce a redundant set of object bounding boxes.
These boxes are filtered using non-maximum suppression (NMS) in order to select exactly one bounding box per object of interest.
This greedy scheme is simple and provides sufficient accuracy for isolated objects but often fails in crowded environments.
In this work we develop an alternative iterative scheme, where a new subset of objects is detected at each iteration.
arXiv Detail & Related papers (2020-05-12T12:04:27Z)
- Detective: An Attentive Recurrent Model for Sparse Object Detection [25.5804429439316]
Detective is an attentive object detector that identifies objects in images in a sequential manner.
Detective is a sparse object detector that generates a single bounding box per object instance.
We propose a training mechanism based on the Hungarian algorithm and a loss that balances the localization and classification tasks.
arXiv Detail & Related papers (2020-04-25T17:41:52Z)
- BiDet: An Efficient Binarized Object Detector [96.19708396510894]
We propose a binarized neural network learning method called BiDet for efficient object detection.
Our BiDet fully utilizes the representational capacity of the binary neural networks for object detection by redundancy removal.
Our method outperforms the state-of-the-art binary neural networks by a sizable margin.
arXiv Detail & Related papers (2020-03-09T08:16:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.