QuickBrowser: A Unified Model to Detect and Read Simple Object in
Real-time
- URL: http://arxiv.org/abs/2102.07354v1
- Date: Mon, 15 Feb 2021 05:47:40 GMT
- Title: QuickBrowser: A Unified Model to Detect and Read Simple Object in
Real-time
- Authors: Thao Do and Daeyoung Kim
- Abstract summary: This work aims to solve the detect-and-read problem in a lightweight way by integrating multi-digit recognition into a one-stage object detection model.
Our choice of backbones and modifications in architecture, loss function, data augmentation and training make the method robust, efficient and speedy.
- Score: 3.098115480186737
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: There are many real-life use cases, such as barcode scanning or billboard
reading, where people need to detect objects and read their contents.
Existing methods commonly first localize object regions, then determine the
layout, and lastly classify content units. However, for simple fixed-structure
objects like license plates, this approach is overkill and slow to run.
This work aims to solve this detect-and-read problem in a
lightweight way by integrating multi-digit recognition into a one-stage object
detection model. Our unified method not only eliminates the duplication in
feature extraction (one for localizing, one again for classifying) but also
provides useful contextual information around object regions for
classification. Additionally, our choice of backbones and modifications in
architecture, loss function, data augmentation and training make the method
robust, efficient and speedy. We also built a public benchmark dataset of
diverse real-life 1D barcodes for reliable evaluation, which we collected,
annotated and checked carefully. Finally, experimental results demonstrate the
method's efficiency on the barcode problem: it outperforms industrial tools in
both detection and decoding rates while running at a real-time frame rate at a
VGA-like resolution. As expected, it also performs well on the license-plate
recognition task (on the AOLP dataset), significantly outperforming the current
state-of-the-art method in both recognition rate and inference time.
Related papers
- Bayesian Detector Combination for Object Detection with Crowdsourced Annotations [49.43709660948812]
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise.
We propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations.
BDC is model-agnostic, requires no prior knowledge of the annotators' skill level, and seamlessly integrates with existing object detection models.
arXiv Detail & Related papers (2024-07-10T18:00:54Z)
- Improving Online Lane Graph Extraction by Object-Lane Clustering [106.71926896061686]
We propose an architecture and loss formulation to improve the accuracy of local lane graph estimates.
The proposed method learns to assign the objects to centerlines by considering the centerlines as cluster centers.
We show that our method can achieve significant performance improvements by using the outputs of existing 3D object detection methods.
arXiv Detail & Related papers (2023-07-20T15:21:28Z)
- DOAD: Decoupled One Stage Action Detection Network [77.14883592642782]
Localizing people and recognizing their actions from videos is a challenging task towards high-level video understanding.
Existing methods are mostly two-stage based, with one stage for person bounding box generation and the other stage for action recognition.
We present a decoupled one-stage network, dubbed DOAD, to improve the efficiency of spatio-temporal action detection.
arXiv Detail & Related papers (2023-04-01T08:06:43Z)
- Fast and Accurate Object Detection on Asymmetrical Receptive Field [0.0]
This article proposes methods for improving object detection accuracy from the perspective of changing receptive fields.
The structure of the head part of YOLOv5 is modified by adding asymmetrical pooling layers.
The performance of the new model is compared with that of the original YOLOv5 model and analyzed across several parameters.
arXiv Detail & Related papers (2023-03-15T23:59:18Z)
- One-Shot General Object Localization [43.88712478006662]
OneLoc is a general one-shot object localization algorithm.
OneLoc efficiently finds the object center and bounding box size by a special voting scheme.
Experiments show that the proposed method achieves state-of-the-art overall performance on two datasets.
arXiv Detail & Related papers (2022-11-24T03:14:04Z)
- Seeing BDD100K in dark: Single-Stage Night-time Object Detection via
Continual Fourier Contrastive Learning [3.4012007729454816]
Night-time object detection has been studied only sparsely, and even then via non-uniform evaluation protocols among the limited available papers.
In this paper, we bridge these 3 gaps:
the lack of a uniform evaluation protocol (using a single-stage detector, due to its efficacy and efficiency);
a choice of dataset for benchmarking night-time object detection; and
a novel method to address the limitations of current alternatives.
arXiv Detail & Related papers (2021-12-06T09:28:45Z)
- Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit
Localization Inference [78.41932738265345]
This paper proposes a plug-and-play detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z)
- Multi-patch Feature Pyramid Network for Weakly Supervised Object
Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net).
MPFP-Net is different from the current models that during training only pursue the most discriminative patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Towards End-to-end Car License Plate Location and Recognition in
Unconstrained Scenarios [0.0]
We present an efficient framework to solve the license plate detection and recognition tasks simultaneously.
It is a lightweight and unified deep neural network that can be optimized end-to-end and work in real-time.
Experimental results indicate that the proposed method significantly outperforms the previous state-of-the-art methods in both speed and precision.
arXiv Detail & Related papers (2020-08-25T09:51:33Z)
- A Self-Training Approach for Point-Supervised Object Detection and
Counting in Crowds [54.73161039445703]
We propose a novel self-training approach that enables a typical object detector to be trained with only point-level annotations.
During training, we utilize the available point annotations to supervise the estimation of the center points of objects.
Experimental results show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks.
arXiv Detail & Related papers (2020-07-25T02:14:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.