Fast, Accurate Barcode Detection in Ultra High-Resolution Images
- URL: http://arxiv.org/abs/2102.06868v1
- Date: Sat, 13 Feb 2021 05:59:59 GMT
- Title: Fast, Accurate Barcode Detection in Ultra High-Resolution Images
- Authors: Jerome Quenum, Kehan Wang, Avideh Zakhor
- Abstract summary: We propose using semantic segmentation to achieve a fast and accurate detection of barcodes in UHR images.
The end-to-end system has a latency of 16 milliseconds, which is $2.5\times$ faster than YOLOv4 and $5.9\times$ faster than Mask R-CNN.
In terms of accuracy, our method outperforms YOLOv4 and Mask R-CNN in $mAP$ by 5.5% and 47.1%, respectively.
- Score: 1.160208922584163
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Object detection in Ultra High-Resolution (UHR) images has long been a
challenging problem in computer vision due to the varying scales of the
targeted objects. When it comes to barcode detection, resizing UHR input images
to smaller sizes often leads to the loss of pertinent information, while
processing them directly is highly inefficient and computationally expensive.
In this paper, we propose using semantic segmentation to achieve a fast and
accurate detection of barcodes of various scales in UHR images. Our pipeline
involves a modified Region Proposal Network (RPN) on images of size greater
than 10k$\times$10k and a newly proposed Y-Net segmentation network, followed
by a post-processing workflow for fitting a bounding box around each segmented
barcode mask. The end-to-end system has a latency of 16 milliseconds, which is
$2.5\times$ faster than YOLOv4 and $5.9\times$ faster than Mask R-CNN. In terms
of accuracy, our method outperforms YOLOv4 and Mask R-CNN in $mAP$ by 5.5%
and 47.1%, respectively, on a synthetic dataset. We have made available the
generated synthetic barcode dataset and its code at
http://www.github.com/viplab/BSBD/.
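The abstract's last pipeline stage, fitting a bounding box around each segmented barcode mask, can be illustrated with a minimal sketch. The abstract does not specify how the authors do this, so everything below is an assumption made for illustration: the function name `mask_to_boxes`, the `min_area` filter, the use of 4-connected components, and axis-aligned (not rotated) boxes.

```python
from collections import deque

import numpy as np


def mask_to_boxes(mask, min_area=1):
    """Return (x_min, y_min, x_max, y_max) for each 4-connected blob in a binary mask.

    Hypothetical helper: an illustrative stand-in, not the paper's post-processing.
    """
    mask = np.asarray(mask, dtype=bool)
    height, width = mask.shape
    visited = np.zeros_like(mask, dtype=bool)
    boxes = []
    # np.nonzero scans in row-major order, so blobs are found top to bottom.
    for y0, x0 in zip(*np.nonzero(mask)):
        if visited[y0, x0]:
            continue
        # Breadth-first flood fill over one connected component.
        queue = deque([(y0, x0)])
        visited[y0, x0] = True
        x_min = x_max = x0
        y_min = y_max = y0
        area = 0
        while queue:
            y, x = queue.popleft()
            area += 1
            x_min, x_max = min(x_min, x), max(x_max, x)
            y_min, y_max = min(y_min, y), max(y_max, y)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and mask[ny, nx] and not visited[ny, nx]:
                    visited[ny, nx] = True
                    queue.append((ny, nx))
        # Assumed noise filter: drop blobs smaller than min_area pixels.
        if area >= min_area:
            boxes.append((int(x_min), int(y_min), int(x_max), int(y_max)))
    return boxes


# Toy segmentation output with two separate "barcode" blobs.
demo_mask = np.zeros((6, 8), dtype=np.uint8)
demo_mask[1:3, 1:4] = 1
demo_mask[4:6, 5:8] = 1
demo_boxes = mask_to_boxes(demo_mask)
```

Box coordinates here are inclusive pixel indices. A production pipeline would more likely rely on an optimized connected-components routine, and rotated boxes may suit tilted barcodes better; both choices are outside what the abstract states.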
Related papers
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
- Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems [13.225654514930595]
Multi-Resolution Rescored Byte-Track (MR2-ByteTrack) is a novel video object detection framework for ultra-low-power embedded processors.
MR2-ByteTrack reduces the average compute load of an off-the-shelf Deep Neural Network based object detector by up to 2.25$\times$.
We demonstrate an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller.
arXiv Detail & Related papers (2024-04-17T15:45:49Z) - Rapid-INR: Storage Efficient CPU-free DNN Training Using Implicit Neural Representation [7.539498729072623]
Implicit Neural Representation (INR) is an innovative approach for representing complex shapes or objects without explicitly defining their geometry or surface structure.
Previous research has demonstrated the effectiveness of using neural networks as INR for image compression, showcasing comparable performance to traditional methods such as JPEG.
This paper introduces Rapid-INR, a novel approach that utilizes INR for encoding and compressing images, thereby accelerating neural network training in computer vision tasks.
arXiv Detail & Related papers (2023-06-29T05:49:07Z) - Building Flyweight FLIM-based CNNs with Adaptive Decoding for Object
Detection [40.97322222472642]
This work presents a method to build a Convolutional Neural Network (CNN) layer by layer for object detection from user-drawn markers.
We address the detection of Schistosomiasis mansoni eggs in microscopy images of fecal samples, and the detection of ships in satellite images.
Our CNN weighs thousands of times less than SOTA object detectors, being suitable for CPU execution and showing superior or equivalent performance to three methods in five measures.
arXiv Detail & Related papers (2023-06-26T16:48:20Z) - {\mu}Split: efficient image decomposition for microscopy data [50.794670705085835]
muSplit is a dedicated approach for trained image decomposition in the context of fluorescence microscopy images.
We introduce lateral contextualization (LC), a novel meta-architecture that enables the memory efficient incorporation of large image-context.
We apply muSplit to five decomposition tasks, one on a synthetic dataset, four others derived from real microscopy data.
arXiv Detail & Related papers (2022-11-23T11:26:24Z) - SuperYOLO: Super Resolution Assisted Object Detection in Multimodal
Remote Sensing Imagery [36.216230299131404]
We propose SuperYOLO, which fuses multimodal data and performs high-resolution (HR) object detection on multiscale objects.
Our proposed model shows a favorable accuracy and speed tradeoff compared to the state-of-the-art models.
arXiv Detail & Related papers (2022-09-27T12:58:58Z) - Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z) - CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2020-12-26T15:00:10Z) - Real-Time Resource Allocation for Tracking Systems [54.802447204921634]
We propose a new algorithm called emphPartiMax that greatly reduces this cost by applying the person detector only to the relevant parts of the image.
PartiMax exploits information in the particle filter to select $k$ of the $n$ candidate emphpixel boxes in the image.
We show that our system runs in real-time by processing only 10% of the pixel boxes in the image while still retaining 80% of the original tracking performance achieved when processing all pixel boxes.
arXiv Detail & Related papers (2020-09-21T08:29:05Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z) - R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)