Patch-based Selection and Refinement for Early Object Detection
- URL: http://arxiv.org/abs/2311.02274v1
- Date: Fri, 3 Nov 2023 23:41:13 GMT
- Title: Patch-based Selection and Refinement for Early Object Detection
- Authors: Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo, Baoxin Li, Jae-Sun Seo, Yu Cao
- Abstract summary: We propose a novel set of algorithms that divide the image into patches, select patches with objects at various scales, elaborate the details of a small object, and detect it as early as possible.
Our approach is built upon a transformer-based network and integrates the diffusion model to improve the detection accuracy.
Our algorithms enhance the mAP for small objects from 1.03 to 8.93, and reduce the data volume in computation by more than 77%.
- Score: 20.838511460733038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Early object detection (OD) is a crucial task for the safety of many dynamic
systems. Current OD algorithms have limited success for small objects at a long
distance. To improve the accuracy and efficiency of such a task, we propose a
novel set of algorithms that divide the image into patches, select patches with
objects at various scales, elaborate the details of a small object, and detect
it as early as possible. Our approach is built upon a transformer-based network
and integrates the diffusion model to improve the detection accuracy. As
demonstrated on BDD100K, our algorithms enhance the mAP for small objects from
1.03 to 8.93, and reduce the data volume in computation by more than 77%. The
source code is available at
https://github.com/destiny301/dpr
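The patch division and selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the patch size, the per-patch scores, and the threshold are all hypothetical, and the refinement and detection stages (the transformer and diffusion components) are left out entirely. The sketch only shows how processing a subset of patches translates into a reduction in data volume.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right) in pixels


def split_into_patches(height: int, width: int, patch: int) -> List[Box]:
    """Tile an image of the given size with square patches; edge patches are clipped."""
    boxes = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            boxes.append((top, left, min(top + patch, height), min(left + patch, width)))
    return boxes


def select_patches(boxes: Sequence[Box], scores: Sequence[float], threshold: float) -> List[Box]:
    """Keep patches whose score reaches the threshold.

    The scores are a stand-in (assumption) for whatever a patch selector would predict.
    """
    return [box for box, score in zip(boxes, scores) if score >= threshold]


def box_area(box: Box) -> int:
    """Number of pixels covered by a box."""
    top, left, bottom, right = box
    return (bottom - top) * (right - left)


def data_reduction(boxes: Sequence[Box], kept: Sequence[Box]) -> float:
    """Fraction of pixels that are skipped once only the kept patches are processed."""
    total = sum(box_area(b) for b in boxes)
    return 1.0 - sum(box_area(b) for b in kept) / total


if __name__ == "__main__":
    # Illustrative numbers only: a 720x1280 frame and 80-pixel patches.
    boxes = split_into_patches(720, 1280, 80)
    # Hypothetical scores: pretend the first 32 patches contain objects.
    scores = [1.0 if i < 32 else 0.0 for i in range(len(boxes))]
    kept = select_patches(boxes, scores, threshold=0.5)
    print(len(boxes), len(kept), round(data_reduction(boxes, kept), 4))
```

With these made-up inputs, 32 of 144 patches are kept, so roughly 78% of the pixels are skipped. The figure reported in the abstract comes from the authors' experiments on BDD100K, not from this toy calculation.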
Related papers
- ESOD: Efficient Small Object Detection on High-Resolution Images [36.80623357577051]
Small objects are usually sparsely distributed and locally clustered.
Massive feature extraction computations are wasted on the non-target background area of images.
We propose to reuse the detector's backbone to conduct feature-level object-seeking and patch-slicing.
arXiv Detail & Related papers (2024-07-23T12:21:23Z)
- Learning to Make Keypoints Sub-Pixel Accurate [80.55676599677824]
This work addresses the challenge of sub-pixel accuracy in detecting 2D local features.
We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features.
arXiv Detail & Related papers (2024-07-16T12:39:56Z)
- SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector [60.42293239557962]
We propose SpirDet, a novel approach for efficient detection of infrared small targets.
We employ a new dual-branch sparse decoder to restore the feature map.
Extensive experiments show that the proposed SpirDet significantly outperforms state-of-the-art models.
arXiv Detail & Related papers (2024-02-08T05:06:14Z)
- Small Object Detection by DETR via Information Augmentation and Adaptive Feature Fusion [4.9860018132769985]
The RT-DETR model performs well in real-time object detection, but its accuracy on small objects is poor.
We propose an adaptive feature fusion algorithm that assigns learnable parameters to each feature map from different levels.
This enhances the model's ability to capture object features at different scales, thereby improving the accuracy of detecting small objects.
arXiv Detail & Related papers (2024-01-16T00:01:23Z)
- SalienDet: A Saliency-based Feature Enhancement Algorithm for Object Detection for Autonomous Driving [160.57870373052577]
We propose a saliency-based OD algorithm (SalienDet) to detect unknown objects.
Our SalienDet utilizes a saliency-based algorithm to enhance image features for object proposal generation.
We design a dataset relabeling approach that differentiates the unknown objects from all objects in the training sample set to achieve Open-World Detection.
arXiv Detail & Related papers (2023-05-11T16:19:44Z)
- Fewer is More: Efficient Object Detection in Large Aerial Images [59.683235514193505]
This paper presents an Objectness Activation Network (OAN) to help detectors focus on fewer patches but achieve more efficient inference and more accurate results.
Using OAN, all five detectors acquire more than 30.0% speed-up on three large-scale aerial image datasets.
We extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively.
arXiv Detail & Related papers (2022-12-26T12:49:47Z)
- Embracing Single Stride 3D Object Detector with Sparse Transformer [63.179720817019096]
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases.
Many 3D detectors directly follow the common practice of 2D detectors, which downsample the feature maps even after quantizing the point clouds.
We propose Single-stride Sparse Transformer (SST) to maintain the original resolution from the beginning to the end of the network.
arXiv Detail & Related papers (2021-12-13T02:12:02Z)
- Small Object Detection Based on Modified FSSD and Model Compression [7.387639662781843]
This paper proposes a small object detection algorithm based on FSSD.
In order to reduce the computational cost and storage space, pruning is carried out to achieve model compression.
The average accuracy (mAP) of the algorithm can reach 80.4% on PASCAL VOC and the speed is 59.5 FPS on GTX1080ti.
arXiv Detail & Related papers (2021-08-24T03:20:32Z)
- Multi-patch Feature Pyramid Network for Weakly Supervised Object Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net).
MPFP-Net differs from current models, which pursue only the most discriminative patches during training.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computation by separating objects from the background using a very light first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- AmphibianDetector: adaptive computation for moving objects detection [0.913755431537592]
We propose an approach to object detection which makes it possible to reduce the number of false-positive detections.
The proposed approach is a modification of a CNN already trained for the object detection task.
The efficiency of the proposed approach was demonstrated on the open dataset "CDNet2014 pedestrian".
arXiv Detail & Related papers (2020-11-15T12:37:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.