Improving Object Detection, Multi-object Tracking, and Re-Identification
for Disaster Response Drones
- URL: http://arxiv.org/abs/2201.01494v1
- Date: Wed, 5 Jan 2022 07:56:58 GMT
- Title: Improving Object Detection, Multi-object Tracking, and Re-Identification
for Disaster Response Drones
- Authors: Chongkeun Paik, Hyunwoo J. Kim
- Abstract summary: We aim to detect and identify multiple objects using multiple cameras and computer vision for disaster response drones.
Two simple approaches are proposed to solve these issues.
One is a fast multi-camera system with added tracklet association, and the other incorporates a high-performance detector and tracker to resolve restrictions.
- Score: 11.84256047381657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We aim to detect and identify multiple objects using multiple cameras and
computer vision for disaster response drones. The major challenges are taming
detection errors, resolving ID switching and fragmentation, and adapting to
multi-scale features and multiple views with global camera motion. Two simple
approaches are proposed to solve these issues. One is a fast multi-camera
system with added tracklet association, and the other incorporates a
high-performance detector and tracker to resolve restrictions. (...) The
accuracy of our first approach (85.71%) is slightly higher than that of our
baseline, FairMOT (85.44%), on the validation dataset. In the final results,
calculated from L2-norm error, the baseline scored 48.1 and the proposed
model combination scored 34.9, a 27.4% reduction in error.
In the second approach, although DeepSORT only processes a quarter of
all frames due to hardware and time limitations, our model with DeepSORT
(42.9%) outperforms FairMOT (71.4%) in terms of recall. Our two models
ranked second and third in the 'AI Grand Challenge' organized by the
Korean Ministry of Science and ICT in 2020 and 2021, respectively. The source
code is publicly available at these URLs
(github.com/mlvlab/drone_ai_challenge, github.com/mlvlab/Drone_Task1,
github.com/mlvlab/Rony2_task3, github.com/mlvlab/Drone_task4).
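The error reduction quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration, not code from the authors' repositories; the function name `relative_reduction` and the variable names are assumptions introduced here, and the only inputs are the figures stated in the abstract (L2-norm error of 48.1 for the baseline and 34.9 for the proposed combination).

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Return how much lower `proposed` is than `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - proposed) / baseline * 100.0


# Figures taken from the abstract (L2-norm error, lower is better).
baseline_l2_error = 48.1   # FairMOT baseline
proposed_l2_error = 34.9   # proposed model combination

reduction = relative_reduction(baseline_l2_error, proposed_l2_error)
print(f"Relative error reduction: {reduction:.1f}%")
```

Note that this is a relative reduction (13.2 / 48.1), not a difference in percentage points.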
Related papers
- DroBoost: An Intelligent Score and Model Boosting Method for Drone Detection [1.2564343689544843]
Drone detection is a challenging object detection task where visibility conditions and quality of the images may be unfavorable.
Our work improves on the previous approach by combining several improvements.
The proposed technique won 1st Place in the Drone vs. Bird Challenge.
arXiv Detail & Related papers (2024-06-30T20:49:56Z)
- Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems [13.225654514930595]
Multi-Resolution Rescored Byte-Track (MR2-ByteTrack) is a novel video object detection framework for ultra-low-power embedded processors.
MR2-ByteTrack reduces the average compute load of an off-the-shelf Deep Neural Network based object detector by up to 2.25×.
We demonstrate an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller.
arXiv Detail & Related papers (2024-04-17T15:45:49Z)
- Single-Model and Any-Modality for Video Object Tracking [85.83753760853142]
We introduce Un-Track, a Unified Tracker of a single set of parameters for any modality.
To handle any modality, our method learns their common latent space through low-rank factorization and reconstruction techniques.
Our Un-Track achieves a +8.1 absolute F-score gain on the DepthTrack dataset while adding only 2.14 GFLOPs (over 21.50) and 6.6M parameters (over 93M).
arXiv Detail & Related papers (2023-11-27T14:17:41Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- Improving CNN-based Person Re-identification using score Normalization [2.462953128215087]
This paper proposes a novel approach for PRe-ID, which combines a CNN-based feature extraction method with Cross-view Quadratic Discriminant Analysis (XQDA) for metric learning.
The proposed approach is tested on four challenging datasets: VIPeR, GRID, CUHK01, and PRID450S.
arXiv Detail & Related papers (2023-07-01T18:12:27Z)
- ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box [81.45219802386444]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames.
We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes.
In 3D scenarios, it is much easier for the tracker to predict object velocities in world coordinates.
arXiv Detail & Related papers (2023-03-27T15:35:21Z)
- Improving Domain Generalization by Learning without Forgetting: Application in Retail Checkout [0.0]
This paper addresses the problem by proposing a method with a two-stage pipeline.
The first stage detects class-agnostic items, and the second classifies product categories.
The method is evaluated on the AI City Challenge 2022, Track 4, and achieves an F1 score of 40% on test set A.
arXiv Detail & Related papers (2022-07-12T09:35:28Z)
- CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs [41.43273142203345]
We harness the flexibility of FPGAs to develop a novel object detection pipeline with deformable convolutions.
With our high-efficiency implementation, our solution reaches 26.9 frames per second with a tiny model size of 0.76 MB.
Our model reaches 67.1 AP50 on Pascal VOC with only 2.9 MB of parameters, making it 20.9x smaller but 10% more accurate than Tiny-YOLO.
arXiv Detail & Related papers (2020-06-12T17:56:47Z)
- Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern Object Detectors [47.64219291655723]
We employ 2 state-of-the-art object detection benchmarks, and analyze more than 15 models over 4 large scale datasets.
We find that models generate a lot of boxes on empty regions and that context is more important for detecting small objects than larger ones.
arXiv Detail & Related papers (2020-04-05T06:19:43Z)
- FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking [92.48078680697311]
Multi-object tracking (MOT) is an important problem in computer vision.
We present a simple yet effective approach, termed FairMOT, based on the anchor-free object detection architecture CenterNet.
The approach achieves high accuracy for both detection and tracking.
arXiv Detail & Related papers (2020-04-04T08:18:00Z)
- Detection in Crowded Scenes: One Proposal, Multiple Predictions [79.28850977968833]
We propose a proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.
The key to our approach is to let each proposal predict a set of correlated instances rather than a single one, as in previous proposal-based frameworks.
Our detector obtains a 4.9% AP gain on the challenging CrowdHuman dataset and a 1.0% MR⁻² improvement on the CityPersons dataset.
arXiv Detail & Related papers (2020-03-20T09:48:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.