Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K
Videos using a Joint Detection-and-Tracking Approach
- URL: http://arxiv.org/abs/2105.08253v1
- Date: Tue, 18 May 2021 03:22:03 GMT
- Title: Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K
Videos using a Joint Detection-and-Tracking Approach
- Authors: Ryota Yoshihashi, Rei Kawakami, Shaodi You, Tu Tuan Trinh, Makoto
Iida, Takeshi Naemura
- Abstract summary: We present a neural network model called the Recurrent Correlational Network, where detection and tracking are jointly performed.
In experiments with datasets containing images of scenes with small flying objects, such as birds and unmanned aerial vehicles, the proposed method yielded consistent improvements in detection performance.
Our network performed as well as state-of-the-art generic object trackers when evaluated as a tracker on a bird image dataset.
- Score: 19.59528430884104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting tiny objects in a high-resolution video is challenging because the
visual information is scarce and unreliable. Specifically, the challenges
include the very low resolution of the objects, MPEG compression artifacts,
and a large search area with many hard negatives. Tracking is equally
difficult because both appearance and motion estimation are unreliable.
Fortunately, we found that combining these two challenging tasks
yields mutual benefits. Following this idea, in this paper, we
present a neural network model called the Recurrent Correlational Network,
where detection and tracking are jointly performed over a multi-frame
representation learned through a single, trainable, and end-to-end network. The
framework exploits a convolutional long short-term memory network for learning
informative appearance changes for detection, while the learned representation
is shared in tracking for enhancing its performance. In experiments with
datasets containing images of scenes with small flying objects, such as birds
and unmanned aerial vehicles, the proposed method yielded consistent
improvements in detection performance over deep single-frame detectors and
existing motion-based detectors. Furthermore, our network performed as well as
state-of-the-art generic object trackers when evaluated as a tracker on
a bird image dataset.
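The abstract mentions a convolutional long short-term memory (ConvLSTM) network that accumulates appearance changes over multiple frames. The following is a minimal sketch of one ConvLSTM update in a common peephole-free formulation, not the authors' Recurrent Correlational Network. It assumes single-channel feature maps stored as nested lists, and the names `conv2d_same`, `convlstm_step`, `run_sequence`, and the `params` layout are hypothetical, chosen only for illustration.

```python
import math


def conv2d_same(image, kernel):
    """Zero-padded 'same' 2D cross-correlation for an odd-sized kernel."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    pad_r, pad_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(k_rows):
                for dc in range(k_cols):
                    y, x = r + dr - pad_r, c + dc - pad_c
                    if 0 <= y < rows and 0 <= x < cols:
                        acc += image[y][x] * kernel[dr][dc]
            out[r][c] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(x, h_prev, c_prev, params):
    """One ConvLSTM update on single-channel maps.

    params maps each gate name ('i', 'f', 'o', 'g') to a tuple
    (kernel_for_input, kernel_for_hidden, bias). This layout is an
    assumption made for the sketch.
    """
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate, (k_x, k_h, bias) in params.items():
        a_x = conv2d_same(x, k_x)
        a_h = conv2d_same(h_prev, k_h)
        pre[gate] = [[a_x[r][c] + a_h[r][c] + bias for c in range(cols)]
                     for r in range(rows)]
    h_new = [[0.0] * cols for _ in range(rows)]
    c_new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i_gate = sigmoid(pre['i'][r][c])   # input gate
            f_gate = sigmoid(pre['f'][r][c])   # forget gate
            o_gate = sigmoid(pre['o'][r][c])   # output gate
            g_cand = math.tanh(pre['g'][r][c])  # candidate cell update
            c_new[r][c] = f_gate * c_prev[r][c] + i_gate * g_cand
            h_new[r][c] = o_gate * math.tanh(c_new[r][c])
    return h_new, c_new


def run_sequence(frames, params):
    """Feed frames in order, starting from zero hidden and cell states."""
    rows, cols = len(frames[0]), len(frames[0][0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    for frame in frames:
        h, c = convlstm_step(frame, h, c, params)
    return h, c
```

Because the gates are computed by convolutions, the hidden state keeps the spatial layout of the input frames, which is what lets a multi-frame representation be read out per location. A real model would use many channels, learned kernels, and a tensor library rather than nested lists.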
Related papers
- Visible and Clear: Finding Tiny Objects in Difference Map [50.54061010335082]
We introduce a self-reconstruction mechanism in the detection model, and discover the strong correlation between it and the tiny objects.
Specifically, we impose a reconstruction head in-between the neck of a detector, constructing a difference map of the reconstructed image and the input, which shows high sensitivity to tiny objects.
We further develop a Difference Map Guided Feature Enhancement (DGFE) module to make the tiny feature representation more clear.
arXiv Detail & Related papers (2024-05-18T12:22:26Z)
- Lifting Multi-View Detection and Tracking to the Bird's Eye View [5.679775668038154]
Recent advancements in multi-view detection and 3D object recognition have significantly improved performance.
We compare modern lifting methods, both parameter-free and parameterized, to multi-view aggregation.
We present an architecture that aggregates the features of multiple time steps to learn robust detection.
arXiv Detail & Related papers (2024-03-19T09:33:07Z)
- Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method achieves significant improvements on the MOT17 and MOT20 datasets while reaching state-of-the-art performance on the DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z)
- ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box [81.45219802386444]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames.
We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes.
In 3D scenarios, it is much easier for the tracker to predict object velocities in the world coordinate.
arXiv Detail & Related papers (2023-03-27T15:35:21Z)
- Minkowski Tracker: A Sparse Spatio-Temporal R-CNN for Joint Object Detection and Tracking [53.64390261936975]
We present Minkowski Tracker, a sparse spatio-temporal R-CNN that jointly solves object detection and tracking problems.
Inspired by region-based CNN (R-CNN), we propose to track motion as a second stage of the object detector R-CNN.
We show in large-scale experiments that the overall performance gain of our method is due to four factors.
arXiv Detail & Related papers (2022-08-22T04:47:40Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Improved detection of small objects in road network sequences [0.0]
We propose a new procedure for detecting small-scale objects by applying super-resolution processes based on detections performed by convolutional neural networks.
The procedure was tested on a set of traffic images containing elements of different scales, to assess its efficiency based on the detections obtained by the model.
arXiv Detail & Related papers (2021-05-18T10:13:23Z)
- Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme [70.45901040613015]
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z)
- Concurrent Segmentation and Object Detection CNNs for Aircraft Detection and Identification in Satellite Images [0.0]
We present a dedicated method to detect and identify aircraft, combining two very different convolutional neural networks (CNNs).
The results we present show that this combination significantly outperforms each individual model, drastically reducing the false negative rate.
arXiv Detail & Related papers (2020-05-27T07:35:55Z)
- Joint Detection and Tracking in Videos with Identification Features [36.55599286568541]
We propose the first joint optimization of detection, tracking and re-identification features for videos.
Our method reaches the state of the art on MOT; it ranks 1st in the UA-DETRAC'18 tracking challenge among online trackers, and 3rd overall.
arXiv Detail & Related papers (2020-05-21T21:06:40Z)
- Plug & Play Convolutional Regression Tracker for Video Object Detection [37.47222104272429]
Video object detection aims to simultaneously localize the bounding boxes of the objects and identify their classes in a given video.
One challenge for video object detection is to consistently detect all objects across the whole video.
We propose a Plug & Play scale-adaptive convolutional regression tracker for the video object detection task.
arXiv Detail & Related papers (2020-03-02T15:57:55Z)