Tracking-Assisted Object Detection with Event Cameras
- URL: http://arxiv.org/abs/2403.18330v3
- Date: Wed, 18 Sep 2024 04:54:28 GMT
- Title: Tracking-Assisted Object Detection with Event Cameras
- Authors: Ting-Kang Yen, Igor Morawski, Shusil Dangi, Kai He, Chung-Yi Lin, Jia-Fong Yeh, Hung-Ting Su, Winston Hsu
- Abstract summary: Event-based object detection has recently garnered attention in the computer vision community.
However, feature asynchronism and sparsity make objects invisible when they have no motion relative to the camera.
In this paper, we consider those invisible objects as pseudo-occluded objects.
We exploit tracking strategies for pseudo-occluded objects to maintain their permanence and retain their bounding boxes.
- Score: 16.408606403997005
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Event-based object detection has recently garnered attention in the computer vision community due to the exceptional properties of event cameras, such as high dynamic range and no motion blur. However, feature asynchronism and sparsity make objects invisible when they have no motion relative to the camera, posing a significant challenge in the task. Prior works have studied various implicit-learned memories to retain as many temporal cues as possible. However, implicit memories still struggle to preserve long-term features effectively. In this paper, we consider those invisible objects as pseudo-occluded objects and aim to detect them by tracking through occlusions. Firstly, we introduce the visibility attribute of objects and contribute an auto-labeling algorithm to not only clean the existing event camera dataset but also append additional visibility labels to it. Secondly, we exploit tracking strategies for pseudo-occluded objects to maintain their permanence and retain their bounding boxes, even when features have not been available for a very long time. These strategies can be treated as an explicit-learned memory guided by the tracking objective to record the displacements of objects across frames. Lastly, we propose a spatio-temporal feature aggregation module to enrich the latent features and a consistency loss to increase the robustness of the overall pipeline. We conduct comprehensive experiments to verify our method's effectiveness in a setting where still objects are retained but truly occluded objects are discarded. The results demonstrate that (1) the additional visibility labels can assist in supervised training, and (2) our method outperforms state-of-the-art approaches with a significant improvement of 7.9% absolute mAP.
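The idea of retaining a bounding box while an object produces no features can be illustrated with a small sketch. This is not the authors' method: the abstract describes a learned, tracking-guided memory, whereas the code below swaps in a hand-written IoU matcher with a simple timeout. All names (`Track`, `update`, `iou_thr`, `max_missed`) are hypothetical, and the sketch makes no attempt to tell still objects from truly occluded ones.

```python
from dataclasses import dataclass

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    box: Box
    missed: int = 0  # consecutive frames without a matching detection


def update(tracks: list[Track], detections: list[Box],
           iou_thr: float = 0.3, max_missed: int = 50) -> list[Track]:
    """Greedy matching; unmatched tracks keep their last box (assumed still)."""
    unmatched = list(detections)
    kept: list[Track] = []
    for t in tracks:
        best = max(unmatched, key=lambda d: iou(t.box, d), default=None)
        if best is not None and iou(t.box, best) >= iou_thr:
            unmatched.remove(best)
            kept.append(Track(best, 0))  # refreshed by a new detection
        elif t.missed + 1 <= max_missed:
            kept.append(Track(t.box, t.missed + 1))  # retain box without evidence
        # otherwise the track has been unseen for too long and is dropped
    kept.extend(Track(d) for d in unmatched)  # start tracks for new detections
    return kept


tracks = update([], [(0.0, 0.0, 10.0, 10.0)])  # object seen once
tracks = update(tracks, [])                    # object stops moving, no detection
```

After the second call the track list still holds the original box with `missed == 1`, which is the behaviour the abstract calls maintaining permanence. A fixed timeout is a crude stand-in for the paper's learned visibility attribute.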
Related papers
- Leveraging Object Priors for Point Tracking [25.030407197192]
Point tracking is a fundamental problem in computer vision with numerous applications in AR and robotics.
We propose a novel objectness regularization approach that guides points to be aware of object priors.
Our approach achieves state-of-the-art performance on three point tracking benchmarks.
arXiv Detail & Related papers (2024-09-09T16:48:42Z)
- Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images [11.217630579076237]
Few-shot object detection (FSOD) has garnered significant research attention in the field of remote sensing.
We propose a novel FSOD method for remote sensing images called Few-shot Oriented object detection with Memorable Contrastive learning (FOMC).
Specifically, we employ oriented bounding boxes instead of traditional horizontal bounding boxes to learn a better feature representation for arbitrary-oriented aerial objects.
arXiv Detail & Related papers (2024-03-20T08:15:18Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Object-centric Cross-modal Feature Distillation for Event-based Object Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to sparsity of the event data and missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation significantly improves the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- Tackling Background Distraction in Video Object Segmentation [7.187425003801958]
Video object segmentation (VOS) aims to densely track certain objects in videos.
One of the main challenges in this task is the existence of background distractors that appear similar to the target objects.
We propose three novel strategies to suppress such distractors.
Our model achieves a comparable performance to contemporary state-of-the-art approaches, even with real-time performance.
arXiv Detail & Related papers (2022-07-14T14:25:19Z)
- Object Permanence Emerges in a Random Walk along Memory [37.78331373391444]
We show that object permanence can emerge by optimizing for temporal coherence of memory.
This leads to a memory representation that stores occluded objects and predicts their motion, to better localize them.
The resulting model outperforms existing approaches on several datasets of increasing complexity and realism.
arXiv Detail & Related papers (2022-04-04T18:28:24Z)
- Learning to Track Object Position through Occlusion [32.458623495840904]
Occlusion is one of the most significant challenges encountered by object detectors and trackers.
We propose a tracking-by-detection approach that builds upon the success of region based video object detectors.
Our approach achieves superior results on a dataset of furniture assembly videos collected from the internet.
arXiv Detail & Related papers (2021-06-20T22:29:46Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)