Multi-domain Collaborative Feature Representation for Robust Visual
Object Tracking
- URL: http://arxiv.org/abs/2108.04521v1
- Date: Tue, 10 Aug 2021 09:01:42 GMT
- Title: Multi-domain Collaborative Feature Representation for Robust Visual
Object Tracking
- Authors: Jiqing Zhang and Kai Zhao and Bo Dong and Yingkai Fu and Yuxin Wang
and Xin Yang and Baocai Yin
- Abstract summary: This paper focuses on effectively representing and utilizing complementary features from the frame domain and event domain.
To learn the unique features of the two domains, we utilize a Unique Extractor for Event (UEE) based on Spiking Neural Networks.
Experiments on a standard RGB benchmark and a real event tracking dataset demonstrate the effectiveness of the proposed approach.
- Score: 32.760681454334765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Jointly exploiting different yet complementary domain information
has proven to be an effective way to perform robust object tracking. This
paper focuses on effectively representing and utilizing complementary features
from the frame domain and the event domain to boost object tracking performance
in challenging scenarios. Specifically, we propose a Common Features Extractor
(CFE) to learn potential common representations from the RGB domain and event
domain. To learn the unique features of the two domains, we utilize a
Unique Extractor for Event (UEE) based on Spiking Neural Networks to extract
edge cues in the event domain that may be missed in RGB under some challenging
conditions, and a Unique Extractor for RGB (UER) based on Deep Convolutional
Neural Networks to extract texture and semantic information in the RGB domain.
Extensive experiments on a standard RGB benchmark and a real event tracking dataset
demonstrate the effectiveness of the proposed approach. We show that our approach
outperforms all compared state-of-the-art tracking algorithms and verify that
event-based data is a powerful cue for tracking in challenging scenes.
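The three-branch design described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch, assuming rate-coded spiking dynamics and fusion by concatenation; the module names (CFE, UEE, UER) follow the abstract, while every layer size, the toy LIF neuron, the two-channel polarity input, and the response head are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch of the CFE/UEE/UER branches; sizes and dynamics are assumed.
import torch
import torch.nn as nn

class LIF(nn.Module):
    """Toy leaky integrate-and-fire layer over a time dimension.
    Forward-only illustration: a real SNN would need surrogate
    gradients through the spike threshold to be trainable."""
    def __init__(self, tau: float = 2.0, v_th: float = 1.0):
        super().__init__()
        self.tau, self.v_th = tau, v_th

    def forward(self, x):                       # x: (T, B, C, H, W)
        v, spikes = torch.zeros_like(x[0]), []
        for t in range(x.shape[0]):
            v = v + (x[t] - v) / self.tau       # leaky integration
            s = (v >= self.v_th).float()        # fire on threshold
            v = v * (1.0 - s)                   # hard reset after a spike
            spikes.append(s)
        return torch.stack(spikes).mean(0)      # rate-coded feature map

class ThreeBranchTracker(nn.Module):
    def __init__(self, c: int = 64):
        super().__init__()
        # CFE: shared weights applied to both modalities for common cues.
        self.cfe = nn.Sequential(nn.Conv2d(3, c, 3, padding=1), nn.ReLU())
        # UER: conventional CNN branch for RGB texture/semantics.
        self.uer = nn.Sequential(nn.Conv2d(3, c, 3, padding=1), nn.ReLU())
        # UEE: per-timestep conv followed by spiking dynamics for edge cues.
        self.uee_conv = nn.Conv2d(2, c, 3, padding=1)   # 2 event polarities
        self.uee_lif = LIF()
        self.head = nn.Conv2d(4 * c, 1, 1)              # toy response map

    def forward(self, rgb, events, event_frame):
        # rgb: (B,3,H,W); events: (T,B,2,H,W); event_frame: (B,3,H,W)
        common = torch.cat([self.cfe(rgb), self.cfe(event_frame)], dim=1)
        unique_rgb = self.uer(rgb)
        unique_evt = self.uee_lif(
            torch.stack([self.uee_conv(e) for e in events]))
        return self.head(torch.cat([common, unique_rgb, unique_evt], dim=1))
```

In the paper's framing, the shared CFE captures what both modalities agree on, while UEE and UER retain modality-specific cues before fusion.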
Related papers
- How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? [15.550663626482903]
We investigate the efficacy of data augmentations to close the domain gap in spaceborne computer vision.
We propose two novel data augmentations specifically developed to emulate the visual effects observed in orbital imagery.
arXiv Detail & Related papers (2024-10-21T08:24:46Z)
- Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
- TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking [30.89375068036783]
Existing approaches perform event feature extraction for RGB-E tracking using traditional appearance models.
We propose an Event backbone (Pooler) to obtain a high-quality feature representation that is cognisant of the intrinsic characteristics of the event data.
Our method significantly outperforms state-of-the-art trackers on two widely used RGB-E tracking datasets.
arXiv Detail & Related papers (2024-05-08T12:19:08Z)
- Segment Any Events via Weighted Adaptation of Pivotal Tokens [85.39087004253163]
This paper focuses on the nuanced challenge of tailoring the Segment Anything Models (SAMs) for integration with event data.
We introduce a multi-scale feature distillation methodology to optimize the alignment of token embeddings originating from event data with their RGB image counterparts.
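As a rough illustration of the summarized idea (not the paper's actual recipe), a multi-scale token-distillation objective can be written as a weighted sum of per-scale alignment losses between event-token and RGB-token embeddings; the MSE criterion, the frozen RGB teacher, and the uniform weights below are assumptions.

```python
# Hedged sketch of multi-scale token distillation; all choices assumed.
import torch
import torch.nn.functional as F

def multiscale_distill_loss(event_tokens, rgb_tokens, weights=None):
    """event_tokens / rgb_tokens: lists of (B, N_i, D) tensors, one per scale."""
    weights = weights or [1.0] * len(event_tokens)
    loss = event_tokens[0].new_zeros(())
    for w, e, r in zip(weights, event_tokens, rgb_tokens):
        # Pull event tokens toward RGB tokens; RGB side acts as a frozen teacher.
        loss = loss + w * F.mse_loss(e, r.detach())
    return loss / sum(weights)
```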
arXiv Detail & Related papers (2023-12-24T12:47:08Z)
- Implicit Event-RGBD Neural SLAM [54.74363487009845]
Implicit neural SLAM has achieved remarkable progress recently.
Existing methods face significant challenges in non-ideal scenarios.
We propose EN-SLAM, the first event-RGBD implicit neural SLAM framework.
arXiv Detail & Related papers (2023-11-18T08:48:58Z)
- SPADES: A Realistic Spacecraft Pose Estimation Dataset using Event Sensing [9.583223655096077]
Due to limited access to real target datasets, algorithms are often trained using synthetic data and applied in the real domain.
Event sensing has been explored in the past and shown to reduce the domain gap between simulations and real-world scenarios.
We introduce a novel dataset, SPADES, comprising real event data acquired in a controlled laboratory environment and simulated event data using the same camera intrinsics.
arXiv Detail & Related papers (2023-11-09T12:14:47Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric [53.88188265943762]
We propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack), which achieves the above functions simultaneously.
Our proposed CEUTrack is simple, effective, and efficient, which achieves over 75 FPS and new SOTA performance.
arXiv Detail & Related papers (2022-11-20T16:01:31Z)
- Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
- DA4Event: towards bridging the Sim-to-Real Gap for Event Cameras using Domain Adaptation [22.804074390795734]
Event cameras capture pixel-level intensity changes in the form of "events".
The novelty of these sensors results in the lack of a large amount of training data capable of unlocking their potential.
We propose a novel architecture that better exploits the peculiarities of frame-based event representations (a toy frame-conversion sketch follows this entry).
arXiv Detail & Related papers (2021-03-23T18:09:20Z)
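For readers unfamiliar with the frame-based event representations mentioned above, the sketch below converts a raw event stream (x, y, timestamp, polarity tuples) into a simple two-channel polarity-count frame. This particular binning scheme is a common generic choice and an assumption here, not the representation used by DA4Event, which works with richer variants such as voxel grids.

```python
# Toy conversion of raw events into a 2-channel per-polarity count frame;
# the representation choice is an assumption for illustration only.
import numpy as np

def events_to_frame(events: np.ndarray, height: int, width: int) -> np.ndarray:
    """events: (N, 4) array of (x, y, t, polarity) with polarity in {-1, +1}."""
    frame = np.zeros((2, height, width), dtype=np.float32)
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    pol = (events[:, 3] > 0).astype(int)      # channel 0: negative, 1: positive
    # Accumulate per-pixel event counts into the matching polarity channel.
    np.add.at(frame, (pol, ys, xs), 1.0)
    return frame
```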
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.