HHTrack: Hyperspectral Object Tracking Using Hybrid Attention
- URL: http://arxiv.org/abs/2308.07016v2
- Date: Wed, 30 Aug 2023 07:01:42 GMT
- Title: HHTrack: Hyperspectral Object Tracking Using Hybrid Attention
- Authors: Yuedong Tan
- Abstract summary: We propose a hyperspectral object tracker based on hybrid attention (HHTrack).
The core of HHTrack is a hyperspectral hybrid attention (HHA) module that unifies feature extraction and fusion within one component through token interactions.
A hyperspectral bands fusion (HBF) module is also introduced to selectively aggregate spatial and spectral signatures from the full hyperspectral input.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperspectral imagery provides abundant spectral information beyond the
visible RGB bands, offering rich discriminative details about objects in a
scene. Leveraging such data has the potential to enhance visual tracking
performance. In this paper, we propose a hyperspectral object tracker based on
hybrid attention (HHTrack). The core of HHTrack is a hyperspectral hybrid
attention (HHA) module that unifies feature extraction and fusion within one
component through token interactions. A hyperspectral bands fusion (HBF) module
is also introduced to selectively aggregate spatial and spectral signatures
from the full hyperspectral input. Extensive experiments demonstrate the
state-of-the-art performance of HHTrack on benchmark Near Infrared (NIR), Red
Near Infrared (Red-NIR), and Visible (VIS) hyperspectral tracking datasets. Our
work provides new insights into harnessing the strengths of transformers and
hyperspectral fusion to advance robust object tracking.
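The abstract describes two mechanisms: the HHA module, which fuses template and search-region features in a single attention component through token interactions, and the HBF module, which selectively aggregates spatial and spectral signatures across bands. The paper's actual architecture is not reproduced here; the NumPy sketch below only illustrates the general ideas (joint self-attention over concatenated token sequences, and softmax-weighted band aggregation) under assumed shapes. All function names, dimensions, and the random stand-ins for learned weights are illustrative, not taken from HHTrack.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def joint_attention(template_tokens, search_tokens, d_k=32, seed=0):
    """One-stream attention sketch: template and search tokens are
    concatenated and attended jointly, so feature extraction and
    template-search fusion happen in a single component."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([template_tokens, search_tokens], axis=0)  # (Nt+Ns, d)
    d = x.shape[1]
    # Random projections stand in for learned Q/K/V weights.
    Wq, Wk, Wv = (rng.standard_normal((d, d_k)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d_k))   # every token attends to every token
    return attn @ v                          # fused tokens, (Nt+Ns, d_k)

def band_fusion(band_features, seed=0):
    """Band-fusion sketch: softmax weights over bands give a selectively
    aggregated representation of the full hyperspectral input."""
    rng = np.random.default_rng(seed)
    logits = rng.standard_normal(band_features.shape[0])  # stand-in for learned band scores
    w = softmax(logits)
    return np.tensordot(w, band_features, axes=1)         # weighted sum over the band axis
```

Because template and search tokens sit in one sequence, the attention matrix covers within-region interactions (feature extraction) and cross-region interactions (fusion) at once, which is the property the abstract attributes to the HHA module.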
Related papers
- BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking [22.533682363532403]
We provide a new task called hyperspectral camouflaged object tracking (HCOT).
We meticulously construct a large-scale HCOT dataset, termed BihoT, which consists of 41,912 hyperspectral images covering 49 video sequences.
A simple but effective baseline model, named spectral prompt-based distractor-aware network (SPDAN), is proposed.
arXiv Detail & Related papers (2024-08-22T09:07:51Z)
- HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model [88.13261547704444]
HyperSIGMA is a vision transformer-based foundation model for HSI interpretation.
It integrates spatial and spectral features using a specially designed spectral enhancement module.
It shows significant advantages in scalability, robustness, cross-modal transferring capability, and real-world applicability.
arXiv Detail & Related papers (2024-06-17T13:22:58Z)
- SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking [21.664141982246598]
Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously.
Existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction.
In this paper, a spatial-spectral fusion network with spectral angle awareness (SSF-Net) is proposed for hyperspectral (HS) object tracking.
arXiv Detail & Related papers (2024-03-09T09:37:13Z)
- Spectrum-driven Mixed-frequency Network for Hyperspectral Salient Object Detection [14.621504062838731]
We propose a novel approach that fully leverages the spectral characteristics by extracting two distinct frequency components from the spectrum.
The Spectral Saliency approximates the region of salient objects, while the Spectral Edge captures edge information of salient objects.
To effectively utilize this dual-frequency information, we introduce a novel lightweight Spectrum-driven Mixed-frequency Network (SMN).
arXiv Detail & Related papers (2023-12-02T08:05:45Z)
- Hy-Tracker: A Novel Framework for Enhancing Efficiency and Accuracy of Object Tracking in Hyperspectral Videos [19.733925664613093]
We propose a novel framework called Hy-Tracker to bridge the gap between hyperspectral data and state-of-the-art object detection methods.
The framework incorporates a refined tracking module on top of YOLOv7.
The experimental results on hyperspectral benchmark datasets demonstrate the effectiveness of Hy-Tracker.
arXiv Detail & Related papers (2023-11-30T02:38:45Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images.
S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
arXiv Detail & Related papers (2023-06-14T09:01:50Z)
- Learning Dual-Fused Modality-Aware Representations for RGBD Tracking [67.14537242378988]
Compared with traditional RGB object tracking, adding the depth modality can effectively mitigate interference between the target and the background.
Some existing RGBD trackers process the two modalities separately, ignoring particularly useful information shared between them.
We propose a novel Dual-fused Modality-aware Tracker (termed DMTracker) which aims to learn informative and discriminative representations of the target objects for robust RGBD tracking.
arXiv Detail & Related papers (2022-11-06T07:59:07Z)
- Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV).
We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance compared with state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.