Transparent Object Tracking with Enhanced Fusion Module
- URL: http://arxiv.org/abs/2309.06701v1
- Date: Wed, 13 Sep 2023 03:52:09 GMT
- Title: Transparent Object Tracking with Enhanced Fusion Module
- Authors: Kalyan Garigapati, Erik Blasch, Jie Wei, Haibin Ling
- Abstract summary: We propose a new tracker architecture that uses our fusion techniques to achieve superior results for transparent object tracking.
Our results and code will be made publicly available at https://github.com/kalyan0510/TOTEM.
- Score: 56.403878717170784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate tracking of transparent objects, such as glasses, plays a critical
role in many robotic tasks such as robot-assisted living. Due to the adaptive
and often reflective texture of such objects, traditional tracking algorithms
that rely on general-purpose learned features suffer from reduced performance.
Recent research has proposed to instill transparency awareness into existing
general object trackers by fusing purpose-built features. However, with the
existing fusion techniques, the addition of new features causes a change in the
latent space, making it impossible to incorporate transparency awareness into
trackers with fixed latent spaces. For example, many of today's
transformer-based trackers are fully pre-trained and are sensitive to any
latent space perturbations. In this paper, we present a new feature fusion
technique that integrates transparency information into a fixed feature space,
enabling its use in a broader range of trackers. Our proposed fusion module,
composed of a transformer encoder and an MLP module, leverages key-query-based
transformations to embed the transparency information into the tracking
pipeline. We also present a new two-step training strategy for our fusion
module to effectively merge transparency features. We propose a new tracker
architecture that uses our fusion techniques to achieve superior results for
transparent object tracking. Our proposed method achieves results competitive
with state-of-the-art trackers on the recently released TOTB, the largest
transparent object tracking benchmark. Our results and code will be made
publicly available at https://github.com/kalyan0510/TOTEM.
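Below is a minimal sketch of the kind of fusion the abstract describes: tracker tokens act as queries and transparency features as keys/values, so the fused output remains in the tracker's fixed latent space, followed by an MLP. A single cross-attention block stands in here for the transformer encoder; the class name, dimensions, and pre-norm residual wiring are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# Hypothetical sketch of key-query-based transparency fusion (PyTorch).
# Assumptions: 256-dim tokens, pre-norm residual wiring, and a single
# cross-attention block standing in for the paper's transformer encoder.
import torch
import torch.nn as nn

class TransparencyFusion(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, tracker_tokens, transparency_tokens):
        # Queries come from the (frozen) tracker; keys/values come from the
        # transparency branch. Residual connections keep the output in the
        # tracker's original latent space (same shape, additive update).
        attended, _ = self.attn(self.norm1(tracker_tokens),
                                transparency_tokens, transparency_tokens)
        x = tracker_tokens + attended
        return x + self.mlp(self.norm2(x))

# Usage: batch of 2 sequences, 64 tokens each, in a 256-dim latent space.
fusion = TransparencyFusion()
fused = fusion(torch.randn(2, 64, 256), torch.randn(2, 64, 256))
print(fused.shape)  # torch.Size([2, 64, 256])
```

One plausible reading of the two-step training strategy is to first optimize only a module like this while the pre-trained tracker stays frozen, then fine-tune jointly; the actual schedule is defined in the paper and repository.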
Related papers
- A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking [34.08943612955157]
Performance of modern trackers degrades substantially on transparent objects compared to opaque objects.
We propose the first transparent object tracking training dataset, Trans2k, which consists of over 2k sequences with 104,343 images overall.
We also present a new distractor-aware transparent object tracker (DiTra) that treats localization accuracy and target identification as separate tasks.
arXiv Detail & Related papers (2024-01-08T13:04:28Z)
- TransTouch: Learning Transparent Objects Depth Sensing Through Sparse Touches [23.87056600709768]
We propose a method to finetune a stereo network with sparse depth labels automatically collected using a probing system with tactile feedback.
We show that our method can significantly improve real-world depth sensing accuracy, especially for transparent objects.
arXiv Detail & Related papers (2023-09-18T01:55:17Z)
- Trans2k: Unlocking the Power of Deep Models for Transparent Object Tracking [41.039837388154]
We propose the first transparent object tracking training dataset, Trans2k, which consists of over 2k sequences with 104,343 images overall.
We quantify domain-specific attributes and render the dataset to contain visual attributes and tracking situations not covered in existing object training datasets.
The dataset and the rendering engine will be publicly released to unlock the power of modern learning-based trackers and foster new designs in transparent object tracking.
arXiv Detail & Related papers (2022-10-07T10:08:13Z)
- TODE-Trans: Transparent Object Depth Estimation with Transformer [16.928131778902564]
We present a transformer-based transparent object depth estimation approach from a single RGB-D input.
To better enhance fine-grained features, a feature fusion module (FFM) is designed to assist coherent prediction.
arXiv Detail & Related papers (2022-09-18T03:04:01Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks, including MOT16, MOT17, and MOT20, achieving clear performance improvements in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking [69.08903927311283]
Existing tracking approaches rely on customized sub-modules and need prior knowledge for architecture selection.
This paper presents a simplified tracking architecture (SimTrack) that leverages a transformer backbone for joint feature extraction and interaction.
SimTrack improves the baseline by 2.5%/2.6% AUC on LaSOT/TNL2K and achieves results competitive with other specialized tracking algorithms without bells and whistles.
arXiv Detail & Related papers (2022-03-10T12:20:58Z)
- Transparent Object Tracking Benchmark [58.19532269423211]
The Transparent Object Tracking Benchmark (TOTB) consists of 225 videos (86K frames) from 15 diverse transparent object categories.
To the best of our knowledge, TOTB is the first benchmark dedicated to transparent object tracking.
To encourage future research, we introduce a novel tracker, named TransATOM, which leverages transparency features for tracking.
arXiv Detail & Related papers (2020-11-21T21:39:43Z)
- Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking [85.333260415532]
We develop a novel late fusion method to infer the fusion weight maps of both RGB and thermal (T) modalities (a minimal sketch of this idea appears after this list).
When the appearance cue is unreliable, we take motion cues into account to make the tracker robust.
Extensive results on three recent RGB-T tracking datasets show that the proposed tracker performs significantly better than other state-of-the-art algorithms.
arXiv Detail & Related papers (2020-07-04T08:11:33Z)
- Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation [87.53808756910452]
We propose a novel, flexible, and accurate refinement module called Alpha-Refine.
It exploits a precise pixel-wise correlation layer together with a spatial-aware non-local layer to fuse features, and can predict three complementary outputs: bounding box, corners, and mask.
We apply the proposed Alpha-Refine module to five well-known state-of-the-art base trackers: DiMP, ATOM, SiamRPN++, RTMDNet, and ECO.
arXiv Detail & Related papers (2020-07-04T07:02:25Z)
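For the RGB-T entry above, here is a minimal sketch of late fusion with inferred weight maps: per-modality response maps are blended by a small learned head. The LateFusion class, its convolutional weight head, and the map sizes are illustrative assumptions rather than that paper's actual design.

```python
# Hypothetical late-fusion sketch for RGB-T tracking (illustrative only).
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Blends RGB and thermal response maps with inferred pixel-wise weights."""
    def __init__(self):
        super().__init__()
        # Assumed weight head: infers a 2-channel weight map from the
        # stacked per-modality responses, normalized with a softmax.
        self.weight_head = nn.Conv2d(2, 2, kernel_size=3, padding=1)

    def forward(self, resp_rgb, resp_t):
        stacked = torch.stack([resp_rgb, resp_t], dim=1)   # (B, 2, H, W)
        weights = torch.softmax(self.weight_head(stacked), dim=1)
        return (weights * stacked).sum(dim=1)              # (B, H, W)

# Usage: fuse two 31x31 response maps from a single frame.
fusion = LateFusion()
fused = fusion(torch.rand(1, 31, 31), torch.rand(1, 31, 31))
print(fused.shape)  # torch.Size([1, 31, 31])
```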
This list is automatically generated from the titles and abstracts of the papers on this site.