Transparent Object Tracking with Enhanced Fusion Module
- URL: http://arxiv.org/abs/2309.06701v1
- Date: Wed, 13 Sep 2023 03:52:09 GMT
- Title: Transparent Object Tracking with Enhanced Fusion Module
- Authors: Kalyan Garigapati, Erik Blasch, Jie Wei, Haibin Ling
- Abstract summary: We propose a new tracker architecture that uses our fusion techniques to achieve superior results for transparent object tracking.
Our results and code will be made publicly available at https://github.com/kalyan0510/TOTEM.
- Score: 56.403878717170784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate tracking of transparent objects, such as glasses, plays a critical
role in many robotic tasks such as robot-assisted living. Due to the adaptive
and often reflective texture of such objects, traditional tracking algorithms
that rely on general-purpose learned features suffer from reduced performance.
Recent research has proposed to instill transparency awareness into existing
general object trackers by fusing purpose-built features. However, with the
existing fusion techniques, the addition of new features causes a change in the
latent space, making it impossible to incorporate transparency awareness into
trackers with fixed latent spaces. For example, many of today's
transformer-based trackers are fully pre-trained and are sensitive to any
latent space perturbations. In this paper, we present a new feature fusion
technique that integrates transparency information into a fixed feature space,
enabling its use in a broader range of trackers. Our proposed fusion module,
composed of a transformer encoder and an MLP module, leverages key-query-based
transformations to embed the transparency information into the tracking
pipeline. We also present a new two-step training strategy for our fusion
module to effectively merge transparency features. We propose a new tracker
architecture that uses our fusion techniques to achieve superior results for
transparent object tracking. Our proposed method achieves results competitive
with state-of-the-art trackers on the recently released TOTB, the largest
transparent object tracking benchmark. Our results and code will be made
publicly available at https://github.com/kalyan0510/TOTEM.
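Below is a minimal sketch of the kind of fusion the abstract describes: tracker tokens act as queries and transparency features as keys/values, so the fused output remains in the tracker's fixed latent space, followed by an MLP. A single cross-attention block stands in here for the transformer encoder; the class name, dimensions, and pre-norm residual wiring are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# Hypothetical sketch of key-query-based transparency fusion (PyTorch).
# Assumptions: 256-dim tokens, pre-norm residual wiring, and a single
# cross-attention block standing in for the paper's transformer encoder.
import torch
import torch.nn as nn

class TransparencyFusion(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, tracker_tokens, transparency_tokens):
        # Queries come from the (frozen) tracker; keys/values come from the
        # transparency branch. Residual connections keep the output in the
        # tracker's original latent space (same shape, additive update).
        attended, _ = self.attn(self.norm1(tracker_tokens),
                                transparency_tokens, transparency_tokens)
        x = tracker_tokens + attended
        return x + self.mlp(self.norm2(x))

# Usage: batch of 2 sequences, 64 tokens each, in a 256-dim latent space.
fusion = TransparencyFusion()
fused = fusion(torch.randn(2, 64, 256), torch.randn(2, 64, 256))
print(fused.shape)  # torch.Size([2, 64, 256])
```

One plausible reading of the two-step training strategy is to first optimize only a module like this while the pre-trained tracker stays frozen, then fine-tune jointly; the actual schedule is defined in the paper and repository.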
Related papers
- A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking [34.08943612955157]
Performance of modern trackers degrades substantially on transparent objects compared to opaque objects.
We propose the first transparent object tracking training dataset, Trans2k, which consists of over 2k sequences with 104,343 images overall.
We also present a new distractor-aware transparent object tracker (DiTra) that treats localization accuracy and target identification as separate tasks.
arXiv Detail & Related papers (2024-01-08T13:04:28Z)
- TransTouch: Learning Transparent Objects Depth Sensing Through Sparse Touches [23.87056600709768]
We propose a method to finetune a stereo network with sparse depth labels automatically collected using a probing system with tactile feedback.
We show that our method can significantly improve real-world depth sensing accuracy, especially for transparent objects.
arXiv Detail & Related papers (2023-09-18T01:55:17Z)
- Trans2k: Unlocking the Power of Deep Models for Transparent Object Tracking [41.039837388154]
We propose the first transparent object tracking training dataset, Trans2k, which consists of over 2k sequences with 104,343 images overall.
We quantify domain-specific attributes and render the dataset to contain visual attributes and tracking situations not covered in existing object training datasets.
The dataset and the rendering engine will be publicly released to unlock the power of modern learning-based trackers and foster new designs in transparent object tracking.
arXiv Detail & Related papers (2022-10-07T10:08:13Z)
- TODE-Trans: Transparent Object Depth Estimation with Transformer [16.928131778902564]
We present a transformer-based transparent object depth estimation approach from a single RGB-D input.
To better enhance fine-grained features, a feature fusion module (FFM) is designed to assist coherent prediction.
arXiv Detail & Related papers (2022-09-18T03:04:01Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks, including MOT16, MOT17, and MOT20, achieving clear performance improvements in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking [69.08903927311283]
Existing tracking approaches rely on customized sub-modules and need prior knowledge for architecture selection.
This paper presents a simplified tracking architecture (SimTrack) that leverages a transformer backbone for joint feature extraction and interaction.
SimTrack improves the baseline by 2.5%/2.6% AUC on LaSOT/TNL2K and achieves results competitive with other specialized tracking algorithms without bells and whistles.
arXiv Detail & Related papers (2022-03-10T12:20:58Z)
- Transparent Object Tracking Benchmark [58.19532269423211]
The Transparent Object Tracking Benchmark (TOTB) consists of 225 videos (86K frames) from 15 diverse transparent object categories.
To the best of our knowledge, TOTB is the first benchmark dedicated to transparent object tracking.
To encourage future research, we introduce a novel tracker, named TransATOM, which leverages transparency features for tracking.
arXiv Detail & Related papers (2020-11-21T21:39:43Z)
- Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking [85.333260415532]
We develop a novel late fusion method to infer the fusion weight maps of both RGB and thermal (T) modalities (a minimal sketch of this idea appears after this list).
When the appearance cue is unreliable, we take motion cues into account to make the tracker robust.
Extensive results on three recent RGB-T tracking datasets show that the proposed tracker performs significantly better than other state-of-the-art algorithms.
arXiv Detail & Related papers (2020-07-04T08:11:33Z)
- Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation [87.53808756910452]
We propose a novel, flexible, and accurate refinement module called Alpha-Refine.
It exploits a precise pixel-wise correlation layer together with a spatial-aware non-local layer to fuse features, and can predict three complementary outputs: bounding box, corners, and mask.
We apply the proposed Alpha-Refine module to five well-known state-of-the-art base trackers: DiMP, ATOM, SiamRPN++, RTMDNet, and ECO.
arXiv Detail & Related papers (2020-07-04T07:02:25Z)
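For the RGB-T entry above, here is a minimal sketch of late fusion with inferred weight maps: per-modality response maps are blended by a small learned head. The LateFusion class, its convolutional weight head, and the map sizes are illustrative assumptions rather than that paper's actual design.

```python
# Hypothetical late-fusion sketch for RGB-T tracking (illustrative only).
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Blends RGB and thermal response maps with inferred pixel-wise weights."""
    def __init__(self):
        super().__init__()
        # Assumed weight head: infers a 2-channel weight map from the
        # stacked per-modality responses, normalized with a softmax.
        self.weight_head = nn.Conv2d(2, 2, kernel_size=3, padding=1)

    def forward(self, resp_rgb, resp_t):
        stacked = torch.stack([resp_rgb, resp_t], dim=1)   # (B, 2, H, W)
        weights = torch.softmax(self.weight_head(stacked), dim=1)
        return (weights * stacked).sum(dim=1)              # (B, H, W)

# Usage: fuse two 31x31 response maps from a single frame.
fusion = LateFusion()
fused = fusion(torch.rand(1, 31, 31), torch.rand(1, 31, 31))
print(fused.shape)  # torch.Size([1, 31, 31])
```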
This list is automatically generated from the titles and abstracts of the papers on this site.