Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025
- URL: http://arxiv.org/abs/2502.18867v1
- Date: Wed, 26 Feb 2025 06:26:33 GMT
- Title: Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025
- Authors: Akhil Penta, Vaibhav Adwani, Ankush Chopra
- Abstract summary: We use STARK (Spatio-Temporal Transformer Network for Visual Tracking), a transformer-based model, to track skiers. We adapt STARK to address domain-specific challenges such as camera movements, camera changes, and occlusions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate skier tracking is essential for performance analysis, injury prevention, and optimizing training strategies in alpine sports. Traditional tracking methods often struggle with occlusions, dynamic movements, and varying environmental conditions, limiting their effectiveness. In this work, we used STARK (Spatio-Temporal Transformer Network for Visual Tracking), a transformer-based model, to track skiers. We adapted STARK to address domain-specific challenges such as camera movements, camera changes, and occlusions by optimizing the model's architecture and hyperparameters to better suit the dataset.
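To make the adaptation concrete, here is a minimal, hedged sketch of a STARK-style tracking loop that re-initializes the template when confidence collapses (e.g., at an occlusion or camera change); the tracker/detector interfaces and the threshold value are hypothetical stand-ins, not the authors' code.

```python
# Hedged sketch of a STARK-style tracking loop with template re-initialization
# on confidence collapse (e.g., occlusions or camera changes). The tracker and
# detector interfaces and the threshold are assumptions, not the authors' code.
CONF_THRESHOLD = 0.5  # assumed value; would be tuned on the SkiTB data

def track_sequence(tracker, frames, init_box, detector):
    """tracker: exposes init(frame, box) and update(frame) -> (box, conf);
    detector: exposes detect_globally(frame) -> box or None."""
    tracker.init(frames[0], init_box)
    boxes = [init_box]
    for frame in frames[1:]:
        box, conf = tracker.update(frame)
        if conf < CONF_THRESHOLD:
            # Likely occlusion or camera cut: re-detect the skier over the
            # whole frame and re-initialize the template instead of drifting.
            new_box = detector.detect_globally(frame)
            if new_box is not None:
                tracker.init(frame, new_box)
                box = new_box
        boxes.append(box)
    return boxes
```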
Related papers
- Monte Carlo Tree Search with Velocity Obstacles for safe and efficient motion planning in dynamic environments [49.30744329170107]
We propose a novel approach for optimal online motion planning with minimal information about dynamic obstacles.
The proposed methodology combines Monte Carlo Tree Search (MCTS), for online optimal planning via model simulations, with Velocity Obstacles (VO), for obstacle avoidance.
We show the superiority of our methodology with respect to state-of-the-art planners, including Non-linear Model Predictive Control (NMPC), in terms of improved collision rate, computational and task performance.
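As a hedged illustration of the VO side, a minimal admissibility check for disc-shaped agents follows; this is the generic velocity-obstacle formulation, not the paper's implementation.

```python
# Minimal velocity-obstacle (VO) admissibility check for disc-shaped agents.
import numpy as np

def in_velocity_obstacle(p_rel, v_rel, combined_radius):
    """True if the relative velocity drives the agent into the obstacle.

    p_rel: obstacle position minus agent position, shape (2,)
    v_rel: agent velocity minus obstacle velocity, shape (2,)
    combined_radius: sum of the two agents' radii
    """
    dist = np.linalg.norm(p_rel)
    if dist <= combined_radius:               # already overlapping
        return True
    speed = np.linalg.norm(v_rel)
    if speed == 0 or np.dot(v_rel, p_rel) <= 0:
        return False                          # stationary or moving away
    # The perpendicular distance from the obstacle center to the ray along
    # v_rel must exceed the combined radius for the velocity to be safe.
    u = v_rel / speed
    closest = p_rel - np.dot(p_rel, u) * u
    return np.linalg.norm(closest) < combined_radius
```

Candidate velocities inside the VO would be pruned before expanding nodes in the MCTS search.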
arXiv Detail & Related papers (2025-01-16T16:45:08Z) - A Cross-Scene Benchmark for Open-World Drone Active Tracking [54.235808061746525]
Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations.
We propose a unified cross-scene cross-domain benchmark for open-world drone active tracking called DAT.
We also propose a reinforcement learning-based drone tracking method called R-VAT.
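A hedged sketch of a typical active-tracking reward (keep the target centered at a desired apparent size); the shaping terms below are illustrative assumptions, not DAT/R-VAT's actual reward.

```python
# Illustrative active-tracking reward: maximal when the target box is centered
# in the drone's view at a desired apparent size. All terms are assumptions.
def tracking_reward(bbox, img_w, img_h, target_area_frac=0.05):
    x, y, w, h = bbox                          # target box in the current view
    cx_err = abs((x + w / 2) / img_w - 0.5)    # horizontal centering error
    cy_err = abs((y + h / 2) / img_h - 0.5)    # vertical centering error
    area_err = abs(w * h / (img_w * img_h) - target_area_frac)
    return 1.0 - cx_err - cy_err - min(area_err * 10, 1.0)
```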
arXiv Detail & Related papers (2024-12-01T09:37:46Z) - Motion Segmentation for Neuromorphic Aerial Surveillance [42.04157319642197]
Event cameras offer superior temporal resolution, high dynamic range, and minimal power requirements.
Unlike traditional frame-based sensors that capture redundant information at fixed intervals, event cameras asynchronously record pixel-level brightness changes.
We introduce a novel motion segmentation method that leverages self-supervised vision transformers on both event data and optical flow information.
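A minimal sketch of a common preprocessing step implied here: accumulating the asynchronous event stream into a fixed-size count image before feeding a frame-based model such as a vision transformer; the (t, x, y, polarity) event layout is an assumption.

```python
# Accumulate an asynchronous event stream into a signed count image.
import numpy as np

def events_to_count_image(events, height, width):
    """events: (N, 4) array of (t, x, y, polarity in {-1, +1})."""
    img = np.zeros((height, width), dtype=np.float32)
    xs = events[:, 1].astype(int)
    ys = events[:, 2].astype(int)
    np.add.at(img, (ys, xs), events[:, 3])  # signed event count per pixel
    return img
```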
arXiv Detail & Related papers (2024-05-24T04:36:13Z) - Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.
DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.
Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
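One generic realization of input-dependent compute is an early-exit encoder that stops running blocks once a halting head is confident; the sketch below illustrates that idea only, not DyTrack's actual routing mechanism.

```python
# Early-exit transformer encoder: easy inputs use fewer blocks.
import torch
import torch.nn as nn

class EarlyExitEncoder(nn.Module):
    def __init__(self, dim=256, depth=8, heads=8, threshold=0.9):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True)
            for _ in range(depth)
        )
        self.halt_heads = nn.ModuleList(nn.Linear(dim, 1) for _ in range(depth))
        self.threshold = threshold

    def forward(self, x):                      # x: (B, N, dim) tokens
        for block, halt in zip(self.blocks, self.halt_heads):
            x = block(x)
            # Mean-pooled halting score; exit once every sample is confident.
            if torch.sigmoid(halt(x.mean(dim=1))).min() > self.threshold:
                break
        return x
```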
arXiv Detail & Related papers (2024-03-26T12:31:58Z) - Distractor-aware Event-based Tracking [45.07711356111249]
We propose a distractor-aware event-based tracker that introduces transformer modules into a Siamese network architecture (DANet).
Our model is mainly composed of a motion-aware network and a target-aware network, which simultaneously exploits both motion cues and object contours from event data.
Our DANet can be trained in an end-to-end manner without any post-processing and can run at over 80 FPS on a single V100.
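A hedged sketch of fusing a motion-cue branch with a target branch by cross-attention, in the spirit of such dual-network designs; the layer choices are illustrative, not DANet's published architecture.

```python
# Cross-attention fusion of motion-cue tokens into target-appearance tokens.
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, motion_feat, target_feat):
        # motion_feat, target_feat: (B, N, dim) token sequences
        fused, _ = self.cross_attn(query=target_feat,
                                   key=motion_feat, value=motion_feat)
        return self.norm(target_feat + fused)  # residual fusion
```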
arXiv Detail & Related papers (2023-10-22T05:50:20Z) - Leveraging the Power of Data Augmentation for Transformer-based Tracking [64.46371987827312]
We propose two data augmentation methods customized for tracking.
First, we optimize existing random cropping via a dynamic search radius mechanism and simulation for boundary samples.
Second, we propose a token-level feature mixing augmentation strategy, which strengthens the model against challenges such as background interference.
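A minimal sketch of what token-level feature mixing could look like (replace a random fraction of tokens with distractor tokens); the mixing ratio and sampling scheme are assumptions, not the paper's exact recipe.

```python
# Token-level feature mixing: corrupt a fraction of tokens with distractors.
import torch

def token_mix(feat, distractor_feat, ratio=0.1):
    """feat, distractor_feat: (B, N, C) token features from two clips."""
    B, N, _ = feat.shape
    n_mix = max(1, int(N * ratio))
    mixed = feat.clone()
    for b in range(B):
        idx = torch.randperm(N)[:n_mix]      # positions to corrupt
        src = torch.randint(0, N, (n_mix,))  # distractor tokens to paste in
        mixed[b, idx] = distractor_feat[b, src]
    return mixed
```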
arXiv Detail & Related papers (2023-09-15T09:18:54Z) - MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as Dancetrack and SportsMOT.
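A learnable motion predictor can be as simple as a recurrent network that regresses the next box delta from a short box history; the sketch below is one plausible realization, not necessarily MotionTrack's.

```python
# GRU-based motion predictor: next-box estimate from a history of past boxes.
import torch
import torch.nn as nn

class MotionPredictor(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.gru = nn.GRU(input_size=4, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 4)    # predicted (dx, dy, dw, dh)

    def forward(self, box_history):         # (B, T, 4) normalized boxes
        _, h = self.gru(box_history)
        delta = self.head(h[-1])
        return box_history[:, -1] + delta   # next-frame box estimate
```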
arXiv Detail & Related papers (2023-06-05T04:24:11Z) - Alignment-free HDR Deghosting with Semantics Consistent Transformer [76.91669741684173]
High dynamic range imaging aims to retrieve information from multiple low-dynamic-range inputs to generate a realistic output.
Existing methods often focus on the spatial misalignment across input frames caused by the foreground and/or camera motion.
We propose a novel alignment-free network with a Semantics Consistent Transformer (SCTNet) with both spatial and channel attention modules.
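As a hedged instance of the channel attention mentioned above, here is a squeeze-and-excitation-style block; SCTNet's actual modules may be structured differently.

```python
# Squeeze-and-excitation-style channel attention: reweight feature channels.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                   # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))     # squeeze: global average pool
        return x * w[:, :, None, None]      # excite: per-channel reweighting
```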
arXiv Detail & Related papers (2023-05-29T15:03:23Z) - Propagate And Calibrate: Real-time Passive Non-line-of-sight Tracking [84.38335117043907]
We propose a purely passive method to track a person walking in an invisible room by only observing a relay wall.
To extract imperceptible changes in videos of the relay wall, we introduce difference frames as an essential carrier of temporal-local motion cues (sketched below).
To evaluate the proposed method, we build and publish the first dynamic passive NLOS tracking dataset, NLOS-Track.
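Difference frames themselves are just consecutive-frame subtractions; this minimal sketch shows the operation on a (T, H, W, C) clip.

```python
# Compute difference frames to amplify subtle temporal-local motion.
import numpy as np

def difference_frames(frames):
    """frames: (T, H, W, C) array; returns (T-1, H, W, C) frame differences."""
    frames = frames.astype(np.float32)
    return frames[1:] - frames[:-1]
```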
arXiv Detail & Related papers (2023-03-21T12:18:57Z) - SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking [12.447854608181833]
This work presents a novel saliency-guided dynamic vision Transformer (SGDViT) for UAV tracking.
The proposed method designs a new task-specific object saliency mining network to refine the cross-correlation operation.
A lightweight saliency filtering Transformer further refines saliency information and increases the focus on appearance information.
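A hedged reading of saliency-guided cross-correlation: correlate the template against the search features, then reweight the response with a saliency map; this is illustrative only, not SGDViT's published design.

```python
# Saliency-weighted cross-correlation: template acts as a convolution kernel.
import torch
import torch.nn.functional as F

def saliency_xcorr(search_feat, template_feat, saliency):
    """search_feat: (1, C, Hs, Ws); template_feat: (1, C, Ht, Wt);
    saliency: (1, 1, Hs-Ht+1, Ws-Wt+1) map with values in [0, 1]."""
    response = F.conv2d(search_feat, template_feat)  # (1, 1, Hr, Wr)
    return response * saliency                       # boost salient regions
```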
arXiv Detail & Related papers (2023-03-08T05:01:00Z) - Learning to Jump from Pixels [23.17535989519855]
We present Depth-based Impulse Control (DIC), a method for synthesizing highly agile visually-guided behaviors.
DIC affords the flexibility of model-free learning but regularizes behavior through explicit model-based optimization of ground reaction forces.
We evaluate the proposed method both in simulation and in the real world.
arXiv Detail & Related papers (2021-10-28T17:53:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.