Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023
- URL: http://arxiv.org/abs/2403.17994v1
- Date: Tue, 26 Mar 2024 13:50:39 GMT
- Title: Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023
- Authors: Hongpeng Pan, Yang Yang, Zhongtian Fu, Yuxuan Zhang, Shian Du, Yi Xu, Xiangyang Ji,
- Abstract summary: The Tracking Any Point (TAP) task tracks any physical surface through a video.
Several existing approaches have explored the TAP by considering the temporal relationships to obtain smooth point motion trajectories.
We propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of the static point in the videos shot by a static camera.
- Score: 50.910598799408326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This report proposes an improved method for the Tracking Any Point (TAP) task, which tracks any physical surface through a video. Several existing approaches have explored the TAP by considering the temporal relationships to obtain smooth point motion trajectories, however, they still suffer from the cumulative error caused by temporal prediction. To address this issue, we propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of the static point in the videos shot by a static camera. To clarify, our approach contains two key components: (1) Multi-granularity Camera Motion Detection, which could identify the video sequence by the static camera shot. (2) CMR-based point trajectory prediction with one moving object segmentation approach to isolate the static point from the moving object. Our approach ranked first in the final test with a score of 0.46.
Related papers
- DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild [85.03973683867797]
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild.
We show that the proposed method achieves state-of-the-art performance in terms of camera pose estimation even in complex dynamic challenge scenes.
arXiv Detail & Related papers (2024-11-20T13:01:16Z) - Solution for Point Tracking Task of ECCV 2nd Perception Test Challenge 2024 [13.14886222358538]
This report introduces an improved method for the Tracking Any Point(TAP), focusing on monitoring physical surfaces in video footage.
We propose a simple yet effective approach called Fine-grained Point Discrimination(textbfFPD), which focuses on perceiving and rectifying point tracking at multiple granularities in zero-shot manner.
arXiv Detail & Related papers (2024-10-05T15:09:40Z) - TAP-Vid: A Benchmark for Tracking Any Point in a Video [84.94877216665793]
We formalize the problem of tracking arbitrary physical points on surfaces over longer video clips, naming it tracking any point (TAP)
We introduce a companion benchmark, TAP-Vid, which is composed of both real-world videos with accurate human annotations of point tracks, and synthetic videos with perfect ground-truth point tracks.
We propose a simple end-to-end point tracking model TAP-Net, showing that it outperforms all prior methods on our benchmark when trained on synthetic data.
arXiv Detail & Related papers (2022-11-07T17:57:02Z) - ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving
Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z) - Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z) - End-to-end Learning of Object Motion Estimation from Retinal Events for
Event-based Object Tracking [35.95703377642108]
We propose a novel deep neural network to learn and regress a parametric object-level motion/transform model for event-based object tracking.
To achieve this goal, we propose a synchronous Time-Surface with Linear Time Decay representation.
We feed the sequence of TSLTD frames to a novel Retinal Motion Regression Network (RMRNet) perform to an end-to-end 5-DoF object motion regression.
arXiv Detail & Related papers (2020-02-14T08:19:50Z) - Asynchronous Tracking-by-Detection on Adaptive Time Surfaces for
Event-based Object Tracking [87.0297771292994]
We propose an Event-based Tracking-by-Detection (ETD) method for generic bounding box-based object tracking.
To achieve this goal, we present an Adaptive Time-Surface with Linear Time Decay (ATSLTD) event-to-frame conversion algorithm.
We compare the proposed ETD method with seven popular object tracking methods, that are based on conventional cameras or event cameras, and two variants of ETD.
arXiv Detail & Related papers (2020-02-13T15:58:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.