Long-Lived Accurate Keypoints in Event Streams
- URL: http://arxiv.org/abs/2209.10385v1
- Date: Wed, 21 Sep 2022 14:25:31 GMT
- Title: Long-Lived Accurate Keypoints in Event Streams
- Authors: Philippe Chiberre, Etienne Perot, Amos Sironi and Vincent Lepetit
- Abstract summary: We present a novel end-to-end approach to keypoint detection and tracking in an event stream.
We show it results in keypoint tracks that are three times longer and nearly twice as accurate as the best previous state-of-the-art methods.
- Score: 28.892653505044425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel end-to-end approach to keypoint detection and tracking in
an event stream that provides better precision and much longer keypoint tracks
than previous methods. This is made possible by two contributions working
together.
First, we propose a simple procedure to generate stable keypoint labels,
which we use to train a recurrent architecture. This training data results in
detections that are very consistent over time.
Moreover, we observe that previous methods for keypoint detection work on a
representation (such as the time surface) that integrates events over a period
of time. Since this integration is required, we claim it is better to predict
the keypoints' trajectories for the time period rather than single locations,
as done in previous approaches. We predict these trajectories in the form of a
series of heatmaps for the integration time period. This improves the keypoint
localization.
Our architecture can also be kept very simple, which results in very fast
inference times. We demonstrate our approach on the HVGA ATIS Corner dataset as
well as "The Event-Camera Dataset and Simulator" dataset, and show it results
in keypoint tracks that are three times longer and nearly twice as accurate as
the best previous state-of-the-art methods. We believe our approach can be
generalized to other event-based camera problems, and we release our source
code to encourage other authors to explore it.
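The released source code is not reproduced here; the following is a minimal PyTorch sketch of the pipeline the abstract describes: integrate events into a time surface, run a small recurrent network over consecutive surfaces, and predict a sequence of heatmaps covering each integration window. The time-surface formula, the `ConvGRUCell` and `RecurrentHeatmapPredictor` classes, and every dimension and hyper-parameter below are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch (not the authors' code): events -> time surface -> recurrent
# network -> sequence of keypoint heatmaps covering the integration window.
import numpy as np
import torch
import torch.nn as nn


def time_surface(events, height, width, t_ref, tau=50e-3):
    """Exponential-decay time surface, one channel per polarity.

    `events` is an iterable of (x, y, t, polarity) with t in seconds and
    polarity in {0, 1}; each pixel keeps exp(-(t_ref - t_last) / tau).
    """
    surface = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, p in events:
        surface[int(p), int(y), int(x)] = np.exp(-(t_ref - t) / tau)
    return torch.from_numpy(surface)


class ConvGRUCell(nn.Module):
    """Convolutional GRU cell: carries a spatial hidden state across windows."""

    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv_zr = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=k // 2)
        self.conv_h = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=k // 2)

    def forward(self, x, h):
        z, r = torch.sigmoid(self.conv_zr(torch.cat([x, h], 1))).chunk(2, 1)
        h_new = torch.tanh(self.conv_h(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_new


class RecurrentHeatmapPredictor(nn.Module):
    """Predict K heatmaps per time surface, i.e. a short keypoint trajectory
    for the integration period rather than a single location."""

    def __init__(self, in_ch=2, hid_ch=32, steps_per_window=4):
        super().__init__()
        self.hid_ch = hid_ch
        self.encoder = nn.Sequential(nn.Conv2d(in_ch, hid_ch, 3, padding=1), nn.ReLU())
        self.gru = ConvGRUCell(hid_ch, hid_ch)
        self.head = nn.Conv2d(hid_ch, steps_per_window, 1)

    def forward(self, surfaces):  # surfaces: (T, B, C, H, W)
        _, B, _, H, W = surfaces.shape
        h = torch.zeros(B, self.hid_ch, H, W, device=surfaces.device)
        heatmaps = []
        for x in surfaces:                       # recurrent over windows
            h = self.gru(self.encoder(x), h)
            heatmaps.append(torch.sigmoid(self.head(h)))
        return torch.stack(heatmaps)             # (T, B, K, H, W)
```

Keypoint locations would then be read from each heatmap with a local-maximum or soft-argmax step, and the K heatmaps per window would be supervised with the stable keypoint labels the abstract mentions.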
Related papers
- Exploring Temporally-Aware Features for Point Tracking [58.63091479730935]
Chrono is a feature backbone specifically designed for point tracking with built-in temporal awareness.
Chrono achieves state-of-the-art performance in a refiner-free setting on the TAP-Vid-DAVIS and TAP-Vid-Kinetics datasets.
arXiv Detail & Related papers (2025-01-21T15:39:40Z)
- Post-Hoc MOTS: Exploring the Capabilities of Time-Symmetric Multi-Object Tracking [0.37240490024629924]
A time-symmetric tracking methodology has been introduced for the detection, segmentation, and tracking of budding yeast cells in pre-recorded samples.
We aim to reveal the broader capabilities, advantages, and potential challenges of this architecture across various specifically designed scenarios.
We present an attention analysis of the tracking architecture for both pretrained and non-pretrained models.
arXiv Detail & Related papers (2024-12-11T11:50:06Z)
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z)
- TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement [64.11385310305612]
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence.
Our approach employs two stages: (1) a matching stage, which independently locates a suitable candidate point match for the query point on every other frame, and (2) a refinement stage, which updates both the trajectory and query features based on local correlations.
The resulting model surpasses all baseline methods by a significant margin on the TAP-Vid benchmark, as demonstrated by an approximate 20% absolute average Jaccard (AJ) improvement on DAVIS.
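The two stages above can be illustrated with a short, self-contained sketch: cosine correlation of the query feature against every frame for the matching stage, then a soft-argmax over a local correlation window for the refinement stage. This is a simplification under assumed tensor shapes, not TAPIR's implementation; in particular the iterative update of the trajectory and query features is omitted.

```python
# Rough illustration of a matching + local-refinement point tracker
# (simplified; not the TAPIR implementation).
import torch
import torch.nn.functional as F


def match_and_refine(query_feat, frame_feats, win=5):
    """query_feat: (C,) feature of the query point.
    frame_feats: (T, C, H, W) per-frame feature maps.
    Returns (T, 2) sub-pixel (x, y) estimates, one per frame.
    """
    T, C, H, W = frame_feats.shape
    q = F.normalize(query_feat, dim=0)
    f = F.normalize(frame_feats, dim=1)
    corr = torch.einsum("c,tchw->thw", q, f)       # cosine-similarity maps

    # Stage 1 (matching): an independent global argmax per frame gives a
    # coarse candidate location.
    flat_idx = corr.view(T, -1).argmax(dim=1)
    ys, xs = flat_idx // W, flat_idx % W

    # Stage 2 (refinement): soft-argmax over the local correlation window
    # around each candidate for a sub-pixel estimate.
    r = win // 2
    coords = []
    for t in range(T):
        y0, x0 = int(ys[t]), int(xs[t])
        y_lo, y_hi = max(0, y0 - r), min(H, y0 + r + 1)
        x_lo, x_hi = max(0, x0 - r), min(W, x0 + r + 1)
        patch = corr[t, y_lo:y_hi, x_lo:x_hi]
        w = torch.softmax(patch.reshape(-1), dim=0).reshape(patch.shape)
        yy = torch.arange(y_lo, y_hi, dtype=torch.float32)
        xx = torch.arange(x_lo, x_hi, dtype=torch.float32)
        coords.append(torch.stack([(w.sum(0) * xx).sum(), (w.sum(1) * yy).sum()]))
    return torch.stack(coords)                      # (T, 2) as (x, y)
```

The global per-frame argmax is what makes the initialization independent for each frame, and the local soft-argmax is what provides the sub-pixel refinement described in the summary.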
arXiv Detail & Related papers (2023-06-14T17:07:51Z)
- KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration [28.96448680048584]
The KeyPoint Positioning System (KeyPosS) is the first framework to deduce exact landmark coordinates by triangulating distances between points of interest and anchor points predicted by a fully convolutional network.
Experiments on four datasets demonstrate state-of-the-art performance, with KeyPosS outperforming existing methods in low-resolution settings despite minimal computational overhead.
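The final localization step, recovering a landmark from predicted distances to known anchor points, is standard true-range multilateration and reduces to a small linear least-squares problem. The solver below is a generic 2D multilateration sketch, not KeyPosS's code; how the network predicts the anchors and distances is omitted.

```python
# Generic 2D true-range multilateration (illustrative; not the KeyPosS code).
import numpy as np


def multilaterate(anchors, distances):
    """anchors: (N, 2) known anchor coordinates, N >= 3.
    distances: (N,) predicted distances from the unknown landmark to each anchor.
    Returns the least-squares (x, y) estimate of the landmark.
    """
    anchors = np.asarray(anchors, dtype=np.float64)
    d = np.asarray(distances, dtype=np.float64)
    x0, y0, d0 = anchors[0, 0], anchors[0, 1], d[0]
    # Subtracting the first circle equation from the others linearizes the system:
    # 2(xi - x0) x + 2(yi - y0) y = d0^2 - di^2 + xi^2 - x0^2 + yi^2 - y0^2
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (d0**2 - d[1:] ** 2
         + anchors[1:, 0] ** 2 - x0**2
         + anchors[1:, 1] ** 2 - y0**2)
    xy, *_ = np.linalg.lstsq(A, b, rcond=None)
    return xy


# Example: exact distances to three anchors recover the point (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
target = np.array([3.0, 4.0])
dists = [np.linalg.norm(target - np.array(a)) for a in anchors]
print(multilaterate(anchors, dists))  # ~[3. 4.]
```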
arXiv Detail & Related papers (2023-05-25T19:30:21Z)
- Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
To combat the challenges posed by irregular point arrangements, we propose a first-layer point embedding to aggregate local information.
Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
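As a rough illustration of a first-layer point embedding that aggregates local information, the sketch below gathers each point's k nearest neighbours and pools an MLP over their coordinates. It is a generic local-aggregation layer with assumed sizes and names, not the Stratified Transformer's actual embedding layer.

```python
# Sketch of a first-layer point embedding that aggregates local neighbourhood
# information (illustrative; not the Stratified Transformer implementation).
import torch
import torch.nn as nn


class LocalPointEmbedding(nn.Module):
    def __init__(self, in_ch=3, out_ch=32, k=16):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2 * in_ch, out_ch), nn.ReLU(),
                                 nn.Linear(out_ch, out_ch))

    def forward(self, xyz):  # xyz: (N, 3) point coordinates, N >= k
        d = torch.cdist(xyz, xyz)                     # (N, N) pairwise distances
        knn = d.topk(self.k, largest=False).indices   # (N, k) nearest neighbours
        neigh = xyz[knn]                              # (N, k, 3)
        rel = neigh - xyz[:, None, :]                 # relative coordinates
        feats = torch.cat([rel, neigh], dim=-1)       # (N, k, 6)
        return self.mlp(feats).max(dim=1).values      # (N, out_ch) per-point embedding
```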
arXiv Detail & Related papers (2022-03-28T05:35:16Z)
- Accurate Grid Keypoint Learning for Efficient Video Prediction [87.71109421608232]
Keypoint-based video prediction methods can consume substantial computing resources in training and deployment.
In this paper, we design a new grid keypoint learning framework, aiming at a robust and explainable intermediate keypoint representation for long-term efficient video prediction.
Our method outperforms the state-of-the-art video prediction methods while saving more than 98% of computing resources.
arXiv Detail & Related papers (2021-07-28T05:04:30Z)
- OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association [90.39247595214998]
Image-based perception tasks can be formulated as detecting, associating and tracking semantic keypoints, e.g. human body pose estimation and tracking.
We present a general framework that jointly detects semantic keypoints and forms spatio-temporal associations in a single stage.
We also show that our method generalizes to any class of keypoints such as car and animal parts to provide a holistic perception framework.
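In a composite field, every output cell carries several components, for example a confidence score and a regressed offset to the exact keypoint position, so keypoints can be decoded directly from the field in one stage. The snippet below sketches only that per-keypoint decoding step, with an assumed channel layout, stride and threshold; the actual OpenPifPaf fields and their spatio-temporal association decoder are considerably richer.

```python
# Sketch of decoding a composite field (confidence + offset per cell) into
# keypoints; simplified, not the OpenPifPaf decoder.
import torch


def decode_composite_field(field, stride=8, threshold=0.3):
    """field: (3, H, W) with channels (confidence, dx, dy), offsets in pixels.
    Returns a list of (x, y, score) keypoint candidates in image coordinates.
    """
    conf, dx, dy = field
    ys, xs = torch.nonzero(conf > threshold, as_tuple=True)
    keypoints = []
    for y, x in zip(ys.tolist(), xs.tolist()):
        keypoints.append((
            x * stride + dx[y, x].item(),   # cell origin + regressed offset
            y * stride + dy[y, x].item(),
            conf[y, x].item(),
        ))
    return keypoints
```

Grouping keypoints into poses or tracks is then handled by a second, association field, which is the part the single-stage spatio-temporal formulation refers to.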
arXiv Detail & Related papers (2021-03-03T14:44:14Z)
- RetinaTrack: Online Single Stage Joint Detection and Tracking [22.351109024452462]
We focus on the tracking-by-detection paradigm for autonomous driving where both tasks are mission critical.
We propose a conceptually simple and efficient joint model of detection and tracking, called RetinaTrack, which modifies the popular single stage RetinaNet approach.
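In such a tracking-by-detection setting, a single-stage detector can additionally output a per-detection embedding, and tracks are formed by matching embeddings across frames. The greedy cosine-similarity matcher below only illustrates that association step under assumed shapes; it is not RetinaTrack's code, and the detector-side modifications to RetinaNet are not shown.

```python
# Greedy embedding association across two frames (illustrative only; not the
# RetinaTrack implementation).
import torch
import torch.nn.functional as F


def associate(track_embs, det_embs, min_sim=0.5):
    """track_embs: (M, D) embeddings of existing tracks.
    det_embs: (N, D) embeddings of current-frame detections.
    Returns a list of (track_idx, det_idx) matches, formed greedily by
    cosine similarity above `min_sim`.
    """
    sim = F.normalize(track_embs, dim=1) @ F.normalize(det_embs, dim=1).T  # (M, N)
    matches, used_t, used_d = [], set(), set()
    # Visit candidate pairs from most to least similar.
    for idx in torch.argsort(sim.flatten(), descending=True).tolist():
        t, d = divmod(idx, sim.shape[1])
        if sim[t, d] < min_sim:
            break
        if t in used_t or d in used_d:
            continue
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    return matches
```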
arXiv Detail & Related papers (2020-03-30T23:46:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.