Long-Lived Accurate Keypoints in Event Streams
- URL: http://arxiv.org/abs/2209.10385v1
- Date: Wed, 21 Sep 2022 14:25:31 GMT
- Title: Long-Lived Accurate Keypoints in Event Streams
- Authors: Philippe Chiberre, Etienne Perot, Amos Sironi and Vincent Lepetit
- Abstract summary: We present a novel end-to-end approach to keypoint detection and tracking in an event stream.
We show it results in keypoint tracks that are three times longer and nearly twice as accurate as the best previous state-of-the-art methods.
- Score: 28.892653505044425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel end-to-end approach to keypoint detection and tracking in
an event stream that provides better precision and much longer keypoint tracks
than previous methods. This is made possible by two contributions working
together.
First, we propose a simple procedure to generate stable keypoint labels,
which we use to train a recurrent architecture. This training data results in
detections that are very consistent over time.
Moreover, we observe that previous methods for keypoint detection work on a
representation (such as the time surface) that integrates events over a period
of time. Since this integration is required, we claim it is better to predict
the keypoints' trajectories for the time period rather than single locations,
as done in previous approaches. We predict these trajectories in the form of a
series of heatmaps for the integration time period. This improves the keypoint
localization.
Our architecture can also be kept very simple, which results in very fast
inference times. We demonstrate our approach on the HVGA ATIS Corner dataset as
well as "The Event-Camera Dataset and Simulator" dataset, and show it results
in keypoint tracks that are three times longer and nearly twice as accurate as
the best previous state-of-the-art methods. We believe our approach can be
generalized to other event-based camera problems, and we release our source
code to encourage other authors to explore it.
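The released source code is not reproduced here; the following is a minimal PyTorch sketch of the pipeline the abstract describes: integrate events into a time surface, run a small recurrent network over consecutive surfaces, and predict a sequence of heatmaps covering each integration window. The time-surface formula, the `ConvGRUCell` and `RecurrentHeatmapPredictor` classes, and every dimension and hyper-parameter below are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch (not the authors' code): events -> time surface -> recurrent
# network -> sequence of keypoint heatmaps covering the integration window.
import numpy as np
import torch
import torch.nn as nn


def time_surface(events, height, width, t_ref, tau=50e-3):
    """Exponential-decay time surface, one channel per polarity.

    `events` is an iterable of (x, y, t, polarity) with t in seconds and
    polarity in {0, 1}; each pixel keeps exp(-(t_ref - t_last) / tau).
    """
    surface = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, p in events:
        surface[int(p), int(y), int(x)] = np.exp(-(t_ref - t) / tau)
    return torch.from_numpy(surface)


class ConvGRUCell(nn.Module):
    """Convolutional GRU cell: carries a spatial hidden state across windows."""

    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv_zr = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=k // 2)
        self.conv_h = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=k // 2)

    def forward(self, x, h):
        z, r = torch.sigmoid(self.conv_zr(torch.cat([x, h], 1))).chunk(2, 1)
        h_new = torch.tanh(self.conv_h(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_new


class RecurrentHeatmapPredictor(nn.Module):
    """Predict K heatmaps per time surface, i.e. a short keypoint trajectory
    for the integration period rather than a single location."""

    def __init__(self, in_ch=2, hid_ch=32, steps_per_window=4):
        super().__init__()
        self.hid_ch = hid_ch
        self.encoder = nn.Sequential(nn.Conv2d(in_ch, hid_ch, 3, padding=1), nn.ReLU())
        self.gru = ConvGRUCell(hid_ch, hid_ch)
        self.head = nn.Conv2d(hid_ch, steps_per_window, 1)

    def forward(self, surfaces):  # surfaces: (T, B, C, H, W)
        _, B, _, H, W = surfaces.shape
        h = torch.zeros(B, self.hid_ch, H, W, device=surfaces.device)
        heatmaps = []
        for x in surfaces:                       # recurrent over windows
            h = self.gru(self.encoder(x), h)
            heatmaps.append(torch.sigmoid(self.head(h)))
        return torch.stack(heatmaps)             # (T, B, K, H, W)
```

Keypoint locations would then be read from each heatmap with a local-maximum or soft-argmax step, and the K heatmaps per window would be supervised with the stable keypoint labels the abstract mentions.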
Related papers
- Exploring Temporally-Aware Features for Point Tracking [58.63091479730935]
Chrono is a feature backbone specifically designed for point tracking with built-in temporal awareness.
Chrono achieves state-of-the-art performance in a refiner-free setting on the TAP-Vid-DAVIS and TAP-Vid-Kinetics datasets.
arXiv Detail & Related papers (2025-01-21T15:39:40Z)
- Post-Hoc MOTS: Exploring the Capabilities of Time-Symmetric Multi-Object Tracking [0.37240490024629924]
A time-symmetric tracking methodology has been introduced for the detection, segmentation, and tracking of budding yeast cells in pre-recorded samples.
We aim to reveal the broader capabilities, advantages, and potential challenges of this architecture across various specifically designed scenarios.
We present an attention analysis of the tracking architecture for both pretrained and non-pretrained models.
arXiv Detail & Related papers (2024-12-11T11:50:06Z)
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z)
- TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement [64.11385310305612]
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence.
Our approach employs two stages: (1) a matching stage, which independently locates a suitable candidate point match for the query point on every other frame, and (2) a refinement stage, which updates both the trajectory and query features based on local correlations.
The resulting model surpasses all baseline methods by a significant margin on the TAP-Vid benchmark, as demonstrated by an approximate 20% absolute average Jaccard (AJ) improvement on DAVIS.
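The two stages above can be illustrated with a short, self-contained sketch: cosine correlation of the query feature against every frame for the matching stage, then a soft-argmax over a local correlation window for the refinement stage. This is a simplification under assumed tensor shapes, not TAPIR's implementation; in particular the iterative update of the trajectory and query features is omitted.

```python
# Rough illustration of a matching + local-refinement point tracker
# (simplified; not the TAPIR implementation).
import torch
import torch.nn.functional as F


def match_and_refine(query_feat, frame_feats, win=5):
    """query_feat: (C,) feature of the query point.
    frame_feats: (T, C, H, W) per-frame feature maps.
    Returns (T, 2) sub-pixel (x, y) estimates, one per frame.
    """
    T, C, H, W = frame_feats.shape
    q = F.normalize(query_feat, dim=0)
    f = F.normalize(frame_feats, dim=1)
    corr = torch.einsum("c,tchw->thw", q, f)       # cosine-similarity maps

    # Stage 1 (matching): an independent global argmax per frame gives a
    # coarse candidate location.
    flat_idx = corr.view(T, -1).argmax(dim=1)
    ys, xs = flat_idx // W, flat_idx % W

    # Stage 2 (refinement): soft-argmax over the local correlation window
    # around each candidate for a sub-pixel estimate.
    r = win // 2
    coords = []
    for t in range(T):
        y0, x0 = int(ys[t]), int(xs[t])
        y_lo, y_hi = max(0, y0 - r), min(H, y0 + r + 1)
        x_lo, x_hi = max(0, x0 - r), min(W, x0 + r + 1)
        patch = corr[t, y_lo:y_hi, x_lo:x_hi]
        w = torch.softmax(patch.reshape(-1), dim=0).reshape(patch.shape)
        yy = torch.arange(y_lo, y_hi, dtype=torch.float32)
        xx = torch.arange(x_lo, x_hi, dtype=torch.float32)
        coords.append(torch.stack([(w.sum(0) * xx).sum(), (w.sum(1) * yy).sum()]))
    return torch.stack(coords)                      # (T, 2) as (x, y)
```

The global per-frame argmax is what makes the initialization independent for each frame, and the local soft-argmax is what provides the sub-pixel refinement described in the summary.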
arXiv Detail & Related papers (2023-06-14T17:07:51Z)
- KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration [28.96448680048584]
The KeyPoint Positioning System (KeyPosS) is the first framework to deduce exact landmark coordinates by triangulating distances between points of interest and anchor points predicted by a fully convolutional network.
Experiments on four datasets demonstrate state-of-the-art performance, with KeyPosS outperforming existing methods in low-resolution settings despite minimal computational overhead.
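The final localization step, recovering a landmark from predicted distances to known anchor points, is standard true-range multilateration and reduces to a small linear least-squares problem. The solver below is a generic 2D multilateration sketch, not KeyPosS's code; how the network predicts the anchors and distances is omitted.

```python
# Generic 2D true-range multilateration (illustrative; not the KeyPosS code).
import numpy as np


def multilaterate(anchors, distances):
    """anchors: (N, 2) known anchor coordinates, N >= 3.
    distances: (N,) predicted distances from the unknown landmark to each anchor.
    Returns the least-squares (x, y) estimate of the landmark.
    """
    anchors = np.asarray(anchors, dtype=np.float64)
    d = np.asarray(distances, dtype=np.float64)
    x0, y0, d0 = anchors[0, 0], anchors[0, 1], d[0]
    # Subtracting the first circle equation from the others linearizes the system:
    # 2(xi - x0) x + 2(yi - y0) y = d0^2 - di^2 + xi^2 - x0^2 + yi^2 - y0^2
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (d0**2 - d[1:] ** 2
         + anchors[1:, 0] ** 2 - x0**2
         + anchors[1:, 1] ** 2 - y0**2)
    xy, *_ = np.linalg.lstsq(A, b, rcond=None)
    return xy


# Example: exact distances to three anchors recover the point (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
target = np.array([3.0, 4.0])
dists = [np.linalg.norm(target - np.array(a)) for a in anchors]
print(multilaterate(anchors, dists))  # ~[3. 4.]
```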
arXiv Detail & Related papers (2023-05-25T19:30:21Z)
- Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
To combat the challenges posed by irregular point arrangements, we propose a first-layer point embedding to aggregate local information.
Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
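As a rough illustration of a first-layer point embedding that aggregates local information, the sketch below gathers each point's k nearest neighbours and pools an MLP over their coordinates. It is a generic local-aggregation layer with assumed sizes and names, not the Stratified Transformer's actual embedding layer.

```python
# Sketch of a first-layer point embedding that aggregates local neighbourhood
# information (illustrative; not the Stratified Transformer implementation).
import torch
import torch.nn as nn


class LocalPointEmbedding(nn.Module):
    def __init__(self, in_ch=3, out_ch=32, k=16):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2 * in_ch, out_ch), nn.ReLU(),
                                 nn.Linear(out_ch, out_ch))

    def forward(self, xyz):  # xyz: (N, 3) point coordinates, N >= k
        d = torch.cdist(xyz, xyz)                     # (N, N) pairwise distances
        knn = d.topk(self.k, largest=False).indices   # (N, k) nearest neighbours
        neigh = xyz[knn]                              # (N, k, 3)
        rel = neigh - xyz[:, None, :]                 # relative coordinates
        feats = torch.cat([rel, neigh], dim=-1)       # (N, k, 6)
        return self.mlp(feats).max(dim=1).values      # (N, out_ch) per-point embedding
```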
arXiv Detail & Related papers (2022-03-28T05:35:16Z)
- Accurate Grid Keypoint Learning for Efficient Video Prediction [87.71109421608232]
Keypoint-based video prediction methods can consume substantial computing resources in training and deployment.
In this paper, we design a new grid keypoint learning framework, aiming at a robust and explainable intermediate keypoint representation for long-term efficient video prediction.
Our method outperforms the state-of-the-art video prediction methods while saving more than 98% of computing resources.
arXiv Detail & Related papers (2021-07-28T05:04:30Z)
- OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association [90.39247595214998]
Image-based perception tasks can be formulated as detecting, associating and tracking semantic keypoints, e.g. human body pose estimation and tracking.
We present a general framework that jointly detects semantic keypoints and forms spatio-temporal associations in a single stage.
We also show that our method generalizes to any class of keypoints such as car and animal parts to provide a holistic perception framework.
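In a composite field, every output cell carries several components, for example a confidence score and a regressed offset to the exact keypoint position, so keypoints can be decoded directly from the field in one stage. The snippet below sketches only that per-keypoint decoding step, with an assumed channel layout, stride and threshold; the actual OpenPifPaf fields and their spatio-temporal association decoder are considerably richer.

```python
# Sketch of decoding a composite field (confidence + offset per cell) into
# keypoints; simplified, not the OpenPifPaf decoder.
import torch


def decode_composite_field(field, stride=8, threshold=0.3):
    """field: (3, H, W) with channels (confidence, dx, dy), offsets in pixels.
    Returns a list of (x, y, score) keypoint candidates in image coordinates.
    """
    conf, dx, dy = field
    ys, xs = torch.nonzero(conf > threshold, as_tuple=True)
    keypoints = []
    for y, x in zip(ys.tolist(), xs.tolist()):
        keypoints.append((
            x * stride + dx[y, x].item(),   # cell origin + regressed offset
            y * stride + dy[y, x].item(),
            conf[y, x].item(),
        ))
    return keypoints
```

Grouping keypoints into poses or tracks is then handled by a second, association field, which is the part the single-stage spatio-temporal formulation refers to.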
arXiv Detail & Related papers (2021-03-03T14:44:14Z)
- RetinaTrack: Online Single Stage Joint Detection and Tracking [22.351109024452462]
We focus on the tracking-by-detection paradigm for autonomous driving where both tasks are mission critical.
We propose a conceptually simple and efficient joint model of detection and tracking, called RetinaTrack, which modifies the popular single stage RetinaNet approach.
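In such a tracking-by-detection setting, a single-stage detector can additionally output a per-detection embedding, and tracks are formed by matching embeddings across frames. The greedy cosine-similarity matcher below only illustrates that association step under assumed shapes; it is not RetinaTrack's code, and the detector-side modifications to RetinaNet are not shown.

```python
# Greedy embedding association across two frames (illustrative only; not the
# RetinaTrack implementation).
import torch
import torch.nn.functional as F


def associate(track_embs, det_embs, min_sim=0.5):
    """track_embs: (M, D) embeddings of existing tracks.
    det_embs: (N, D) embeddings of current-frame detections.
    Returns a list of (track_idx, det_idx) matches, formed greedily by
    cosine similarity above `min_sim`.
    """
    sim = F.normalize(track_embs, dim=1) @ F.normalize(det_embs, dim=1).T  # (M, N)
    matches, used_t, used_d = [], set(), set()
    # Visit candidate pairs from most to least similar.
    for idx in torch.argsort(sim.flatten(), descending=True).tolist():
        t, d = divmod(idx, sim.shape[1])
        if sim[t, d] < min_sim:
            break
        if t in used_t or d in used_d:
            continue
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    return matches
```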
arXiv Detail & Related papers (2020-03-30T23:46:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.