Long-Lived Accurate Keypoints in Event Streams
- URL: http://arxiv.org/abs/2209.10385v1
- Date: Wed, 21 Sep 2022 14:25:31 GMT
- Title: Long-Lived Accurate Keypoints in Event Streams
- Authors: Philippe Chiberre, Etienne Perot, Amos Sironi and Vincent Lepetit
- Abstract summary: We present a novel end-to-end approach to keypoint detection and tracking in an event stream.
We show it results in keypoint tracks that are three times longer and nearly twice as accurate as the best previous state-of-the-art methods.
- Score: 28.892653505044425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel end-to-end approach to keypoint detection and tracking in
an event stream that provides better precision and much longer keypoint tracks
than previous methods. This is made possible by two contributions working
together.
First, we propose a simple procedure to generate stable keypoint labels,
which we use to train a recurrent architecture. This training data results in
detections that are very consistent over time.
Moreover, we observe that previous methods for keypoint detection work on a
representation (such as the time surface) that integrates events over a period
of time. Since this integration is required, we claim it is better to predict
the keypoints' trajectories for the time period rather than single locations,
as done in previous approaches. We predict these trajectories in the form of a
series of heatmaps for the integration time period. This improves the keypoint
localization.
Our architecture can also be kept very simple, which results in very fast
inference times. We demonstrate our approach on the HVGA ATIS Corner dataset as
well as "The Event-Camera Dataset and Simulator" dataset, and show it results
in keypoint tracks that are three times longer and nearly twice as accurate as
the best previous state-of-the-art methods. We believe our approach can be
generalized to other event-based camera problems, and we release our source
code to encourage other authors to explore it.
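The two ingredients the abstract describes, integrating events into a representation such as a time surface and decoding a keypoint trajectory from a series of per-time-bin heatmaps, can be sketched roughly as below. This is a minimal illustration, not the authors' released code; the function names, the exponential-decay time surface variant, and the plain per-heatmap argmax decoding are all assumptions.

```python
import numpy as np

def time_surface(events, height, width, tau=50e3):
    """Integrate an event stream into a time surface: each pixel holds an
    exponentially decayed timestamp of its most recent event.
    `events` is a list of (x, y, t, polarity) tuples, t in microseconds."""
    last_t = np.full((height, width), -np.inf)
    for x, y, t, _ in events:
        last_t[y, x] = t
    t_ref = max(t for _, _, t, _ in events)  # decay relative to newest event
    return np.exp((last_t - t_ref) / tau)    # in (0, 1]; 0 where no event fell

def trajectory_from_heatmaps(heatmaps):
    """Decode one keypoint trajectory from a series of heatmaps, one per
    time bin of the integration period, by taking each heatmap's argmax."""
    traj = []
    for h in heatmaps:
        y, x = np.unravel_index(np.argmax(h), h.shape)
        traj.append((x, y))
    return traj
```

In the paper's setting the heatmap series is predicted by the recurrent network for the whole integration window, so the decoded positions form a short trajectory rather than a single location per window.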
Related papers
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z)
- TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement [64.11385310305612]
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence.
Our approach employs two stages: (1) a matching stage, which independently locates a suitable candidate point match for the query point on every other frame, and (2) a refinement stage, which updates both the trajectory and query features based on local correlations.
The resulting model surpasses all baseline methods by a significant margin on the TAP-Vid benchmark, as demonstrated by an approximate 20% absolute average Jaccard (AJ) improvement on DAVIS.
arXiv Detail & Related papers (2023-06-14T17:07:51Z)
- KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration [28.96448680048584]
The KeyPoint Positioning System (KeyPosS) is the first framework to deduce exact landmark coordinates by triangulating distances between points of interest and anchor points predicted by a fully convolutional network.
Experiments on four datasets demonstrate state-of-the-art performance, with KeyPosS outperforming existing methods in low-resolution settings despite minimal computational overhead.
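The true-range multilateration step KeyPosS is named after can be illustrated generically: given known anchor points and a distance from the target to each, subtracting the first range equation from the others linearizes the system, which is then solved by least squares. This is a textbook sketch under assumed 2-D coordinates, not the paper's actual decoding pipeline.

```python
import numpy as np

def multilaterate(anchors, dists):
    """Recover a 2-D point from its distances to known anchors
    (true-range multilateration). From ||p - a_i||^2 = d_i^2,
    subtracting the i=0 equation gives the linear system
    2 (a_i - a_0) . p = ||a_i||^2 - ||a_0||^2 - d_i^2 + d_0^2."""
    anchors = np.asarray(anchors, dtype=float)
    dists = np.asarray(dists, dtype=float)
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)
         - dists[1:] ** 2 + d0 ** 2)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p
```

With at least three non-collinear anchors the 2-D solution is unique; with noisy network-predicted distances the least-squares formulation degrades gracefully.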
arXiv Detail & Related papers (2023-05-25T19:30:21Z)
- Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
To combat the challenges posed by irregular point arrangements, we propose first-layer point embedding to aggregate local information.
Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
arXiv Detail & Related papers (2022-03-28T05:35:16Z)
- Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation [79.78017059539526]
We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework.
In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing.
Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
arXiv Detail & Related papers (2021-11-16T15:36:44Z)
- Accurate Grid Keypoint Learning for Efficient Video Prediction [87.71109421608232]
Keypoint-based video prediction methods can consume substantial computing resources in training and deployment.
In this paper, we design a new grid keypoint learning framework, aiming at a robust and explainable intermediate keypoint representation for long-term efficient video prediction.
Our method outperforms state-of-the-art video prediction methods while saving more than 98% of computing resources.
arXiv Detail & Related papers (2021-07-28T05:04:30Z)
- OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association [90.39247595214998]
Image-based perception tasks can be formulated as detecting and associating semantic keypoints, e.g., for human body pose estimation and tracking.
We present a general framework that jointly detects and forms spatio-temporal keypoint associations in a single stage.
We also show that our method generalizes to any class of keypoints such as car and animal parts to provide a holistic perception framework.
arXiv Detail & Related papers (2021-03-03T14:44:14Z)
- RetinaTrack: Online Single Stage Joint Detection and Tracking [22.351109024452462]
We focus on the tracking-by-detection paradigm for autonomous driving where both tasks are mission critical.
We propose a conceptually simple and efficient joint model of detection and tracking, called RetinaTrack, which modifies the popular single stage RetinaNet approach.
arXiv Detail & Related papers (2020-03-30T23:46:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.