OpenPifPaf: Composite Fields for Semantic Keypoint Detection and
Spatio-Temporal Association
- URL: http://arxiv.org/abs/2103.02440v1
- Date: Wed, 3 Mar 2021 14:44:14 GMT
- Title: OpenPifPaf: Composite Fields for Semantic Keypoint Detection and
Spatio-Temporal Association
- Authors: Sven Kreiss, Lorenzo Bertoni, Alexandre Alahi
- Abstract summary: Image-based perception tasks can be formulated as detecting, associating and tracking semantic keypoints, e.g. human body pose estimation and tracking.
We present a general framework that jointly detects and forms spatio-temporal keypoint associations in a single stage.
We also show that our method generalizes to any class of keypoints such as car and animal parts to provide a holistic perception framework.
- Score: 90.39247595214998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many image-based perception tasks can be formulated as detecting, associating
and tracking semantic keypoints, e.g., human body pose estimation and tracking.
In this work, we present a general framework that jointly detects and forms
spatio-temporal keypoint associations in a single stage, making this the first
real-time pose detection and tracking algorithm. We present a generic neural
network architecture that uses Composite Fields to detect and construct a
spatio-temporal pose which is a single, connected graph whose nodes are the
semantic keypoints (e.g., a person's body joints) in multiple frames. For the
temporal associations, we introduce the Temporal Composite Association Field
(TCAF) which requires an extended network architecture and training method
beyond previous Composite Fields. Our experiments show competitive accuracy
while being an order of magnitude faster on multiple publicly available
datasets such as COCO, CrowdPose and the PoseTrack 2017 and 2018 datasets. We
also show that our method generalizes to any class of semantic keypoints such
as car and animal parts to provide a holistic perception framework that is well
suited for urban mobility such as self-driving cars and delivery robots.
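The abstract describes a spatio-temporal pose as a single connected graph whose nodes are semantic keypoints in multiple frames. As a rough illustration of that idea, the Python sketch below models such a graph. It is not OpenPifPaf's implementation or API: the names `Keypoint`, `SpatioTemporalPose`, `build_example_pose`, the joint labels and the coordinates are all hypothetical, chosen only to show how within-frame edges and cross-frame edges (the role the abstract assigns to TCAF) can join into one graph.

```python
from collections import deque
from dataclasses import dataclass, field


# Hypothetical illustration only; not OpenPifPaf's data model or API.
@dataclass(frozen=True)
class Keypoint:
    frame: int   # index of the video frame the keypoint was detected in
    name: str    # semantic label, e.g. "left_shoulder"
    x: float
    y: float


@dataclass
class SpatioTemporalPose:
    # Undirected adjacency map: keypoint -> set of neighbouring keypoints.
    edges: dict = field(default_factory=dict)

    def add_keypoint(self, kp):
        self.edges.setdefault(kp, set())

    def connect(self, a, b):
        """Add an undirected association between two keypoints."""
        self.add_keypoint(a)
        self.add_keypoint(b)
        self.edges[a].add(b)
        self.edges[b].add(a)

    def temporal_edges(self):
        """Associations whose endpoints lie in different frames."""
        return {
            frozenset((a, b))
            for a, neighbours in self.edges.items()
            for b in neighbours
            if a.frame != b.frame
        }

    def is_connected(self):
        """Breadth-first search: True if all keypoints form one graph."""
        if not self.edges:
            return True
        start = next(iter(self.edges))
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in self.edges[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return len(seen) == len(self.edges)


def build_example_pose():
    """Two joints in each of two frames, linked across time by one edge."""
    shoulder_0 = Keypoint(0, "left_shoulder", 10.0, 20.0)
    elbow_0 = Keypoint(0, "left_elbow", 12.0, 30.0)
    shoulder_1 = Keypoint(1, "left_shoulder", 11.0, 20.5)
    elbow_1 = Keypoint(1, "left_elbow", 13.0, 30.5)

    pose = SpatioTemporalPose()
    pose.connect(shoulder_0, elbow_0)     # spatial association, frame 0
    pose.connect(shoulder_1, elbow_1)     # spatial association, frame 1
    pose.connect(shoulder_0, shoulder_1)  # temporal association across frames
    return pose
```

In this sketch, removing the single cross-frame edge would leave two separate per-frame poses, which is why the temporal association is what makes the pose one tracked instance rather than two independent detections.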
Related papers
- Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network [2.223052975765005]
We propose a novel Pyramid Graph Convolutional Network (PGCN) to automatically recognize human-object interaction.
The system represents the 2D or 3D spatial relation of human and objects from the detection results in video data as a graph.
We evaluate our model on two challenging datasets in the field of human-object interaction recognition.
arXiv Detail & Related papers (2024-10-10T13:39:17Z)
- GEARS: Local Geometry-aware Hand-object Interaction Synthesis [38.75942505771009]
We introduce a novel joint-centered sensor designed to reason about local object geometry near potential interaction regions.
As an important step towards mitigating the learning complexity, we transform the points from global frame to template hand frame and use a shared module to process sensor features of each individual joint.
This is followed by a perceptual-temporal transformer network aimed at capturing correlation among the joints in different dimensions.
arXiv Detail & Related papers (2024-04-02T09:18:52Z)
- A Spatio-Temporal Multilayer Perceptron for Gesture Recognition [70.34489104710366]
We propose a multilayer state-weighted perceptron for gesture recognition in the context of autonomous vehicles.
An evaluation on the TCG and Drive&Act datasets is provided to showcase the promising performance of our approach.
We deploy our model to our autonomous vehicle to show its real-time capability and stable execution.
arXiv Detail & Related papers (2022-04-25T08:42:47Z)
- HighlightMe: Detecting Highlights from Human-Centric Videos [52.84233165201391]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos.
We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions.
We observe a 4-12% improvement in the mean average precision of matching the human-annotated highlights over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-05T01:18:15Z)
- Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping.
In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN).
The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos [13.25502885135043]
Analyzing the interactions between humans and objects from a video includes identification of relationships between humans and the objects present in the video.
We present a hierarchical approach, LIGHTEN, to learn visual features to effectively capture truth at multiple granularities in a video.
We achieve state-of-the-art results in human-object interaction detection (88.9% and 92.6%) and anticipation tasks of CAD-120 and competitive results on image based HOI detection in V-COCO.
arXiv Detail & Related papers (2020-12-17T05:44:07Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)