Human Pose-based Estimation, Tracking and Action Recognition with Deep
Learning: A Survey
- URL: http://arxiv.org/abs/2310.13039v1
- Date: Thu, 19 Oct 2023 17:59:04 GMT
- Title: Human Pose-based Estimation, Tracking and Action Recognition with Deep
Learning: A Survey
- Authors: Lijuan Zhou and Xiang Meng and Zhihuan Liu and Mengqi Wu and Zhimin
Gao and Pichao Wang
- Abstract summary: This paper presents a survey of pose-based applications utilizing deep learning, encompassing pose estimation, pose tracking, and action recognition.
Pose estimation involves the determination of human joint positions from images or image sequences.
Pose tracking is an emerging research direction aimed at generating consistent human pose trajectories over time.
Action recognition targets the identification of action types using pose estimation or tracking data.
- Score: 15.920237822185301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human pose analysis has garnered significant attention within both the
research community and practical applications, owing to its expanding array of
uses, including gaming, video surveillance, sports performance analysis, and
human-computer interactions, among others. The advent of deep learning has
significantly improved the accuracy of pose capture, making pose-based
applications increasingly practical. This paper presents a comprehensive survey
of pose-based applications utilizing deep learning, encompassing pose
estimation, pose tracking, and action recognition. Pose estimation involves the
determination of human joint positions from images or image sequences. Pose
tracking is an emerging research direction aimed at generating consistent human
pose trajectories over time. Action recognition, on the other hand, targets the
identification of action types using pose estimation or tracking data. These
three tasks are intricately interconnected, with the latter often reliant on
the former. In this survey, we comprehensively review related works, spanning
from single-person pose estimation to multi-person pose estimation, from 2D
pose estimation to 3D pose estimation, from single image to video, from mining
temporal context gradually to pose tracking, and lastly from tracking to
pose-based action recognition. As a survey centered on the application of deep
learning to pose analysis, we explicitly discuss both the strengths and
limitations of existing techniques. Notably, we emphasize methodologies for
integrating these three tasks into a unified framework within video sequences.
Additionally, we explore the challenges involved and outline potential
directions for future research.
Related papers
- Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
- Improving Multi-Person Pose Tracking with A Confidence Network [37.84514614455588]
We develop a novel keypoint confidence network and a tracking pipeline to improve human detection and pose estimation.
Specifically, the keypoint confidence network is designed to determine whether each keypoint is occluded.
In the tracking pipeline, we propose the Bbox-revision module to reduce missing detection and the ID-retrieve module to correct lost trajectories.
arXiv Detail & Related papers (2023-10-29T06:36:27Z)
- Understanding Pose and Appearance Disentanglement in 3D Human Pose Estimation [72.50214227616728]
Several methods have been proposed to learn image representations in a self-supervised fashion so as to disentangle appearance information from pose information.
We study disentanglement from the perspective of the self-supervised network, via diverse image synthesis experiments.
We design an adversarial strategy focusing on generating natural appearance changes of the subject, and against which we could expect a disentangled network to be robust.
arXiv Detail & Related papers (2023-09-20T22:22:21Z)
- A survey of top-down approaches for human pose estimation [0.0]
State-of-the-art methods implemented with Deep Learning have brought remarkable results in the field of human pose estimation.
This paper aims to provide newcomers with an extensive review of deep learning-based methods for recognizing the pose of people in 2D images.
arXiv Detail & Related papers (2022-02-05T23:27:46Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [69.44384540002358]
We provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem.
We categorize the mainstream and milestone approaches since the year 2014 under unified frameworks.
We also summarize the pose representation styles, benchmarks, evaluation metrics, and the quantitative performance of popular approaches.
arXiv Detail & Related papers (2021-04-23T11:07:07Z)
- Deep Learning-Based Human Pose Estimation: A Survey [66.01917727294163]
Human pose estimation has drawn increasing attention during the past decade.
It has been utilized in a wide range of applications including human-computer interaction, motion analysis, augmented reality, and virtual reality.
Recent deep learning-based solutions have achieved high performance in human pose estimation.
arXiv Detail & Related papers (2020-12-24T18:49:06Z)
- View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose [36.384824115033304]
We propose an approach to learning a compact view-invariant embedding space from 2D body joint keypoints, without explicitly predicting 3D poses.
Experimental results show that our embedding model achieves higher accuracy when retrieving similar poses across different camera views.
arXiv Detail & Related papers (2020-10-23T17:58:35Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)