SimpleDepthPose: Fast and Reliable Human Pose Estimation with RGBD-Images
- URL: http://arxiv.org/abs/2501.18478v1
- Date: Thu, 30 Jan 2025 16:51:40 GMT
- Title: SimpleDepthPose: Fast and Reliable Human Pose Estimation with RGBD-Images
- Authors: Daniel Bermuth, Alexander Poeppel, Wolfgang Reif
- Abstract summary: This paper introduces a novel algorithm that excels in multi-view, multi-person pose estimation by incorporating depth information.
An extensive evaluation demonstrates that the proposed algorithm not only generalizes well to unseen datasets and shows fast runtime performance, but is also adaptable to different keypoints.
- Score: 45.085830389820956
- Abstract: In the rapidly advancing domain of computer vision, accurately estimating the poses of multiple individuals from various viewpoints remains a significant challenge, especially when reliability is a key requirement. This paper introduces a novel algorithm that excels in multi-view, multi-person pose estimation by incorporating depth information. An extensive evaluation demonstrates that the proposed algorithm not only generalizes well to unseen datasets and shows fast runtime performance, but is also adaptable to different keypoints. To support further research, all of the work is publicly accessible.
Related papers
- VoxelKeypointFusion: Generalizable Multi-View Multi-Person Pose Estimation [45.085830389820956]
This work presents an evaluation of the generalization capabilities of multi-view multi-person pose estimators to unseen datasets.
It also studies the improvements by additionally using depth information.
Since the new approach can not only generalize well to unseen datasets, but also to different keypoints, the first multi-view multi-person whole-body estimator is presented.
arXiv Detail & Related papers (2024-10-24T13:28:40Z)
- You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception [37.667147915777534]
Human-centric perception is a long-standing problem for computer vision.
This paper introduces a unified and versatile framework (HQNet) for single-stage multi-person multi-task human-centric perception (HCP).
Human Query captures intricate instance-level features for individual persons and disentangles complex multi-person scenarios.
arXiv Detail & Related papers (2023-12-09T10:36:43Z)
- A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in Deep Semantic Segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z)
- Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet [24.852728097115744]
Multi-person pose understanding from RGB involves three complex tasks: pose estimation, tracking and motion forecasting.
Most existing works either focus on a single task or employ multi-stage approaches to solving multiple tasks separately.
We propose Snipper, a unified framework to perform multi-person 3D pose estimation, tracking, and motion forecasting simultaneously in a single stage.
arXiv Detail & Related papers (2022-07-09T18:42:14Z)
- Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry [25.003116148843525]
We propose MaGNet, a framework for fusing single-view depth probability with multi-view geometry.
MaGNet achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI.
arXiv Detail & Related papers (2021-12-15T14:56:53Z)
- Multi-Domain Adversarial Feature Generalization for Person Re-Identification [52.835955258959785]
We propose a multi-dataset feature generalization network (MMFA-AAE).
It is capable of learning a universal domain-invariant feature representation from multiple labeled datasets and generalizing it to 'unseen' camera systems.
It also surpasses many state-of-the-art supervised methods and unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2020-11-25T08:03:15Z)
- Generalized Iris Presentation Attack Detection Algorithm under Cross-Database Settings [63.90855798947425]
Presentation attacks pose major challenges to most of the biometric modalities.
We propose a generalized deep learning-based presentation attack detection network, MVANet.
It is inspired by the simplicity and success of hybrid algorithms, or the fusion of multiple detection networks.
arXiv Detail & Related papers (2020-10-25T22:42:27Z)
- Deep Learning for Person Re-identification: A Survey and Outlook [233.36948173686602]
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras.
By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings.
arXiv Detail & Related papers (2020-01-13T12:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.