VoxelKeypointFusion: Generalizable Multi-View Multi-Person Pose Estimation
- URL: http://arxiv.org/abs/2410.18723v2
- Date: Wed, 13 Nov 2024 12:02:29 GMT
- Title: VoxelKeypointFusion: Generalizable Multi-View Multi-Person Pose Estimation
- Authors: Daniel Bermuth, Alexander Poeppel, Wolfgang Reif
- Abstract summary: This work presents an evaluation of the generalization capabilities of multi-view multi-person pose estimators to unseen datasets.
It also studies the improvements by additionally using depth information.
Since the new approach can not only generalize well to unseen datasets, but also to different keypoints, the first multi-view multi-person whole-body estimator is presented.
- Score: 45.085830389820956
- Abstract: In the rapidly evolving field of computer vision, the task of accurately estimating the poses of multiple individuals from various viewpoints presents a formidable challenge, especially if the estimations should be reliable as well. This work presents an extensive evaluation of the generalization capabilities of multi-view multi-person pose estimators to unseen datasets and presents a new algorithm with strong performance in this task. It also studies the improvements by additionally using depth information. Since the new approach can not only generalize well to unseen datasets, but also to different keypoints, the first multi-view multi-person whole-body estimator is presented. To support further research on those topics, all of the work is publicly accessible.
Related papers
- Scaling Up Personalized Image Aesthetic Assessment via Task Vector Customization [37.66059382315255]
We present a unique approach that leverages readily available databases for general image aesthetic assessment and image quality assessment.
By determining optimal combinations of task vectors, known to represent specific traits of each database, we successfully create personalized models for individuals.
arXiv Detail & Related papers (2024-07-09T18:42:41Z)
- You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception [37.667147915777534]
Human-centric perception is a long-standing problem for computer vision.
This paper introduces a unified and versatile framework (HQNet) for single-stage multi-person multi-task human-centric perception (HCP).
Human Query captures intricate instance-level features for individual persons and disentangles complex multi-person scenarios.
arXiv Detail & Related papers (2023-12-09T10:36:43Z)
- HaMuCo: Hand Pose Estimation via Multiview Collaborative Self-Supervised Learning [19.432034725468217]
HaMuCo is a self-supervised learning framework that learns a single-view hand pose estimator from multi-view pseudo 2D labels.
We introduce a cross-view interaction network that distills the single-view estimator by utilizing the cross-view correlated features.
Our method can achieve state-of-the-art performance on multi-view self-supervised hand pose estimation.
arXiv Detail & Related papers (2023-02-02T10:13:04Z)
- Two-level Data Augmentation for Calibrated Multi-view Detection [51.5746691103591]
We introduce a new multi-view data augmentation pipeline that preserves alignment among views.
We also propose a second level of augmentation applied directly at the scene level.
When combined with our simple multi-view detection model, our two-level augmentation pipeline outperforms all existing baselines.
arXiv Detail & Related papers (2022-10-19T17:55:13Z)
- Multi-View representation learning in Multi-Task Scene [4.509968166110557]
We propose a novel semi-supervised algorithm, termed Multi-Task Multi-View learning based on Common and Special Features (MTMVCSF).
An anti-noise multi-task multi-view algorithm called AN-MTMVCSF is proposed, which adapts well to noisy labels.
The effectiveness of these algorithms is proved by a series of well-designed experiments on both real world and synthetic data.
arXiv Detail & Related papers (2022-01-15T11:26:28Z)
- Uncertainty-Aware Multi-View Representation Learning [53.06828186507994]
We devise a novel unsupervised multi-view learning approach, termed Dynamic Uncertainty-Aware Networks (DUA-Nets).
Guided by the uncertainty of data estimated from the generation perspective, intrinsic information from multiple views is integrated to obtain noise-free representations.
Our model achieves superior performance in extensive experiments and shows robustness to noisy data.
arXiv Detail & Related papers (2022-01-15T07:16:20Z)
- The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements [14.707930573950787]
We present MuSe-CaR, a first-of-its-kind multimodal dataset.
The data is publicly available as it recently served as the testing bed for the 1st Multimodal Sentiment Analysis Challenge.
arXiv Detail & Related papers (2021-01-15T10:40:37Z)
- Multi-Domain Adversarial Feature Generalization for Person Re-Identification [52.835955258959785]
We propose a multi-dataset feature generalization network (MMFA-AAE).
It is capable of learning a universal domain-invariant feature representation from multiple labeled datasets and generalizing it to unseen camera systems.
It also surpasses many state-of-the-art supervised methods and unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2020-11-25T08:03:15Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
- Deep Learning for Person Re-identification: A Survey and Outlook [233.36948173686602]
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras.
By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings.
arXiv Detail & Related papers (2020-01-13T12:49:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.