AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in
the Wild
- URL: http://arxiv.org/abs/2010.13302v1
- Date: Mon, 26 Oct 2020 03:19:46 GMT
- Title: AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in
the Wild
- Authors: Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, Wenjun Zeng
- Abstract summary: We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
- Score: 77.43884383743872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Occlusion is probably the biggest challenge for human pose estimation in the
wild. Typical solutions often rely on intrusive sensors such as IMUs to detect
occluded joints. To make the task truly unconstrained, we present AdaFuse, an
adaptive multiview fusion method, which can enhance the features in occluded
views by leveraging those in visible views. The core of AdaFuse is to determine
the point-point correspondence between two views which we solve effectively by
exploring the sparsity of the heatmap representation. We also learn an adaptive
fusion weight for each camera view to reflect its feature quality in order to
reduce the chance that good features are undesirably corrupted by "bad"
views. The fusion model is trained end-to-end with the pose estimation network,
and can be directly applied to new camera configurations without additional
adaptation. We extensively evaluate the approach on three public datasets
including Human3.6M, Total Capture and CMU Panoptic, outperforming the
state of the art on all of them. We also create a large-scale synthetic
dataset Occlusion-Person, which allows us to perform numerical evaluation on
the occluded joints, as it provides occlusion labels for every joint in the
images. The dataset and code are released at
https://github.com/zhezh/adafuse-3d-human-pose.
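The adaptive-weight idea described in the abstract can be sketched in a few lines: each camera view's heatmap is combined with a softmax-normalised quality weight, so an occluded (noisy) view contributes little to the fused estimate. This is a minimal illustration of weighted heatmap fusion, not the paper's actual network; the function name, weight values, and toy heatmaps below are hypothetical.

```python
import numpy as np

def adaptive_fuse(heatmaps, quality_scores):
    """Fuse per-view joint heatmaps with adaptive per-view weights.

    heatmaps: (V, H, W) array, one heatmap per camera view for one joint.
    quality_scores: (V,) raw scores reflecting each view's feature quality
    (in AdaFuse these are learned; here they are hand-picked for illustration).
    """
    # softmax over views -> fusion weights that sum to 1
    w = np.exp(quality_scores - quality_scores.max())
    w = w / w.sum()
    # weighted sum of heatmaps over the view axis
    return np.tensordot(w, heatmaps, axes=1)

# toy example: view 0 is clean, view 1 is "occluded" (pure noise)
rng = np.random.default_rng(0)
clean = np.zeros((8, 8)); clean[3, 4] = 1.0   # true joint at (3, 4)
noisy = rng.random((8, 8)) * 0.2
fused = adaptive_fuse(np.stack([clean, noisy]), np.array([2.0, -1.0]))
peak = np.unravel_index(fused.argmax(), fused.shape)  # -> (3, 4)
```

Because the down-weighted noisy view contributes at most a few percent of the fused response, the peak of the fused heatmap stays at the clean view's joint location.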
Related papers
- Enhancing 3D Human Pose Estimation Amidst Severe Occlusion with Dual Transformer Fusion [13.938406073551844]
This paper introduces a Dual Transformer Fusion (DTF) algorithm, a novel approach to holistic 3D pose estimation.
To enable precise 3D Human Pose Estimation, our approach leverages the innovative DTF architecture, which first generates a pair of intermediate views.
Our approach outperforms existing state-of-the-art methods on both datasets, yielding substantial improvements.
arXiv Detail & Related papers (2024-10-06T18:15:27Z)
- Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded Regions [14.217367037250296]
Occ$^2$Net is an image matching method that models occlusion relations using 3D occupancy and infers matching points in occluded regions.
We evaluate our method on both real-world and simulated datasets and demonstrate its superior performance over state-of-the-art methods on several metrics.
arXiv Detail & Related papers (2023-08-14T13:09:41Z)
- AdaptivePose++: A Powerful Single-Stage Network for Multi-Person Pose Regression [66.39539141222524]
We propose to represent the human parts as adaptive points and introduce a fine-grained body representation method.
With the proposed body representation, we deliver a compact single-stage multi-person pose regression network, termed AdaptivePose.
We employ AdaptivePose for both 2D and 3D multi-person pose estimation tasks to verify its effectiveness.
arXiv Detail & Related papers (2022-10-08T12:54:20Z)
- FusePose: IMU-Vision Sensor Fusion in Kinematic Space for Parametric Human Pose Estimation [12.821740951249552]
We propose a framework called FusePose built on a parametric human kinematic model.
We aggregate different information of IMU or vision data and introduce three distinctive sensor fusion approaches: NaiveFuse, KineFuse and AdaDeepFuse.
The performance of 3D human pose estimation is improved compared to the baseline result.
arXiv Detail & Related papers (2022-08-25T09:35:27Z)
- Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation [33.86986028882488]
Occlusion poses a great threat to monocular multi-person 3D human pose estimation due to large variability in terms of the shape, appearance, and position of occluders.
Existing methods try to handle occlusion with pose priors/constraints, data augmentation, or implicit reasoning.
We develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation.
arXiv Detail & Related papers (2022-07-29T22:12:50Z)
- Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision.
We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Occlusion-Invariant Rotation-Equivariant Semi-Supervised Depth Based Cross-View Gait Pose Estimation [40.50555832966361]
We propose a novel approach for cross-view generalization with an occlusion-invariant semi-supervised learning framework.
Our model was trained with real-world data from a single view and unlabelled synthetic data from multiple views.
It can generalize well on the real-world data from all the other unseen views.
arXiv Detail & Related papers (2021-09-03T09:39:05Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.