Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online
Adaptation
- URL: http://arxiv.org/abs/2111.04017v1
- Date: Sun, 7 Nov 2021 07:23:24 GMT
- Title: Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online
Adaptation
- Authors: Shanyan Guan, Jingwei Xu, Michelle Z. He, Yunbo Wang, Bingbing Ni,
Xiaokang Yang
- Abstract summary: We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos.
We tackle this problem through online adaptation, gradually correcting the model bias during testing.
We propose the Dynamic Bilevel Online Adaptation algorithm (DynaBOA)
- Score: 87.85851771425325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a new problem of adapting a human mesh reconstruction model to
out-of-domain streaming videos, where performance of existing SMPL-based models
are significantly affected by the distribution shift represented by different
camera parameters, bone lengths, backgrounds, and occlusions. We tackle this
problem through online adaptation, gradually correcting the model bias during
testing. There are two main challenges: First, the lack of 3D annotations
increases the training difficulty and results in 3D ambiguities. Second,
non-stationary data distribution makes it difficult to strike a balance between
fitting regular frames and hard samples with severe occlusions or dramatic
changes. To this end, we propose the Dynamic Bilevel Online Adaptation
algorithm (DynaBOA). It first introduces the temporal constraints to compensate
for the unavailable 3D annotations, and leverages a bilevel optimization
procedure to address the conflicts between multi-objectives. DynaBOA provides
additional 3D guidance by co-training with similar source examples retrieved
efficiently despite the distribution shift. Furthermore, it can adaptively
adjust the number of optimization steps on individual frames to fully fit hard
samples and avoid overfitting regular frames. DynaBOA achieves state-of-the-art
results on three out-of-domain human mesh reconstruction benchmarks.
Related papers
- ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks.
We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation.
Our purelytemporalal architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z) - SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical
Refinement and EM optimization [6.886220026399106]
We introduce Multi-View Stereo (SD-MVS) to tackle challenges in 3D reconstruction of textureless areas.
We are the first to adopt the Segment Anything Model (SAM) to distinguish semantic instances in scenes.
We propose a unique refinement strategy that combines spherical coordinates and gradient descent on normals and pixelwise search interval on depths.
arXiv Detail & Related papers (2024-01-12T05:25:57Z) - Progressive Multi-view Human Mesh Recovery with Self-Supervision [68.60019434498703]
Existing solutions typically suffer from poor generalization performance to new settings.
We propose a novel simulation-based training pipeline for multi-view human mesh recovery.
arXiv Detail & Related papers (2022-12-10T06:28:29Z) - Self-supervised Human Mesh Recovery with Cross-Representation Alignment [20.69546341109787]
Self-supervised human mesh recovery methods have poor generalizability due to limited availability and diversity of 3D-annotated benchmark datasets.
We propose cross-representation alignment utilizing the complementary information from the robust but sparse representation (2D keypoints)
This adaptive cross-representation alignment explicitly learns from the deviations and captures complementary information: richness from sparse representation and robustness from dense representation.
arXiv Detail & Related papers (2022-09-10T04:47:20Z) - Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose
Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z) - Distribution-Aware Single-Stage Models for Multi-Person 3D Pose
Estimation [29.430404703883084]
We present a novel Distribution-Aware Single-stage (DAS) model for tackling the challenging multi-person 3D pose estimation problem.
The proposed DAS model simultaneously localizes person positions and their corresponding body joints in the 3D camera space in a one-pass manner.
Comprehensive experiments on benchmarks CMU Panoptic and MuPoTS-3D demonstrate the superior efficiency of the proposed DAS model.
arXiv Detail & Related papers (2022-03-15T07:30:27Z) - Multi-initialization Optimization Network for Accurate 3D Human Pose and
Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION)
In the first stage, we strategically select different coarse 3D reconstruction candidates which are compatible with the 2D keypoints of input sample.
In the second stage, we design a mesh refinement transformer (MRT) to respectively refine each coarse reconstruction result via a self-attention mechanism.
Finally, a Consistency Estimation Network (CEN) is proposed to find the best result from mutiple candidates by evaluating if the visual evidence in RGB image matches a given 3D reconstruction.
arXiv Detail & Related papers (2021-12-24T02:43:58Z) - Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction [94.25865526414717]
This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos.
We propose Bilevel Online Adaptation, which divides the optimization process of overall multi-objective into two steps of weight probe and weight update in a training.
We demonstrate that BOA leads to state-of-the-art results on two human mesh reconstruction benchmarks.
arXiv Detail & Related papers (2021-03-30T15:47:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.