Recovering 3D Human Mesh from Monocular Images: A Survey
- URL: http://arxiv.org/abs/2203.01923v6
- Date: Tue, 2 Jan 2024 15:38:20 GMT
- Title: Recovering 3D Human Mesh from Monocular Images: A Survey
- Authors: Yating Tian, Hongwen Zhang, Yebin Liu, Limin Wang
- Abstract summary: Estimating human pose and shape from monocular images is a long-standing problem in computer vision.
This survey focuses on the task of monocular 3D human mesh recovery.
- Score: 49.00136388529404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating human pose and shape from monocular images is a long-standing
problem in computer vision. Since the release of statistical body models, 3D
human mesh recovery has been drawing broader attention. With the same goal of
obtaining well-aligned and physically plausible mesh results, two paradigms
have been developed to overcome challenges in the 2D-to-3D lifting process: i)
an optimization-based paradigm, where different data terms and regularization
terms are exploited as optimization objectives; and ii) a regression-based
paradigm, where deep learning techniques are embraced to solve the problem in
an end-to-end fashion. Meanwhile, continuous efforts are devoted to improving
the quality of 3D mesh labels for a wide range of datasets. Though remarkable
progress has been achieved in the past decade, the task is still challenging
due to flexible body motions, diverse appearances, complex environments, and
insufficient in-the-wild annotations. To the best of our knowledge, this is the
first survey that focuses on the task of monocular 3D human mesh recovery. We
start with the introduction of body models and then elaborate recovery
frameworks and training objectives by providing in-depth analyses of their
strengths and weaknesses. We also summarize datasets, evaluation metrics, and
benchmark results. Open issues and future directions are discussed in the end,
hoping to motivate researchers and facilitate their research in this area. A
regularly updated project page can be found at
https://github.com/tinatiansjz/hmr-survey.
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs [15.017274891943162]
Temporal 3D human pose estimation from monocular videos is a challenging task in human-centered computer vision.
Inertial sensor has been introduced to provide complementary source of information.
It remains challenging to integrate heterogeneous sensor data for producing physically rational 3D human poses.
arXiv Detail & Related papers (2024-04-27T09:02:42Z) - Retrieval-Augmented Score Distillation for Text-to-3D Generation [30.57225047257049]
We introduce novel framework for retrieval-based quality enhancement in text-to-3D generation.
We conduct extensive experiments to demonstrate that ReDream exhibits superior quality with increased geometric consistency.
arXiv Detail & Related papers (2024-02-05T12:50:30Z) - Deep Generative Models on 3D Representations: A Survey [81.73385191402419]
Generative models aim to learn the distribution of observed data by generating new instances.
Recently, researchers started to shift focus from 2D to 3D space.
representing 3D data poses significantly greater challenges.
arXiv Detail & Related papers (2022-10-27T17:59:50Z) - Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep
Learning Perspective [69.44384540002358]
We provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem.
We categorize the mainstream and milestone approaches since the year 2014 under unified frameworks.
We also summarize the pose representation styles, benchmarks, evaluation metrics, and the quantitative performance of popular approaches.
arXiv Detail & Related papers (2021-04-23T11:07:07Z) - Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
But, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z) - Deep Learning-Based Human Pose Estimation: A Survey [66.01917727294163]
Human pose estimation has drawn increasing attention during the past decade.
It has been utilized in a wide range of applications including human-computer interaction, motion analysis, augmented reality, and virtual reality.
Recent deep learning-based solutions have achieved high performance in human pose estimation.
arXiv Detail & Related papers (2020-12-24T18:49:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.