ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization
- URL: http://arxiv.org/abs/2505.10250v2
- Date: Thu, 22 May 2025 01:53:53 GMT
- Title: ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization
- Authors: Wenhao Shen, Wanqi Yin, Xiaofeng Yang, Cheng Chen, Chaoyue Song, Zhongang Cai, Lei Yang, Hao Wang, Guosheng Lin
- Abstract summary: We propose ADHMR, a framework that Aligns a Diffusion-based HMR model in a preference optimization manner. First, we train a human mesh prediction assessment model, HMR-Scorer, capable of evaluating predictions even for in-the-wild images without 3D annotations. We then use HMR-Scorer to create a preference dataset, where each input image has a pair of winner and loser mesh predictions.
- Score: 51.904899019761594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human mesh recovery (HMR) from a single image is inherently ill-posed due to depth ambiguity and occlusions. Probabilistic methods have tried to solve this by generating numerous plausible 3D human mesh predictions, but they often exhibit misalignment with 2D image observations and weak robustness to in-the-wild images. To address these issues, we propose ADHMR, a framework that Aligns a Diffusion-based HMR model in a preference optimization manner. First, we train a human mesh prediction assessment model, HMR-Scorer, capable of evaluating predictions even for in-the-wild images without 3D annotations. We then use HMR-Scorer to create a preference dataset, where each input image has a pair of winner and loser mesh predictions. This dataset is used to finetune the base model using direct preference optimization. Moreover, HMR-Scorer also helps improve existing HMR models by data cleaning, even with fewer training samples. Extensive experiments show that ADHMR outperforms current state-of-the-art methods. Code is available at: https://github.com/shenwenhao01/ADHMR.
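The pipeline described in the abstract (score candidate predictions, keep a winner and a loser per image, then finetune with direct preference optimization) can be illustrated with a minimal sketch. This is not the authors' implementation: `make_preference_pair` and `dpo_loss` are hypothetical helper names, the scores stand in for HMR-Scorer outputs, and the loss is the generic DPO objective on scalar log-likelihoods with an assumed `beta`. A diffusion model would need its own likelihood surrogate, which the abstract does not specify.

```python
import math


def make_preference_pair(candidates, scores):
    """Return (winner, loser): the highest- and lowest-scored candidates.

    `scores` stands in for the output of a learned scorer (assumption).
    """
    if len(candidates) != len(scores) or len(candidates) < 2:
        raise ValueError("need at least two scored candidates")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return candidates[order[-1]], candidates[order[0]]


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one pair: -log(sigmoid(beta * margin)).

    logp_*     : log-likelihoods under the model being finetuned
    ref_logp_* : log-likelihoods under the frozen base (reference) model
    beta       : assumed strength of the pull toward the reference model
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    x = beta * margin
    # Numerically stable form of softplus(-x), which equals -log(sigmoid(x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Three hypothetical mesh predictions for one image, with scorer outputs.
winner, loser = make_preference_pair(["mesh_a", "mesh_b", "mesh_c"], [0.2, 0.9, 0.5])
loss = dpo_loss(logp_w=-1.0, logp_l=-2.0, ref_logp_w=-1.5, ref_logp_l=-1.5)
```

The loss falls as the finetuned model assigns relatively more likelihood to the winner than the reference model does, and rises in the opposite case, so the scorer's ranking is what steers the finetuning.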
Related papers
- Reconstructing Humans with a Biomechanically Accurate Skeleton [55.06027148976482]
We introduce a method for reconstructing 3D humans from a single image using a biomechanically accurate skeleton model. Compared to state-of-the-art methods for 3D human mesh recovery, our model achieves competitive performance on standard benchmarks.
arXiv Detail & Related papers (2025-03-27T17:56:24Z)
- GenHMR: Generative Human Mesh Recovery [14.708444067294325]
GenHMR is a novel generative framework that reformulates monocular HMR as an image-conditioned generative task. Experiments on benchmark datasets demonstrate that GenHMR significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-12-19T01:45:58Z)
- MEGA: Masked Generative Autoencoder for Human Mesh Recovery [33.26995842920877]
Human Mesh Recovery from a single RGB image is a highly ambiguous problem. Most HMR methods overlook this issue and make a single prediction without accounting for this ambiguity. This work proposes a new approach based on masked generative modeling.
arXiv Detail & Related papers (2024-05-29T07:40:31Z)
- Score-Guided Diffusion for 3D Human Recovery [10.562998991986102]
We present Score-Guided Human Mesh Recovery (ScoreHMR), an approach for solving inverse problems for 3D human pose and shape reconstruction.
ScoreHMR mimics model fitting approaches, but alignment with the image observation is achieved through score guidance in the latent space of a diffusion model.
We evaluate our approach on three settings/applications: (i) single-frame model fitting; (ii) reconstruction from multiple uncalibrated views; (iii) reconstructing humans in video sequences.
arXiv Detail & Related papers (2024-03-14T17:56:14Z)
- Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data [54.09959775518994]
We provide a framework for solving inverse problems with diffusion models learned from linearly corrupted data. We train diffusion models for MRI with access only to subsampled multi-coil measurements at acceleration factors R = 2, 4, 6, 8. For MRI reconstruction in high acceleration regimes, we observe that A-DPS models trained on subsampled data are better suited to solving inverse problems than models trained on fully sampled data.
arXiv Detail & Related papers (2024-03-13T17:28:20Z)
- Generative Approach for Probabilistic Human Mesh Recovery using Diffusion Models [33.2565018922113]
This work focuses on the problem of reconstructing a 3D human body mesh from a given 2D image.
We propose a generative framework called "Diffusion-based Human Mesh Recovery (Diff-HMR)".
arXiv Detail & Related papers (2023-08-05T22:23:04Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- Probabilistic 3D surface reconstruction from sparse MRI information [58.14653650521129]
We present a novel probabilistic deep learning approach for concurrent 3D surface reconstruction from sparse 2D MR image data and aleatoric uncertainty prediction.
Our method is capable of reconstructing large surface meshes from three quasi-orthogonal MR imaging slices from limited training sets.
arXiv Detail & Related papers (2020-10-05T14:18:52Z)
- I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image [79.040930290399]
We propose I2L-MeshNet, an image-to-lixel (line+pixel) prediction network.
The proposed I2L-MeshNet predicts the per-lixel likelihood on 1D heatmaps for each mesh coordinate instead of directly regressing the parameters.
Our lixel-based 1D heatmap preserves the spatial relationship in the input image and models the prediction uncertainty.
arXiv Detail & Related papers (2020-08-09T12:13:31Z)
- RAIN: A Simple Approach for Robust and Accurate Image Classification Networks [156.09526491791772]
It has been shown that the majority of existing adversarial defense methods achieve robustness at the cost of sacrificing prediction accuracy.
This paper proposes a novel preprocessing framework, which we term Robust and Accurate Image classificatioN (RAIN).
RAIN applies randomization over inputs to break the ties between the model forward prediction path and the backward gradient path, thus improving the model robustness.
We conduct extensive experiments on the STL10 and ImageNet datasets to verify the effectiveness of RAIN against various types of adversarial attacks.
arXiv Detail & Related papers (2020-04-24T02:03:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.