Related papers: Human Body Model Fitting by Learned Gradient Descent

URL: http://arxiv.org/abs/2008.08474v1
Date: Wed, 19 Aug 2020 14:26:47 GMT
Title: Human Body Model Fitting by Learned Gradient Descent
Authors: Jie Song, Xu Chen, Otmar Hilliges
Abstract summary: We propose a novel algorithm for the fitting of 3D human shape to images. We show that this algorithm is fast (avg. 120ms convergence), robust to initialization and dataset, and achieves state-of-the-art results on public evaluation datasets.
Score: 48.79414884222403
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a novel algorithm for the fitting of 3D human shape to images. Combining the accuracy and refinement capabilities of iterative gradient-based optimization techniques with the robustness of deep neural networks, we introduce a gradient descent algorithm that leverages a neural network to predict the parameter update rule for each iteration. This per-parameter and state-aware update guides the optimizer towards a good solution, typically converging in very few steps. During training, our approach only requires MoCap data of human poses, parametrized via SMPL. From this data the network learns a subspace of valid poses and shapes in which optimization is performed much more efficiently. The approach does not require any hard-to-acquire image-to-3D correspondences. At test time we only optimize the 2D joint re-projection error without the need for any further priors or regularization terms. We show empirically that this algorithm is fast (avg. 120ms convergence), robust to initialization and dataset, and achieves state-of-the-art results on public evaluation datasets including the challenging 3DPW in-the-wild benchmark (a 45% improvement over SMPLify, and also an improvement over approaches using image-to-3D correspondences).
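
The update loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-joint "body model", the scale-and-shift parameters, the numerical gradient, and the function names are all hypothetical, and `stand_in_update` is a fixed per-parameter step that stands in for the trained network that would predict the update from the current state and gradient.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical toy "body model": three 2D joints, scaled by s and shifted by (tx, ty).
TEMPLATE: List[Point] = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]


def project(params: Sequence[float]) -> List[Point]:
    """Map parameters (s, tx, ty) to 2D joint positions."""
    s, tx, ty = params
    return [(s * x + tx, s * y + ty) for x, y in TEMPLATE]


def reprojection_error(params: Sequence[float], target: Sequence[Point]) -> float:
    """Sum of squared 2D distances between projected and observed joints."""
    return sum(
        (px - qx) ** 2 + (py - qy) ** 2
        for (px, py), (qx, qy) in zip(project(params), target)
    )


def numerical_gradient(
    params: Sequence[float], target: Sequence[Point], eps: float = 1e-6
) -> List[float]:
    """Central-difference gradient of the re-projection error."""
    grad = []
    for i in range(len(params)):
        hi, lo = list(params), list(params)
        hi[i] += eps
        lo[i] -= eps
        diff = reprojection_error(hi, target) - reprojection_error(lo, target)
        grad.append(diff / (2 * eps))
    return grad


# An update rule maps (current parameters, gradient) to a parameter increment.
UpdateRule = Callable[[Sequence[float], Sequence[float]], List[float]]


def stand_in_update(params: Sequence[float], grad: Sequence[float]) -> List[float]:
    """Placeholder for a learned update: fixed per-parameter step sizes (assumed values)."""
    step_sizes = (0.05, 0.1, 0.1)
    return [-a * g for a, g in zip(step_sizes, grad)]


def fit(
    target: Sequence[Point],
    update_rule: UpdateRule,
    init: Sequence[float] = (1.0, 0.0, 0.0),
    steps: int = 200,
) -> List[float]:
    """Iteratively apply the update rule to reduce the 2D re-projection error."""
    params = list(init)
    for _ in range(steps):
        grad = numerical_gradient(params, target)
        delta = update_rule(params, grad)
        params = [p + d for p, d in zip(params, delta)]
    return params


# Usage: recover the parameters that generated a set of observed 2D joints.
observed = project((1.5, 0.3, -0.2))
estimate = fit(observed, stand_in_update)
```

The point of the sketch is the interface: the optimizer only sees the 2D re-projection error, and the update rule is a swappable function of the current parameters and gradient. With a fixed step rule, as here, many iterations are needed; the abstract's claim is that a learned rule reaches a good solution in very few.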

Related papers

Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline [64.42938561167402]
We propose an online 3D reconstruction method using 3D Gaussian-based SLAM, combined with a feed-forward recurrent prediction module. This approach replaces slow test-time optimization with fast network inference, significantly improving tracking speed. Our method achieves performance on par with the state-of-the-art SplaTAM, while reducing tracking time by more than 90%.
arXiv Detail & Related papers (2025-08-06T16:16:58Z)
A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose [44.13819148680788]
We develop a novel construct-and-optimize method for sparse view synthesis without camera poses. Specifically, we construct a solution by using monocular depth and projecting pixels back into the 3D world. We demonstrate results on the Tanks and Temples and Static Hikes datasets with as few as three widely-spaced views.
arXiv Detail & Related papers (2024-05-06T17:36:44Z)
Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation [65.91490997921859]
We propose an Uncertainty-Aware testing-time Optimization (UAO) framework for 3D human pose estimation. The framework keeps the prior information of the pre-trained model and alleviates the overfitting problem using the uncertainty of joints. Our approach outperforms the previous best result by a large margin of 5.5% on Human3.6M.
arXiv Detail & Related papers (2024-02-04T04:28:02Z)
Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans. Our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement compared to state-of-the-art. LVD is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement to the SOTA with a much simpler and faster method.
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training. We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
arXiv Detail & Related papers (2021-12-08T10:05:32Z)
Learning to Fit Morphable Models [12.469605679847085]
We build upon recent advances in learned optimization and propose an update rule inspired by the classic Levenberg-Marquardt algorithm. We show the effectiveness of the proposed neural optimizer on the problems of 3D body surface estimation from a head-mounted device and face fitting from 2D landmarks.
arXiv Detail & Related papers (2021-11-29T18:59:53Z)
Multi-scale Neural ODEs for 3D Medical Image Registration [7.715565365558909]
Image registration plays an important role in medical image analysis. Deep learning methods such as learn-to-map are much faster, but either an iterative or a coarse-to-fine approach is required to improve accuracy when handling large motions. In this work, we propose to learn a registration via a multi-scale neural ODE model.
arXiv Detail & Related papers (2021-06-16T00:26:53Z)
Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks [82.61182037130405]
Stochastic gradient descent (SGD) is the main approach for training deep networks. In this work, we compare Adam-based variants based on the difference between the present and the past gradients. We have tested an ensemble of networks and the fusion with ResNet50 trained with gradient descent.
arXiv Detail & Related papers (2021-03-26T18:55:08Z)
Displacement-Invariant Cost Computation for Efficient Stereo Matching [122.94051630000934]
Deep learning methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy. But their inference time is typically slow, on the order of seconds for a pair of 540p images. We propose a displacement-invariant cost module to compute the matching costs without needing a 4D feature volume.
arXiv Detail & Related papers (2020-12-01T23:58:16Z)
SSP-Net: Scalable Sequential Pyramid Networks for Real-Time 3D Human Pose Regression [27.85790535227085]
We propose a highly scalable convolutional neural network, end-to-end trainable, for real-time 3D human pose regression from still RGB images. Our network requires a single training procedure and is capable of producing its best predictions at 120 frames per second.
arXiv Detail & Related papers (2020-09-04T03:43:24Z)
Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people, given an input RGB image. We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end. Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.