Related papers: Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics

URL: http://arxiv.org/abs/2508.13562v1
Date: Tue, 19 Aug 2025 06:53:57 GMT
Title: Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics
Authors: Yuchen Yang, Linfeng Dong, Wei Wang, Zhihang Zhong, Xiao Sun
Abstract summary: Learnable SMPLify is a neural framework that replaces the iterative fitting process in SMPLify with a single-pass regression model. It achieves nearly 200x faster runtime compared to SMPLify, generalizes well to unseen 3DPW and RICH, and operates in a model-agnostic manner when used as a plug-in tool on LucidAction.
Score: 13.621560002904873
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In 3D human pose and shape estimation, SMPLify remains a robust baseline that solves inverse kinematics (IK) through iterative optimization. However, its high computational cost limits its practicality. Recent advances across domains have shown that replacing iterative optimization with data-driven neural networks can achieve significant runtime improvements without sacrificing accuracy. Motivated by this trend, we propose Learnable SMPLify, a neural framework that replaces the iterative fitting process in SMPLify with a single-pass regression model. The design of our framework targets two core challenges in neural IK: data construction and generalization. To enable effective training, we propose a temporal sampling strategy that constructs initialization-target pairs from sequential frames. To improve generalization across diverse motions and unseen poses, we propose a human-centric normalization scheme and residual learning to narrow the solution space. Learnable SMPLify supports both sequential inference and plug-in post-processing to refine existing image-based estimators. Extensive experiments demonstrate that our method establishes itself as a practical and simple baseline: it achieves nearly 200x faster runtime compared to SMPLify, generalizes well to unseen 3DPW and RICH, and operates in a model-agnostic manner when used as a plug-in tool on LucidAction. The code is available at https://github.com/Charrrrrlie/Learnable-SMPLify.
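The abstract names three ingredients: temporal sampling of initialization-target pairs, human-centric normalization, and residual learning. The Python sketch below shows one way these could fit together. It is an illustration under stated assumptions, not the authors' implementation: the function names, the root-joint centering with max-distance scaling, the random frame gap, and the `regressor` callable are all hypothetical.

```python
import numpy as np


def sample_temporal_pairs(num_frames, max_gap, rng):
    """Build (init, target) frame index pairs from one sequence.

    Assumption: the initialization is an earlier frame and the target is a
    later frame at most `max_gap` steps ahead.
    """
    pairs = []
    for t in range(num_frames - 1):
        gap = int(rng.integers(1, max_gap + 1))  # gap in [1, max_gap]
        target = min(t + gap, num_frames - 1)    # stay inside the sequence
        pairs.append((t, target))
    return pairs


def normalize_human_centric(joints, root_index=0):
    """Center joints on a root joint and divide by the largest joint distance.

    Assumption: this is one plausible body-centered normalization; the
    paper's exact scheme may differ.
    """
    centered = joints - joints[root_index]
    scale = np.linalg.norm(centered, axis=1).max()
    if scale == 0:
        scale = 1.0  # degenerate input: all joints coincide
    return centered / scale


def residual_update(init_pose, init_joints, target_joints, regressor):
    """Single forward pass: predict a pose residual and add it to the init.

    `regressor` is a hypothetical callable mapping a feature vector to an
    array shaped like `init_pose`.
    """
    features = np.concatenate([
        normalize_human_centric(init_joints).ravel(),
        normalize_human_centric(target_joints).ravel(),
    ])
    return init_pose + regressor(features)
```

In such a setup the regressor would be a neural network trained on the sampled pairs, and at inference the previous frame's estimate (or the output of an image-based estimator) could serve as the initialization.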

Related papers

Tail-Aware Post-Training Quantization for 3D Geometry Models [58.79500829118265]
Post-Training Quantization (PTQ) enables efficient inference without retraining. PTQ fails to transfer effectively to 3D models due to intricate feature distributions and prohibitive calibration overhead. We propose TAPTQ, a Tail-Aware Post-Training Quantization pipeline for 3D geometric learning.
arXiv Detail & Related papers (2026-02-02T07:21:15Z)
Adaptive 3D Reconstruction via Diffusion Priors and Forward Curvature-Matching Likelihood Updates [1.2425910171551517]
Reconstructing high-quality point clouds from images remains challenging in computer vision. Recent diffusion-based methods have attempted to address this by combining prior models with likelihood updates. We advance this line of approach by integrating our novel Forward Curvature-Matching (FCM) update method with diffusion sampling.
arXiv Detail & Related papers (2025-11-09T10:14:14Z)
EA4LLM: A Gradient-Free Approach to Large Language Model Optimization via Evolutionary Algorithms [23.009274904878065]
We propose EA4LLM, an evolutionary algorithm for optimizing large language models (LLMs). We empirically verify full-parameter optimization from the pretraining stage across model sizes ranging from 0.5B to 32B. Our work challenges the prevailing assumption that gradient-based optimization is the only viable approach for training neural networks.
arXiv Detail & Related papers (2025-10-12T13:38:28Z)
VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration [15.78340001680369]
We propose VoxelOpt, a discrete optimization-based deformable image registration framework. It combines the strengths of learning-based and iterative methods to achieve a better balance between registration accuracy and runtime. In abdominal CT registration, these changes allow VoxelOpt to outperform leading iterative methods in both efficiency and accuracy, while matching state-of-the-art learning-based methods trained with label supervision.
arXiv Detail & Related papers (2025-06-24T19:44:04Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training. We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields. Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
Back to MLP: A Simple Baseline for Human Motion Prediction [59.18776744541904]
This paper tackles the problem of human motion prediction, which consists of forecasting future body poses from historically observed sequences. We show that the performance of these approaches can be surpassed by a light-weight, purely MLP-based architecture with only 0.14M parameters. An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
arXiv Detail & Related papers (2022-07-04T16:35:58Z)
Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans. Our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement compared to the state of the art. Learned Vertex Descent (LVD) is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement over the SOTA with a much simpler and faster method.
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation [29.430404703883084]
We present a novel Distribution-Aware Single-stage (DAS) model for tackling the challenging multi-person 3D pose estimation problem. The proposed DAS model simultaneously localizes person positions and their corresponding body joints in the 3D camera space in a one-pass manner. Comprehensive experiments on benchmarks CMU Panoptic and MuPoTS-3D demonstrate the superior efficiency of the proposed DAS model.
arXiv Detail & Related papers (2022-03-15T07:30:27Z)
RL-PGO: Reinforcement Learning-based Planar Pose-Graph Optimization [1.4884785898657995]
This paper presents a state-of-the-art Deep Reinforcement Learning (DRL) based environment and proposed agent for 2D pose-graph optimization. We demonstrate that the pose-graph optimization problem can be modeled as a partially observable Markov Decision Process and evaluate performance on real-world and synthetic datasets.
arXiv Detail & Related papers (2022-02-26T20:10:14Z)
Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation [87.54604263202941]
We propose a tiny deep neural network of which partial layers are iteratively exploited for refining its previous estimations. We employ learned gating criteria to decide whether to exit from the weight-sharing loop, allowing per-sample adaptation in our model. Our method consistently outperforms state-of-the-art 2D/3D hand pose estimation approaches in terms of both accuracy and efficiency for widely used benchmarks.
arXiv Detail & Related papers (2021-11-11T23:31:34Z)
Human Body Model Fitting by Learned Gradient Descent [48.79414884222403]
We propose a novel algorithm for the fitting of 3D human shape to images. We show that this algorithm is fast (avg. 120ms convergence), robust to dataset, and achieves state-of-the-art results on public evaluation datasets.
arXiv Detail & Related papers (2020-08-19T14:26:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.