Related papers: WildGHand: Learning Anti-Perturbation Gaussian Hand Avatars from Monocular In-the-Wild Videos

WildGHand: Learning Anti-Perturbation Gaussian Hand Avatars from Monocular In-the-Wild Videos

URL: http://arxiv.org/abs/2602.20556v1
Date: Tue, 24 Feb 2026 05:14:05 GMT
Title: WildGHand: Learning Anti-Perturbation Gaussian Hand Avatars from Monocular In-the-Wild Videos
Authors: Hanhui Li, Xuan Huang, Wanquan Liu, Yuhao Cheng, Long Chen, Yiqiang Yan, Xiaodan Liang, Chenqiang Gao,
Abstract summary: We introduce WildGHand, an optimization-based framework that enables self-adaptive 3D Gaussian splatting on in-the-wild videos.<n>We further curate a dataset of monocular hand videos captured under diverse perturbations to benchmark in-the-wild hand avatar reconstruction.
Score: 68.43355277637882
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite recent progress in 3D hand reconstruction from monocular videos, most existing methods rely on data captured in well-controlled environments and therefore degrade in real-world settings with severe perturbations, such as hand-object interactions, extreme poses, illumination changes, and motion blur. To tackle these issues, we introduce WildGHand, an optimization-based framework that enables self-adaptive 3D Gaussian splatting on in-the-wild videos and produces high-fidelity hand avatars. WildGHand incorporates two key components: (i) a dynamic perturbation disentanglement module that explicitly represents perturbations as time-varying biases on 3D Gaussian attributes during optimization, and (ii) a perturbation-aware optimization strategy that generates per-frame anisotropic weighted masks to guide optimization. Together, these components allow the framework to identify and suppress perturbations across both spatial and temporal dimensions. We further curate a dataset of monocular hand videos captured under diverse perturbations to benchmark in-the-wild hand avatar reconstruction. Extensive experiments on this dataset and two public datasets demonstrate that WildGHand achieves state-of-the-art performance and substantially improves over its base model across multiple metrics (e.g., up to a $15.8\%$ relative gain in PSNR and a $23.1\%$ relative reduction in LPIPS). Our implementation and dataset are available at https://github.com/XuanHuang0/WildGHand.

Related papers

ERGO: Excess-Risk-Guided Optimization for High-Fidelity Monocular 3D Gaussian Splatting [63.138778159026934]
We propose an adaptive optimization framework guided by excess risk decomposition, termed ERGO.<n> ERGO dynamically estimates the view-specific excess risk and adaptively adjust loss weights during optimization.<n>Experiments on the Google Scanned Objects dataset and the OmniObject3D dataset demonstrate the superiority of ERGO over existing state-of-the-art methods.
arXiv Detail & Related papers (2026-02-10T20:44:43Z)
JOintGS: Joint Optimization of Cameras, Bodies and 3D Gaussians for In-the-Wild Monocular Reconstruction [18.636227266388218]
We present JOintGS, a unified framework that jointly optimize camera extrinsics, human poses, and 3D Gaussian representations.<n>Experiments on NeuMan and EMDB datasets demonstrate that JOintGS achieves superior reconstruction quality.
arXiv Detail & Related papers (2026-02-04T08:33:51Z)
RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS [85.90134051583368]
3D Gaussian Splatting (3DGS) has gained significant attention for its real-time, photo-realistic rendering in novel-view synthesis and 3D modeling.<n>Existing methods struggle with accurately modeling in-the-wild scenes affected by transient objects and illuminations.<n>We propose RobustSplat++, a robust solution based on several critical designs.
arXiv Detail & Related papers (2025-12-04T14:05:09Z)
Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis [22.767866875051013]
We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address limitations of current methods.<n>By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones.<n> Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
arXiv Detail & Related papers (2025-04-02T17:59:46Z)
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction [69.63414788486578]
FreeSplatter is a scalable feed-forward framework that generates high-quality 3D Gaussians from uncalibrated sparse-view images.<n>Our approach employs a streamlined transformer architecture where self-attention blocks facilitate information exchange.<n>We develop two specialized variants--for object-centric and scene-level reconstruction--trained on comprehensive datasets.
arXiv Detail & Related papers (2024-12-12T18:52:53Z)
WildGaussians: 3D Gaussian Splatting in the Wild [80.5209105383932]
We introduce WildGaussians, a novel approach to handle occlusions and appearance changes with 3DGS. We demonstrate that WildGaussians matches the real-time rendering speed of 3DGS while surpassing both 3DGS and NeRF baselines in handling in-the-wild data.
arXiv Detail & Related papers (2024-07-11T12:41:32Z)
Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections [30.321151430263946]
This paper presents Wild-GS, an innovative adaptation of 3DGS optimized for unconstrained photo collections. Wild-GS determines the appearance of each 3D Gaussian by their inherent material attributes, global illumination and camera properties per image, and point-level local variance of reflectance. This novel design effectively transfers the high-frequency detailed appearance of the reference view to 3D space and significantly expedites the training process.
arXiv Detail & Related papers (2024-06-14T19:06:07Z)
Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation [87.85851771425325]
We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos. We tackle this problem through online adaptation, gradually correcting the model bias during testing. We propose the Dynamic Bilevel Online Adaptation algorithm (DynaBOA)
arXiv Detail & Related papers (2021-11-07T07:23:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.