AniGaussian: Animatable Gaussian Avatar with Pose-guided Deformation
- URL: http://arxiv.org/abs/2502.19441v1
- Date: Mon, 24 Feb 2025 06:53:37 GMT
- Title: AniGaussian: Animatable Gaussian Avatar with Pose-guided Deformation
- Authors: Mengtian Li, Shengxiang Yao, Chen Kai, Zhifeng Xie, Keyu Chen, Yu-Gang Jiang
- Abstract summary: We introduce an innovative pose-guided deformation strategy that constrains the dynamic Gaussian avatar with SMPL pose guidance. We incorporate rigid-based priors from previous works to enhance the dynamic transform capabilities of the Gaussian model. Through extensive comparisons with existing methods, AniGaussian demonstrates superior performance in both qualitative results and quantitative metrics.
- Score: 51.61117351997808
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent advancements in Gaussian-based human body reconstruction have achieved notable success in creating animatable avatars. However, there are ongoing challenges to fully exploit the SMPL model's prior knowledge and enhance the visual fidelity of these models to achieve more refined avatar reconstructions. In this paper, we introduce AniGaussian, which addresses the above issues with two insights. First, we propose an innovative pose-guided deformation strategy that effectively constrains the dynamic Gaussian avatar with SMPL pose guidance, ensuring that the reconstructed model not only captures the detailed surface nuances but also maintains anatomical correctness across a wide range of motions. Second, we tackle the expressiveness limitations of Gaussian models in representing dynamic human bodies. We incorporate rigid-based priors from previous works to enhance the dynamic transform capabilities of the Gaussian model. Furthermore, we introduce a split-with-scale strategy that significantly improves geometry quality. The ablation study demonstrates the effectiveness of our innovative model design. Through extensive comparisons with existing methods, AniGaussian demonstrates superior performance in both qualitative results and quantitative metrics.
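SMPL-guided deformation of this kind is typically built on linear blend skinning (LBS), where each Gaussian center is moved by a weighted blend of rigid bone transforms. The sketch below illustrates that general mechanism only; it is not AniGaussian's actual implementation, and `lbs_deform` with its toy inputs is purely hypothetical:

```python
import numpy as np

def lbs_deform(points, skin_weights, joint_transforms):
    """Deform canonical points with linear blend skinning (LBS).

    points:           (N, 3) canonical-space positions (e.g. Gaussian centers)
    skin_weights:     (N, J) per-point skinning weights, rows summing to 1
    joint_transforms: (J, 4, 4) rigid bone transforms for the target pose
    """
    # Blend the per-joint rigid transforms for every point.
    blended = np.einsum("nj,jab->nab", skin_weights, joint_transforms)  # (N, 4, 4)
    # Apply the blended transforms in homogeneous coordinates.
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    posed = np.einsum("nab,nb->na", blended, homo)
    return posed[:, :3]

# Toy example: two points, two joints, second joint translated along x.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
w = np.array([[1.0, 0.0], [0.0, 1.0]])
T = np.stack([np.eye(4), np.eye(4)])
T[1, 0, 3] = 0.5  # shift the second joint by +0.5 in x
print(lbs_deform(pts, w, T))  # second point moves to [1.5, 0, 0]
```

Pose guidance in this setting amounts to constraining the Gaussians to follow such skeleton-driven transforms rather than drifting freely.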
Related papers
- EigenGS Representation: From Eigenspace to Gaussian Image Space [20.454762899389358]
EigenGS is an efficient transformation pipeline connecting eigenspace and image-space Gaussian representations.
We show that EigenGS achieves superior reconstruction quality compared to direct 2D Gaussian fitting.
The results highlight EigenGS's effectiveness and generalization ability across images with varying resolutions and diverse categories.
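The eigenspace side of such a pipeline can be illustrated with plain PCA: each image is summarized by a handful of coefficients over shared eigen-images. This is only a generic sketch of that idea, not EigenGS's pipeline, and `eigenspace_codes` is a hypothetical helper:

```python
import numpy as np

def eigenspace_codes(images, k):
    """Project flattened images onto their top-k PCA eigen-images.

    Each image is summarized by k coefficients plus the dataset mean;
    a low-rank reconstruction is returned alongside the codes.
    """
    X = images.reshape(len(images), -1).astype(float)
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal directions in Vt.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:k]                    # (k, D) eigen-images
    codes = (X - mean) @ basis.T      # (N, k) coefficients per image
    recon = codes @ basis + mean      # low-rank reconstruction
    return codes, recon

imgs = np.random.default_rng(1).normal(size=(20, 8, 8))
codes, recon = eigenspace_codes(imgs, k=5)
print(codes.shape, recon.shape)  # (20, 5) (20, 64)
```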
arXiv Detail & Related papers (2025-03-10T15:27:03Z) - Multi-Head Attention Driven Dynamic Visual-Semantic Embedding for Enhanced Image-Text Matching [0.8611782340880084]
This study proposes an innovative visual-semantic embedding model, Multi-Headed Consensus-Aware Visual-Semantic Embedding (MH-CVSE). The model introduces a multi-head self-attention mechanism based on the consensus-aware visual semantic embedding model (CVSE) to capture information in multiple subspaces in parallel. In terms of loss function design, the MH-CVSE model adopts a dynamic weight adjustment strategy that adjusts weights according to the loss value itself.
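The "multiple subspaces in parallel" claim refers to standard multi-head self-attention: projections are split across heads, each attending in its own low-dimensional subspace. A minimal NumPy sketch of that generic mechanism (not MH-CVSE's actual architecture; the weight matrices here are random placeholders):

```python
import numpy as np

def multi_head_self_attention(x, w_q, w_k, w_v, num_heads):
    """Minimal multi-head self-attention over a sequence of embeddings.

    x: (T, D) token/region embeddings; w_q/w_k/w_v: (D, D) projections.
    Each head attends in its own (D // num_heads)-dimensional subspace.
    """
    T, D = x.shape
    d = D // num_heads
    q = (x @ w_q).reshape(T, num_heads, d)
    k = (x @ w_k).reshape(T, num_heads, d)
    v = (x @ w_v).reshape(T, num_heads, d)
    # (heads, T, T) attention logits, scaled by sqrt(head dim).
    logits = np.einsum("thd,shd->hts", q, k) / np.sqrt(d)
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = np.einsum("hts,shd->thd", weights, v)  # attend per head
    return out.reshape(T, D)                     # concatenate heads

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w = [rng.normal(size=(8, 8)) for _ in range(3)]
y = multi_head_self_attention(x, *w, num_heads=2)
print(y.shape)  # (5, 8)
```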
arXiv Detail & Related papers (2024-12-26T11:46:22Z) - Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method [60.88467353578118]
We show that a fixed-point-inspired iterative approach to invert real-world images does not achieve convergence, instead oscillating between distinct clusters.
We introduce a simple and fast distribution transfer technique that facilitates image enhancement, stroke-based recoloring, as well as visual prompt-guided image editing.
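The non-convergence observation above is about plain fixed-point iteration `x <- g(x)`, which can settle into a cycle between clusters instead of a fixed point. A toy demonstration with a deliberately 2-periodic map (illustrative only, unrelated to any real diffusion inversion code):

```python
def fixed_point_invert(g, x0, iters=50):
    """Iterate x <- g(x) and record the trajectory. For some maps the
    iteration never converges, instead oscillating between a few values,
    which mirrors the clustered oscillation reported for real images."""
    xs = [x0]
    for _ in range(iters):
        xs.append(g(xs[-1]))
    return xs

# Toy map with a 2-cycle instead of a stable fixed point: g(x) = 1 - x.
traj = fixed_point_invert(lambda x: 1 - x, 0.25, iters=6)
print(traj)  # alternates between 0.25 and 0.75
```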
arXiv Detail & Related papers (2024-11-17T17:45:37Z) - Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction [89.53963284958037]
We propose a novel motion-aware enhancement framework for dynamic scene reconstruction.
Specifically, we first establish a correspondence between 3D Gaussian movements and pixel-level flow.
For the prevalent deformation-based paradigm that presents a harder optimization problem, a transient-aware deformation auxiliary module is proposed.
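A correspondence between 3D Gaussian movement and pixel-level flow can be sketched by projecting center displacements through the camera: the induced 2D motion can then be supervised against an optical-flow estimate. This is a generic pinhole-projection illustration, not the paper's actual formulation; `gaussian_motion_to_flow` is a hypothetical helper:

```python
import numpy as np

def gaussian_motion_to_flow(centers_t0, centers_t1, K):
    """Map 3D Gaussian center motion to pixel-level flow via pinhole projection.

    centers_t0/t1: (N, 3) Gaussian centers in camera coordinates at two frames.
    K:             (3, 3) camera intrinsics.
    Returns (N, 2) induced 2D flow vectors.
    """
    def project(p):
        uvw = (K @ p.T).T
        return uvw[:, :2] / uvw[:, 2:3]  # perspective divide
    return project(centers_t1) - project(centers_t0)

K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
p0 = np.array([[0.0, 0.0, 2.0]])
p1 = np.array([[0.2, 0.0, 2.0]])  # moves 0.2 units in x at depth 2
print(gaussian_motion_to_flow(p0, p1, K))  # [[10. 0.]] — 10 px along u
```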
arXiv Detail & Related papers (2024-03-18T03:46:26Z) - GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos [56.40776739573832]
We present a novel method that facilitates the creation of vivid 3D Gaussian avatars from monocular video inputs (GVA).
Our innovation lies in addressing the intricate challenges of delivering high-fidelity human body reconstructions.
We introduce a pose refinement technique to improve hand and foot pose accuracy by aligning normal maps and silhouettes.
arXiv Detail & Related papers (2024-02-26T14:40:15Z) - GaussianBody: Clothed Human Reconstruction via 3d Gaussian Splatting [14.937297984020821]
We propose a novel clothed human reconstruction method called GaussianBody, based on 3D Gaussian Splatting.
Applying the static 3D Gaussian Splatting model to the dynamic human reconstruction problem is non-trivial due to complicated non-rigid deformations and rich cloth details.
We show that our method can achieve state-of-the-art photorealistic novel-view rendering results with high-quality details for dynamic clothed human bodies.
arXiv Detail & Related papers (2024-01-18T04:48:13Z) - Neural Parametric Gaussians for Monocular Non-Rigid Object Reconstruction [8.260048622127913]
Reconstructing dynamic objects from monocular videos is a severely underconstrained and challenging problem.
We introduce Neural Parametric Gaussians (NPGs) to take on this challenge by imposing a two-stage approach.
NPGs achieve superior results compared to previous works, especially in challenging scenarios with few multi-view cues.
arXiv Detail & Related papers (2023-12-02T18:06:24Z) - IRGen: Generative Modeling for Image Retrieval [82.62022344988993]
In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling.
We develop our model, dubbed IRGen, to address the technical challenge of converting an image into a concise sequence of semantic units.
Our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks and two million-scale datasets.
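One common way to turn an embedding into a "concise sequence of semantic units" is residual quantization: repeatedly pick the nearest codeword and quantize the remainder. The sketch below shows that generic technique; IRGen's actual tokenizer may differ, and the codebooks here are random placeholders:

```python
import numpy as np

def residual_quantize(vec, codebooks):
    """Encode an embedding as a short discrete token sequence by residual
    quantization: at each stage pick the nearest codeword in that stage's
    codebook, subtract it, and quantize what remains."""
    ids, residual = [], vec.astype(float)
    for cb in codebooks:  # cb: (K, D) codewords
        i = np.argmin(((residual - cb) ** 2).sum(axis=1))
        ids.append(int(i))
        residual = residual - cb[i]
    return ids

rng = np.random.default_rng(2)
books = [rng.normal(size=(16, 4)) for _ in range(3)]
tokens = residual_quantize(rng.normal(size=4), books)
print(tokens)  # three codebook indices, one per stage
```

A retrieval model can then generate such token sequences autoregressively instead of comparing dense vectors.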
arXiv Detail & Related papers (2023-03-17T17:07:36Z) - ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation [135.10594078615952]
We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects.
An accompanying benchmark contains over 17,000 action trajectories with six types of plush toys and 78 variants.
Our model achieves the best performance in geometry, correspondence, and dynamics predictions.
arXiv Detail & Related papers (2022-03-14T04:56:55Z) - A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called "generative-model inference" that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
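Discretized with a finite set of particles, a gradient flow on the latent distribution reduces to gradient steps on the latent samples themselves. The sketch below shows that particle view on a toy quadratic requirement loss; it is an illustration of the general idea, not the paper's method, and `optimize_latents` is a hypothetical helper:

```python
import numpy as np

def optimize_latents(z0, grad_loss, step=0.1, iters=200):
    """Particle view of a latent gradient flow: move a latent sample
    downhill on a requirement loss by explicit Euler steps."""
    z = z0.copy()
    for _ in range(iters):
        z -= step * grad_loss(z)
    return z

# Toy requirement: pull the latent toward a target code t,
# i.e. loss = 0.5 * ||z - t||^2, so grad = z - t.
t = np.array([1.0, -2.0])
z = optimize_latents(np.zeros(2), grad_loss=lambda z: z - t)
print(z)  # converges to [1.0, -2.0]
```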
arXiv Detail & Related papers (2021-12-07T05:22:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.