Related papers: Sequential Gaussian Avatars with Hierarchical Motion Context

Related papers

AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars [54.854597811704316]
AGORA is a novel framework that extends 3DGS within a generative adversarial network to produce animatable avatars.<n>Expression fidelity is enforced via a dual-discriminator training scheme.<n>AGORA generates avatars that are not only visually realistic but also precisely controllable.
arXiv Detail & Related papers (2025-12-06T14:05:20Z)
STG-Avatar: Animatable Human Avatars via Spacetime Gaussian [14.962899842675304]
We present STG-Avatar, a 3DGS-based framework for high-fidelity animatable human avatar reconstruction.<n>LBS enables real-time skeletal control by driving global pose transformations.<n>Our method consistently outperforms state-of-the-art baselines in both reconstruction quality and operational efficiency.
arXiv Detail & Related papers (2025-10-25T03:23:38Z)
HGC-Avatar: Hierarchical Gaussian Compression for Streamable Dynamic 3D Avatars [45.746590759473435]
HGC-Avatar is a novel Hierarchical Gaussian Compression framework for efficient transmission and high-quality rendering of dynamic avatars.<n>We show that HGC-Avatar provides a streamable solution for rapid 3D avatar rendering, while significantly outperforming prior methods in both visual quality and compression efficiency.
arXiv Detail & Related papers (2025-10-18T12:03:26Z)
EVA: Expressive Virtual Avatars from Multi-view Videos [51.33851869426057]
We introduce Expressive Virtual Avatars (EVA), an actor-specific, fully controllable, and expressive human avatar framework.<n>EVA achieves high-fidelity, lifelike renderings in real time while enabling independent control of facial expressions, body movements, and hand gestures.<n>This work represents a significant advancement towards fully drivable digital human models.
arXiv Detail & Related papers (2025-05-21T11:22:52Z)
TeGA: Texture Space Gaussian Avatars for High-Resolution Dynamic Head Modeling [52.87836237427514]
Photoreal avatars are seen as a key component in emerging applications in telepresence, extended reality, and entertainment.<n>We present a new high-detail 3D head avatar model that improves upon the state of the art.
arXiv Detail & Related papers (2025-05-08T22:10:27Z)
RealityAvatar: Towards Realistic Loose Clothing Modeling in Animatable 3D Gaussian Avatars [4.332718737928592]
We propose RealityAvatar, an efficient framework for high-fidelity digital human modeling, specifically targeting loosely dressed avatars.<n>By incorporating a motion trend module and a latentbone encoder, we explicitly model pose-dependent deformations and temporal variations in clothing behavior.<n>Our method significantly enhances structural fidelity and perceptual quality in dynamic human reconstruction, particularly in non-rigid regions.
arXiv Detail & Related papers (2025-04-02T09:59:12Z)
EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation [14.402479944396665]
We introduce EvolvingGS, a two-stage strategy that first deforms the Gaussian model to align with the target frame, and then refines it with minimal point addition/subtraction. Owing to the flexibility of the incrementally evolving representation, our method outperforms existing approaches in terms of both per-frame and temporal quality metrics. Our method significantly advances the state-of-the-art in dynamic scene reconstruction, particularly for extended sequences with complex human performances.
arXiv Detail & Related papers (2025-03-07T06:01:07Z)
Deblur-Avatar: Animatable Avatars from Motion-Blurred Monocular Videos [64.10307207290039]
We introduce a novel framework for modeling high-fidelity, animatable 3D human avatars from motion-blurred monocular video inputs.<n>By explicitly modeling human motion trajectories during exposure time, we jointly optimize the trajectories and 3D Gaussians to reconstruct sharp, high-quality human avatars.
arXiv Detail & Related papers (2025-01-23T02:31:57Z)
Bundle Adjusted Gaussian Avatars Deblurring [31.718130377229482]
We propose a 3D-aware, physics-oriented model of blur formation attributable to human movement and a 3D human motion model to clarify ambiguities found in motion-induced blurry images. We have established benchmarks for this task through a synthetic dataset derived from existing multi-view captures, alongside a real-captured dataset acquired through a 360-degree synchronous hybrid-exposure camera system.
arXiv Detail & Related papers (2024-11-24T10:03:24Z)
DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes [71.61083731844282]
We present DeSiRe-GS, a self-supervised gaussian splatting representation. It enables effective static-dynamic decomposition and high-fidelity surface reconstruction in complex driving scenarios.
arXiv Detail & Related papers (2024-11-18T05:49:16Z)
DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion [69.67970568012599]
We present DreamWaltz-G, a novel learning framework for animatable 3D avatar generation from text. The core of this framework lies in Score Distillation and Hybrid 3D Gaussian Avatar representation. Our framework further supports diverse applications, including human video reenactment and multi-subject scene composition.
arXiv Detail & Related papers (2024-09-25T17:59:45Z)
Gaussian Splatting LK [0.11249583407496218]
This paper investigates the potential of regularizing the native warp field within the dynamic Gaussian Splatting framework. We show that we can exploit knowledge innate to the forward warp field network to derive an analytical velocity field. This derived Lucas-Kanade style analytical regularization enables our method to achieve superior performance in reconstructing highly dynamic scenes.
arXiv Detail & Related papers (2024-07-16T01:50:43Z)
STGFormer: Spatio-Temporal GraphFormer for 3D Human Pose Estimation in Video [7.345621536750547]
This paper presents a graph-based framework for 3D human pose estimation in video. Specifically, we develop a graph-based attention mechanism, integrating graph information directly into the respective attention layers. We demonstrate that our method achieves significant stateof-the-art performance in 3D human pose estimation.
arXiv Detail & Related papers (2024-07-14T06:45:27Z)
Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation [36.93661496405653]
We take a global approach to exploit Transformer-temporal information with a concise Graph and Skipped Transformer architecture. Specifically, in 3D pose stage, coarse-grained body parts are deployed to construct a fully data-driven adaptive model. Experiments are conducted on Human3.6M, MPI-INF-3DHP and Human-Eva benchmarks.
arXiv Detail & Related papers (2024-07-03T10:42:09Z)
NPGA: Neural Parametric Gaussian Avatars [46.52887358194364]
We propose a data-driven approach to create high-fidelity controllable avatars from multi-view video recordings. We build our method around 3D Gaussian splatting for its highly efficient rendering and to inherit the topological flexibility of point clouds. We evaluate our method on the public NeRSemble dataset, demonstrating that NPGA significantly outperforms the previous state-of-the-art avatars on the self-reenactment task by 2.6 PSNR.
arXiv Detail & Related papers (2024-05-29T17:58:09Z)
Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting [55.71424195454963]
Spec-Gaussian is an approach that utilizes an anisotropic spherical Gaussian appearance field instead of spherical harmonics. Our experimental results demonstrate that our method surpasses existing approaches in terms of rendering quality. This improvement extends the applicability of 3D GS to handle intricate scenarios with specular and anisotropic surfaces.
arXiv Detail & Related papers (2024-02-24T17:22:15Z)
Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence. ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images. Our avatars learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
TriHuman : A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis [76.73338151115253]
TriHuman is a novel human-tailored, deformable, and efficient tri-plane representation. We non-rigidly warp global ray samples into our undeformed tri-plane texture space. We show how such a tri-plane feature representation can be conditioned on the skeletal motion to account for dynamic appearance and geometry changes.
arXiv Detail & Related papers (2023-12-08T16:40:38Z)
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians [51.46168990249278]
We present an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video. GustafAvatar is validated on both the public dataset and our collected dataset.
arXiv Detail & Related papers (2023-12-04T18:55:45Z)
SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes [59.23385953161328]
Novel view synthesis for dynamic scenes is still a challenging problem in computer vision and graphics. We propose a new representation that explicitly decomposes the motion and appearance of dynamic scenes into sparse control points and dense Gaussians. Our method can enable user-controlled motion editing while retaining high-fidelity appearances.
arXiv Detail & Related papers (2023-12-04T11:57:14Z)
Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction [29.83056271799794]
Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering. We propose a deformable 3D Gaussians Splatting method that reconstructs scenes using 3D Gaussians and learns them in canonical space. Through a differential Gaussianizer, the deformable 3D Gaussians not only achieve higher rendering quality but also real-time rendering speed.
arXiv Detail & Related papers (2023-09-22T16:04:02Z)
LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space [90.74976459491303]
We introduce a prior model that is conditioned on the runtime inputs and tie this prior space to the 3D face model via a normalizing flow in the latent space. A normalizing flow bridges the two representation spaces and transforms latent samples from one domain to another, allowing us to define a latent likelihood objective. We show that our approach leads to an expressive and effective prior, capturing facial dynamics and subtle expressions better.
arXiv Detail & Related papers (2022-03-15T13:22:57Z)
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition [79.33539539956186]
We propose a simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D. By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets.
arXiv Detail & Related papers (2020-03-31T11:28:25Z)
A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video [7.647599484103065]
We improve the learning of constraints in human skeleton by modeling local global spatial information via attention mechanisms. Our approach effectively mitigates depth ambiguity and self-occlusion, generalizes to half upper body estimation, and achieves competitive performance on 2D-to-3D video pose estimation.
arXiv Detail & Related papers (2020-03-11T14:54:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.