CAG-Avatar: Cross-Attention Guided Gaussian Avatars for High-Fidelity Head Reconstruction
- URL: http://arxiv.org/abs/2601.14844v1
- Date: Wed, 21 Jan 2026 10:22:53 GMT
- Title: CAG-Avatar: Cross-Attention Guided Gaussian Avatars for High-Fidelity Head Reconstruction
- Authors: Zhe Chang, Haodong Jin, Yan Song, Hui Yu
- Abstract summary: Animation techniques often rely on a "one-size-fits-all" global tuning approach. We introduce a Conditionally Adaptive Fusion Module built on cross-attention. Experiments confirm a significant improvement in reconstruction fidelity, particularly for challenging regions such as teeth.
- Score: 7.698661374784336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creating high-fidelity, real-time drivable 3D head avatars is a core challenge in digital animation. While 3D Gaussian Splatting (3D-GS) offers unprecedented rendering speed and quality, current animation techniques often rely on a "one-size-fits-all" global tuning approach, where all Gaussian primitives are uniformly driven by a single expression code. This simplistic approach fails to unravel the distinct dynamics of different facial regions, such as deformable skin versus rigid teeth, leading to significant blurring and distortion artifacts. We introduce Conditionally-Adaptive Gaussian Avatars (CAG-Avatar), a framework that resolves this key limitation. At its core is a Conditionally Adaptive Fusion Module built on cross-attention. This mechanism empowers each 3D Gaussian to act as a query, adaptively extracting relevant driving signals from the global expression code based on its canonical position. This "tailor-made" conditioning strategy drastically enhances the modeling of fine-grained, localized dynamics. Our experiments confirm a significant improvement in reconstruction fidelity, particularly for challenging regions such as teeth, while preserving real-time rendering performance.
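The per-Gaussian conditioning described in the abstract can be sketched as a standard single-head cross-attention layer: each Gaussian's (encoded) canonical position forms a query, and the global expression code, split into tokens, supplies the keys and values. This is a minimal illustrative sketch of the general mechanism; the token split, dimensions, single-head form, and all variable names are assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_conditioning(gaussian_pos, expr_tokens, Wq, Wk, Wv):
    """Each Gaussian's canonical position acts as a query that pools a
    'tailor-made' driving signal from the global expression tokens,
    instead of every Gaussian receiving the same global code."""
    Q = gaussian_pos @ Wq   # (N, d): one query per Gaussian primitive
    K = expr_tokens @ Wk    # (T, d): keys from the expression code tokens
    V = expr_tokens @ Wv    # (T, d): values from the expression code tokens
    # Scaled dot-product attention: (N, T) weights over expression tokens.
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)
    return attn @ V         # (N, d): per-Gaussian driving signal

# Toy dimensions, purely for illustration.
rng = np.random.default_rng(0)
N, T, d_pos, d_expr, d = 4, 8, 3, 16, 32
signal = cross_attention_conditioning(
    rng.normal(size=(N, d_pos)),    # canonical Gaussian positions
    rng.normal(size=(T, d_expr)),   # expression code split into T tokens
    rng.normal(size=(d_pos, d)),    # query projection
    rng.normal(size=(d_expr, d)),   # key projection
    rng.normal(size=(d_expr, d)),   # value projection
)
print(signal.shape)  # (4, 32)
```

Because the attention weights differ per query, Gaussians in regions with distinct dynamics (e.g. teeth vs. skin) can attend to different parts of the expression code, which is the intuition behind the claimed reduction in blurring for rigid regions.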
Related papers
- FastGHA: Generalized Few-Shot 3D Gaussian Head Avatars with Real-Time Animation [26.161556787983496]
FastGHA is a feed-forward method to generate high-quality Gaussian head avatars from only a few input images. The approach directly learns a per-pixel Gaussian representation from the input images. Experiments show that the approach significantly outperforms existing methods in both rendering quality and inference efficiency.
arXiv Detail & Related papers (2026-01-20T10:49:49Z) - CaricatureGS: Exaggerating 3D Gaussian Splatting Faces With Gaussian Curvature [13.47263744740423]
A controllable 3D caricaturization framework for faces is introduced. It builds on 3D Gaussian Splatting (3DGS), which has recently been shown to produce realistic free-viewpoint avatars.
arXiv Detail & Related papers (2026-01-06T13:56:28Z) - TexAvatars : Hybrid Texel-3D Representations for Stable Rigging of Photorealistic Gaussian Head Avatars [47.957612931386926]
TexAvatars is a hybrid representation that combines the explicit geometric grounding of analytic rigging with the spatial continuity of texel space. The approach predicts local geometric attributes in UV space via CNNs, but drives 3D deformation through mesh-aware Jacobians. The method achieves state-of-the-art performance under extreme pose and expression variations, demonstrating strong generalization in challenging head reenactment settings.
arXiv Detail & Related papers (2025-12-24T10:50:04Z) - AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars [54.854597811704316]
AGORA is a novel framework that extends 3DGS within a generative adversarial network to produce animatable avatars. Expression fidelity is enforced via a dual-discriminator training scheme. AGORA generates avatars that are not only visually realistic but also precisely controllable.
arXiv Detail & Related papers (2025-12-06T14:05:20Z) - HGC-Avatar: Hierarchical Gaussian Compression for Streamable Dynamic 3D Avatars [45.746590759473435]
HGC-Avatar is a novel Hierarchical Gaussian Compression framework for efficient transmission and high-quality rendering of dynamic avatars. HGC-Avatar provides a streamable solution for rapid 3D avatar rendering, while significantly outperforming prior methods in both visual quality and compression efficiency.
arXiv Detail & Related papers (2025-10-18T12:03:26Z) - TeGA: Texture Space Gaussian Avatars for High-Resolution Dynamic Head Modeling [52.87836237427514]
Photoreal avatars are seen as a key component in emerging applications in telepresence, extended reality, and entertainment. TeGA presents a new high-detail 3D head avatar model that improves upon the state of the art.
arXiv Detail & Related papers (2025-05-08T22:10:27Z) - Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z) - GaussianStyle: Gaussian Head Avatar via StyleGAN [64.85782838199427]
We propose a novel framework that integrates the volumetric strengths of 3DGS with the powerful implicit representation of StyleGAN.
We show that our method achieves state-of-the-art performance in reenactment, novel view synthesis, and animation.
arXiv Detail & Related papers (2024-02-01T18:14:42Z) - GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning [60.33970027554299]
Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations.
In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions.
Our proposed method, GAvatar, enables the large-scale generation of diverse animatable avatars using only text prompts.
arXiv Detail & Related papers (2023-12-18T18:59:12Z) - GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation [35.39887092268696]
This paper presents a framework to model the actional human head with anisotropic 3D Gaussians. In experiments, the method produces high-fidelity renderings, outperforming state-of-the-art approaches in reconstruction, cross-identity reenactment, and novel view synthesis tasks.
arXiv Detail & Related papers (2023-12-04T05:24:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.