Related papers: DHQA-4D: Perceptual Quality Assessment of Dynamic 4D Digital Human

URL: http://arxiv.org/abs/2510.03874v1
Date: Sat, 04 Oct 2025 16:51:08 GMT
Title: DHQA-4D: Perceptual Quality Assessment of Dynamic 4D Digital Human
Authors: Yunhao Li, Sijing Wu, Yucheng Zhu, Huiyu Duan, Zicheng Zhang, Guangtao Zhai
Abstract summary: We propose a large-scale dynamic digital human quality assessment dataset, DHQA-4D, which contains 32 high-quality real-scanned 4D human mesh sequences. We also propose DynaMesh-Rater, a novel large multimodal model (LMM) based approach that is able to assess both textured 4D meshes and non-textured 4D meshes.
Score: 78.54545352497217
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rapid development of 3D scanning and reconstruction technologies, dynamic digital human avatars based on 4D meshes have become increasingly popular. A high-precision dynamic digital human avatar can be applied to various fields such as game production, animation generation, and remote immersive communication. However, these 4D human avatar meshes are prone to being degraded by various types of noise during the processes of collection, compression, and transmission, thereby affecting the viewing experience of users. In light of this fact, quality assessment of dynamic 4D digital humans becomes increasingly important. In this paper, we first propose a large-scale dynamic digital human quality assessment dataset, DHQA-4D, which contains 32 high-quality real-scanned 4D human mesh sequences, 1920 distorted textured 4D human meshes degraded by 11 textured distortions, as well as their corresponding textured and non-textured mean opinion scores (MOSs). Equipped with the DHQA-4D dataset, we analyze the influence of different types of distortion on human perception for textured dynamic 4D meshes and non-textured dynamic 4D meshes. Additionally, we propose DynaMesh-Rater, a novel large multimodal model (LMM) based approach that is able to assess both textured 4D meshes and non-textured 4D meshes. Concretely, DynaMesh-Rater elaborately extracts multi-dimensional features, including visual features from a projected 2D video, motion features from cropped video clips, and geometry features from the 4D human mesh to provide comprehensive quality-related information. Then we utilize an LMM to integrate the multi-dimensional features and conduct a LoRA-based instruction tuning technique to teach the LMM to predict the quality scores. Extensive experimental results on the DHQA-4D dataset demonstrate the superiority of our DynaMesh-Rater method over previous quality assessment methods.
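The abstract uses mean opinion scores (MOSs) as the subjective quality labels. As a minimal illustration of what an MOS is, the sketch below averages raw subjective ratings per stimulus. The 1-5 rating scale, the stimulus IDs, and the function name `mean_opinion_scores` are assumptions made for illustration only; the abstract does not describe the paper's subjective-study protocol, which may include steps such as outlier screening or score normalization.

```python
from collections import defaultdict
from statistics import mean


def mean_opinion_scores(ratings):
    """Average raw ratings per stimulus.

    ratings: iterable of (stimulus_id, score) pairs, one per subject vote.
    Returns a dict mapping stimulus_id -> mean score.
    """
    by_stimulus = defaultdict(list)
    for stimulus_id, score in ratings:
        by_stimulus[stimulus_id].append(score)
    return {sid: mean(scores) for sid, scores in by_stimulus.items()}


# Hypothetical votes on an assumed 1-5 scale (not data from the paper).
votes = [
    ("mesh_a", 4), ("mesh_a", 5), ("mesh_a", 3),
    ("mesh_b", 2), ("mesh_b", 1),
]
mos = mean_opinion_scores(votes)
```

A quality model such as the one described above would then be trained and evaluated against per-stimulus scores of this kind.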

Related papers

DynaPose4D: High-Quality 4D Dynamic Content Generation via Pose Alignment Loss [5.644194272935956]
DynaPose4D is a framework that generates high-quality 4D dynamic content from a single static image. Results show that DynaPose4D achieves excellent coherence, consistency, and fluidity in dynamic motion generation.
arXiv Detail & Related papers (2025-10-26T01:11:13Z)
4DNeX: Feed-Forward 4D Generative Modeling Made Easy [51.79072580042173]
We present 4DNeX, the first feed-forward framework for generating 4D (i.e., dynamic 3D) scene representations from a single image. In contrast to existing methods that rely on computationally intensive optimization or require multi-frame video inputs, 4DNeX enables efficient, end-to-end image-to-4D generation.
arXiv Detail & Related papers (2025-08-18T17:59:55Z)
MVG4D: Image Matrix-Based Multi-View and Motion Generation for 4D Content Creation from a Single Image [8.22464804794448]
We propose MVG4D, a novel framework that generates dynamic 4D content from a single still image. At its core, MVG4D employs an image matrix module that synthesizes temporally coherent and spatially diverse multi-view images. Our method effectively enhances temporal consistency, geometric fidelity, and visual realism, addressing key challenges in motion discontinuity and background degradation.
arXiv Detail & Related papers (2025-07-24T12:48:14Z)
Advances in 4D Generation: A Survey [23.041037534410773]
4D generation enables richer interactive and immersive experiences. Despite rapid progress, the field lacks a unified understanding of 4D representations, generative frameworks, basic paradigms, and the core technical challenges it faces. This survey provides a systematic and in-depth review of the 4D generation landscape.
arXiv Detail & Related papers (2025-03-18T17:59:51Z)
4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives [115.67081491747943]
Dynamic 3D scene representation and novel view synthesis are crucial for enabling AR/VR and metaverse applications. We reformulate the reconstruction of a time-varying 3D scene as approximating its underlying 4D volume. We derive several compact variants that effectively reduce the memory footprint to address its storage bottleneck.
arXiv Detail & Related papers (2024-12-30T05:30:26Z)
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models [116.31344506738816]
We present a novel framework, Diffusion4D, for efficient and scalable 4D content generation. We develop a 4D-aware video diffusion model capable of synthesizing orbital views of dynamic 3D assets. Our method surpasses prior state-of-the-art techniques in terms of generation efficiency and 4D geometry consistency.
arXiv Detail & Related papers (2024-05-26T17:47:34Z)
LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling [69.56581851211841]
We propose a novel Local 4D implicit Representation for Dynamic clothed humans, named LoRD. Our key insight is to encourage the network to learn the latent codes of local part-level representations. LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods on practical applications.
arXiv Detail & Related papers (2022-08-18T03:49:44Z)
H4D: Human 4D Modeling by Learning Neural Compositional Representation [75.34798886466311]
This work presents a novel framework that can effectively learn a compact and compositional representation for dynamic humans. A simple yet effective linear motion model is proposed to provide a rough and regularized motion estimation. Experiments demonstrate that our method is not only effective in recovering dynamic humans with accurate motion and detailed geometry, but also amenable to various 4D human-related tasks.
arXiv Detail & Related papers (2022-03-02T17:10:49Z)
HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media [16.711606354731533]
We introduce HUMAN4D, a large and multimodal 4D dataset that contains a variety of human activities captured simultaneously. We benchmark HUMAN4D with state-of-the-art human pose estimation and 3D pose estimation methods.
arXiv Detail & Related papers (2021-10-14T09:03:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.