RenderMe-360: A Large Digital Asset Library and Benchmarks Towards
High-fidelity Head Avatars
- URL: http://arxiv.org/abs/2305.13353v1
- Date: Mon, 22 May 2023 17:54:01 GMT
- Authors: Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin
Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy,
Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin
- Abstract summary: We present RenderMe-360, a comprehensive 4D human head dataset to drive advances in head avatar research.
It contains massive data assets, with 243+ million complete head frames and over 800k video sequences from 500 different identities.
Based on the dataset, we build a comprehensive benchmark for head avatar research, with 16 state-of-the-art methods evaluated on five main tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthesizing high-fidelity head avatars is a central problem for computer
vision and graphics. While head avatar synthesis algorithms have advanced
rapidly, the best ones still face significant obstacles in real-world scenarios.
One of the vital causes is inadequate datasets: 1) current public datasets only
support researchers in exploring high-fidelity head avatars along one or two
task directions; 2) these datasets usually contain digital head assets with
limited data volume and narrow distributions over different attributes. In this
paper, we present RenderMe-360, a comprehensive 4D human head dataset to drive
advances in head avatar research. It contains massive data assets, with 243+
million complete head frames, and over 800k video sequences from 500 different
identities captured by synchronized multi-view cameras at 30 FPS. It is a
large-scale digital library for head avatars with three key attributes: 1) High
Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K
cameras in 360 degrees. 2) High Diversity: the collected subjects vary in age,
era, ethnicity, and culture, providing abundant material with distinctive styles
in appearance and geometry. Moreover, each subject is asked to perform various
motions, such as expressions and head rotations, further extending the richness
of the assets. 3) Rich Annotations: we provide
annotations at different granularities: camera parameters, matting, scans,
2D/3D facial landmarks, FLAME fitting, and text descriptions.
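To make these annotation types concrete, the sketch below shows how per-view camera parameters and 3D facial landmarks fit together under a standard pinhole projection for a multi-view rig like the one described. All matrices, values, and variable names here are illustrative assumptions, not RenderMe-360's actual file layout or tooling.

```python
import numpy as np

# Hypothetical camera annotation for one of the 60 synchronized 2K views.
# Intrinsics K and extrinsics (R, t) are placeholder values for a
# 2048 x 2048 sensor; the dataset's real calibration files may differ.
K = np.array([[2000.0,    0.0, 1024.0],
              [   0.0, 2000.0, 1024.0],
              [   0.0,    0.0,    1.0]])
R = np.eye(3)                      # world-to-camera rotation
t = np.array([0.0, 0.0, 2.5])      # camera placed 2.5 m from the head

def project(points_3d, K, R, t):
    """Pinhole projection of Nx3 world-space points to pixel coordinates."""
    cam = points_3d @ R.T + t      # world frame -> camera frame
    uv = cam @ K.T                 # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]  # perspective divide

# e.g. reproject 3D facial landmarks into one view to obtain 2D landmarks
landmarks_3d = np.array([[0.0, 0.0, 0.0], [0.03, 0.02, 0.01]])
print(project(landmarks_3d, K, R, t))
```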
Based on the dataset, we build a comprehensive benchmark for head avatar
research, with 16 state-of-the-art methods evaluated on five main tasks: novel
view synthesis, novel expression synthesis, hair rendering, hair editing, and
talking head generation. Our experiments uncover the strengths and weaknesses
of current methods. RenderMe-360 opens the door for future exploration in head
avatars.
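Novel view synthesis benchmarks of this kind are typically scored with image-reconstruction metrics such as PSNR. The snippet below is a minimal, self-contained sketch of that metric on synthetic stand-in images; the paper's exact evaluation protocol and metric set are not specified here.

```python
import numpy as np

def psnr(rendered, ground_truth, max_val=1.0):
    """Peak signal-to-noise ratio between two HxWx3 float images in [0, 1]."""
    mse = np.mean((rendered - ground_truth) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

# Demo on synthetic images standing in for a rendered novel view and its
# held-out ground-truth capture.
rng = np.random.default_rng(0)
gt = rng.random((512, 512, 3))
rendered = np.clip(gt + rng.normal(0.0, 0.05, gt.shape), 0.0, 1.0)
print(f"PSNR: {psnr(rendered, gt):.2f} dB")  # ~26 dB for noise sigma = 0.05
```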
Related papers
- Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360° (arXiv, 2024-08-01)
We build a dataset of artist-designed high-fidelity human heads and propose to create a novel parametric head model from it.
Our scheme decouples facial motion/shape from facial appearance, representing them with a classic parametric 3D mesh model and an attached neural texture.
Experiments show that facial motions and appearances are well disentangled in the parametric space, leading to state-of-the-art rendering and animation quality.
- ILSH: The Imperial Light-Stage Head Dataset for Human Head View Synthesis (arXiv, 2023-10-06)
The Imperial Light-Stage Head dataset is a novel dataset designed to support view-synthesis academic challenges for human heads.
This paper details the setup of a light stage specifically designed to capture high-resolution (4K) human head images.
In addition to the data collection, we address the split of the dataset into train, validation, and test sets.
- DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering (arXiv, 2023-07-19)
We present DNA-Rendering, a large-scale, high-fidelity repository of human performance data for neural actor rendering.
Our dataset contains over 1,500 human subjects, 5,000 motion sequences, and 67.5M frames.
We construct a professional multi-view capture system comprising 60 synchronized cameras with up to 4096 x 3000 resolution at 15 fps, together with rigorous camera calibration.
- HeadSculpt: Crafting 3D Head Avatars with Text (arXiv, 2023-06-05)
We introduce a versatile pipeline dubbed HeadSculpt for crafting 3D head avatars from textual prompts.
We first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding.
We propose a novel identity-aware editing score distillation strategy to optimize a textured mesh with a high-resolution differentiable rendering technique.
- Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation (arXiv, 2023-03-28)
Current full-head generation methods require a large number of 3D scans or multi-view images to train the model.
We propose Head3D, a method to generate full 3D heads from limited multi-view images.
Our model achieves cost-efficient and diverse complete head generation with photo-realistic renderings and high-quality geometry representations.
- MVP-Human Dataset for 3D Human Avatar Reconstruction from Unconstrained Frames (arXiv, 2022-04-24)
We present 3D Avatar Reconstruction in the wild (ARwild), which first reconstructs the implicit skinning fields in a multi-level manner.
We contribute a large-scale dataset, MVP-Human, which contains 400 subjects, each with 15 scans in different poses.
Overall, benefiting from the specific network architecture and the diverse data, the trained model enables 3D avatar reconstruction from unconstrained frames.
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars (arXiv, 2022-03-29)
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
- SHD360: A Benchmark Dataset for Salient Human Detection in 360° Videos (arXiv, 2021-05-24)
We propose SHD360, the first 360° video salient human detection (SHD) dataset, collecting various real-life daily scenes.
SHD360 contains 16,238 salient human instances with manually annotated pixel-wise ground truth.
Our proposed dataset and benchmark could serve as a good starting point for advancing human-centric research towards 360° panoramic data.