Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model
- URL: http://arxiv.org/abs/2502.16915v1
- Date: Mon, 24 Feb 2025 07:20:13 GMT
- Title: Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model
- Authors: Kang Fu, Huiyu Duan, Zicheng Zhang, Xiaohong Liu, Xiongkuo Min, Jia Wang, Guangtao Zhai
- Abstract summary: Despite the growing popularity of text-to-3D asset generation, its evaluation has not been well considered and studied. Given the significant quality discrepancies among various text-to-3D assets, there is a pressing need for quality assessment models aligned with human subjective judgments. We first establish the largest text-to-3D asset quality assessment database to date, termed the AIGC-T23DAQA database.
- Score: 54.71130068043388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in text-to-image (T2I) generation have spurred the development of text-to-3D asset (T23DA) generation, leveraging pretrained 2D text-to-image diffusion models for text-to-3D asset synthesis. Despite the growing popularity of text-to-3D asset generation, its evaluation has not been well considered and studied. However, given the significant quality discrepancies among various text-to-3D assets, there is a pressing need for quality assessment models aligned with human subjective judgments. To tackle this challenge, we conduct a comprehensive study to explore the T23DA quality assessment (T23DAQA) problem in this work from both subjective and objective perspectives. Given the absence of corresponding databases, we first establish the largest text-to-3D asset quality assessment database to date, termed the AIGC-T23DAQA database. This database encompasses 969 validated 3D assets generated from 170 prompts via 6 popular text-to-3D asset generation models, and corresponding subjective quality ratings for these assets from the perspectives of quality, authenticity, and text-asset correspondence, respectively. Subsequently, we establish a comprehensive benchmark based on the AIGC-T23DAQA database, and devise an effective T23DAQA model to evaluate the generated 3D assets from the aforementioned three perspectives, respectively.
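The abstract describes a T23DAQA model that predicts three subjective scores per generated asset (quality, authenticity, and text-asset correspondence) but does not spell out its architecture. The snippet below is a minimal, hypothetical sketch of one common design for this kind of no-reference assessment: a shared image backbone over rendered views of the asset, pooled across views, with one regression head per dimension. All class and variable names are illustrative; a real text-asset correspondence head would additionally consume the prompt (e.g., via a text encoder), which is omitted here for brevity.

```python
# Illustrative sketch only: the paper's actual T23DAQA architecture is not
# detailed in the abstract. This assumes a rendered-view pipeline with a
# shared CNN backbone and three scalar regression heads.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class MultiDimT23DAQA(nn.Module):
    """Predicts quality, authenticity, and correspondence scores from rendered views."""

    def __init__(self, num_dims: int = 3):
        super().__init__()
        backbone = resnet18(weights=None)  # pretrained weights could be loaded instead
        backbone.fc = nn.Identity()        # expose the 512-d pooled features
        self.backbone = backbone
        # One regression head per rated dimension. A real correspondence head
        # would also take a text embedding of the prompt (omitted here).
        self.heads = nn.ModuleList(nn.Linear(512, 1) for _ in range(num_dims))

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, 3, H, W) rendered from each 3D asset
        b, v, c, h, w = views.shape
        feats = self.backbone(views.view(b * v, c, h, w))  # (b*v, 512)
        feats = feats.view(b, v, -1).mean(dim=1)           # average over views
        return torch.cat([head(feats) for head in self.heads], dim=1)  # (b, 3)


if __name__ == "__main__":
    model = MultiDimT23DAQA()
    dummy_views = torch.randn(2, 4, 3, 224, 224)  # 2 assets, 4 views each
    print(model(dummy_views).shape)               # torch.Size([2, 3])
```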
Related papers
- Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation [134.53804996949287]
We introduce Eval3D, a fine-grained, interpretable evaluation tool that can faithfully evaluate the quality of generated 3D assets.
Our key observation is that many desired properties of 3D generation, such as semantic and geometric consistency, can be effectively captured.
Compared to prior work, Eval3D provides pixel-wise measurement, enables accurate 3D spatial feedback, and aligns more closely with human judgments.
arXiv Detail & Related papers (2025-04-25T17:22:05Z) - Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation [26.0726219629689]
Text-to-3D generation has achieved remarkable progress in recent years, yet evaluating these methods remains challenging. Existing benchmarks lack fine-grained evaluation on different prompt categories and evaluation dimensions. We first propose a comprehensive benchmark named MATE-3D. The benchmark contains eight well-designed prompt categories that cover single and multiple object generation, resulting in 1,280 generated textured meshes.
arXiv Detail & Related papers (2024-12-15T12:41:44Z) - 3DGCQA: A Quality Assessment Database for 3D AI-Generated Contents [50.730468291265886]
This paper introduces a novel 3DGC quality assessment dataset, 3DGCQA, built using 7 representative Text-to-3D generation methods.
The visualization intuitively reveals the presence of 6 common distortion categories in the generated 3DGCs.
Subjective quality assessment is conducted by evaluators, whose ratings reveal significant variation in quality across different generation methods.
Several objective quality assessment algorithms are tested on the 3DGCQA dataset.
arXiv Detail & Related papers (2024-09-11T12:47:40Z) - T$^3$Bench: Benchmarking Current Progress in Text-to-3D Generation [52.029698642883226]
Methods in text-to-3D generation leverage powerful pretrained diffusion models to optimize NeRF representations.
Most studies evaluate their results with subjective case studies and user experiments.
We introduce T$^3$Bench, the first comprehensive text-to-3D benchmark.
arXiv Detail & Related papers (2023-10-04T17:12:18Z) - Advancing Zero-Shot Digital Human Quality Assessment through
Text-Prompted Evaluation [60.873105678086404]
SJTU-H3D is a subjective quality assessment database specifically designed for full-body digital humans.
It comprises 40 high-quality reference digital humans and 1,120 labeled distorted counterparts generated with seven types of distortions.
arXiv Detail & Related papers (2023-07-06T06:55:30Z) - AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment [62.8834581626703]
We build the most comprehensive subjective quality database to date, AGIQA-3K.
We conduct a benchmark experiment on this database to evaluate the consistency between the current Image Quality Assessment (IQA) model and human perception.
We believe that the fine-grained subjective scores in AGIQA-3K will inspire subsequent AGI quality models to fit human subjective perception mechanisms.
arXiv Detail & Related papers (2023-06-07T18:28:21Z)