AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated
by AI
- URL: http://arxiv.org/abs/2401.01651v3
- Date: Tue, 23 Jan 2024 15:31:17 GMT
- Title: AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated
by AI
- Authors: Fanda Fan, Chunjie Luo, Wanling Gao, Jianfeng Zhan
- Abstract summary: This paper introduces AIGCBench, a pioneering comprehensive benchmark designed to evaluate a variety of video generation tasks.
A varied and open-domain image-text dataset that evaluates different state-of-the-art algorithms under equivalent conditions.
We employ a novel text combiner and GPT-4 to create rich text prompts, which are then used to generate images via advanced Text-to-Image models.
- Score: 1.1035305628305816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The burgeoning field of Artificial Intelligence Generated Content (AIGC) is
witnessing rapid advancements, particularly in video generation. This paper
introduces AIGCBench, a pioneering comprehensive and scalable benchmark
designed to evaluate a variety of video generation tasks, with a primary focus
on Image-to-Video (I2V) generation. AIGCBench tackles the limitations of
existing benchmarks, which suffer from a lack of diverse datasets, by including
a varied and open-domain image-text dataset that evaluates different
state-of-the-art algorithms under equivalent conditions. We employ a novel text
combiner and GPT-4 to create rich text prompts, which are then used to generate
images via advanced Text-to-Image models. To establish a unified evaluation
framework for video generation tasks, our benchmark includes 11 metrics
spanning four dimensions to assess algorithm performance. These dimensions are
control-video alignment, motion effects, temporal consistency, and video
quality. These metrics are both reference video-dependent and video-free,
ensuring a comprehensive evaluation strategy. The evaluation standard proposed
correlates well with human judgment, providing insights into the strengths and
weaknesses of current I2V algorithms. The findings from our extensive
experiments aim to stimulate further research and development in the I2V field.
AIGCBench represents a significant step toward creating standardized benchmarks
for the broader AIGC landscape, proposing an adaptable and equitable framework
for future assessments of video generation tasks. We have open-sourced the
dataset and evaluation code on the project website:
https://www.benchcouncil.org/AIGCBench.
Related papers
- VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos [25.770675590118547]
VideoRAG is the first retrieval-augmented generation framework specifically designed for processing and understanding extremely long-context videos.
Our core innovation lies in its dual-channel architecture that seamlessly integrates (i) graph-based textual knowledge grounding for capturing cross-video semantic relationships, and (ii) multi-modal context encoding for efficiently preserving visual features.
arXiv Detail & Related papers (2025-02-03T17:30:19Z) - VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models [111.5892290894904]
VBench is a benchmark suite that dissects "video generation quality" into specific, hierarchical, and disentangled dimensions.
We provide a dataset of human preference annotations to validate our benchmarks' alignment with human perception.
VBench++ supports evaluating text-to-video and image-to-video.
arXiv Detail & Related papers (2024-11-20T17:54:41Z) - Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model [56.03592388332793]
We investigate the AIGC-VQA problem, considering both subjective and objective quality assessment perspectives.
For the subjective perspective, we construct the Large-scale Generated Video Quality assessment (LGVQ) dataset, consisting of 2,808 AIGC videos.
We evaluate the perceptual quality of AIGC videos from three critical dimensions: spatial quality, temporal quality, and text-video alignment.
We propose the Unify Generated Video Quality assessment (UGVQ) model, designed to accurately evaluate the multi-dimensional quality of AIGC videos.
arXiv Detail & Related papers (2024-07-31T07:54:26Z) - T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation [55.57459883629706]
We conduct the first systematic study on compositional text-to-video generation.
We propose T2V-CompBench, the first benchmark tailored for compositional text-to-video generation.
arXiv Detail & Related papers (2024-07-19T17:58:36Z) - TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation [97.96178992465511]
We argue that generated videos should incorporate the emergence of new concepts and their relation transitions like in real-world videos as time progresses.
To assess the Temporal Compositionality of video generation models, we propose TC-Bench, a benchmark of meticulously crafted text prompts, corresponding ground truth videos, and robust evaluation metrics.
arXiv Detail & Related papers (2024-06-12T21:41:32Z) - Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap [4.922783970210658]
We categorize the assessment of AIGC video quality into three dimensions: visual harmony, video-text consistency, and domain distribution gap.
For each dimension, we design specific modules to provide a comprehensive quality assessment of AIGC videos.
Our research identifies significant variations in visual quality, fluidity, and style among videos generated by different text-to-video models.
arXiv Detail & Related papers (2024-04-21T08:27:20Z) - Towards A Better Metric for Text-to-Video Generation [102.16250512265995]
Generative models have demonstrated remarkable capability in synthesizing high-quality text, images, and videos.
We introduce a novel evaluation pipeline, the Text-to-Video Score (T2VScore)
This metric integrates two pivotal criteria: (1) Text-Video Alignment, which scrutinizes the fidelity of the video in representing the given text description, and (2) Video Quality, which evaluates the video's overall production caliber with a mixture of experts.
arXiv Detail & Related papers (2024-01-15T15:42:39Z) - Make It Move: Controllable Image-to-Video Generation with Text
Descriptions [69.52360725356601]
TI2V task aims at generating videos from a static image and a text description.
To address these challenges, we propose a Motion Anchor-based video GEnerator (MAGE) with an innovative motion anchor structure.
Experiments conducted on datasets verify the effectiveness of MAGE and show appealing potentials of TI2V task.
arXiv Detail & Related papers (2021-12-06T07:00:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.