GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
- URL: http://arxiv.org/abs/2406.06087v2
- Date: Mon, 14 Oct 2024 15:45:24 GMT
- Title: GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
- Authors: Zijian Chen, Wei Sun, Yuan Tian, Jun Jia, Zicheng Zhang, Jiarui Wang, Ru Huang, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang
- Abstract summary: Action quality assessment (AQA) algorithms predominantly focus on actions in specific real-world scenarios and are pre-trained with normative action features.
We construct GAIA, a Generic AI-generated Action dataset, by conducting a large-scale subjective evaluation from a novel causal reasoning-based perspective.
Results show that traditional AQA methods, action-related metrics in recent T2V benchmarks, and mainstream video quality methods perform poorly with an average SRCC of 0.454, 0.191, and 0.519, respectively.
- Abstract: Assessing action quality is both imperative and challenging due to its significant impact on the quality of AI-generated videos, further complicated by the inherently ambiguous nature of actions within AI-generated video (AIGV). Current action quality assessment (AQA) algorithms predominantly focus on actions in specific real-world scenarios and are pre-trained with normative action features, rendering them inapplicable to AIGVs. To address these problems, we construct GAIA, a Generic AI-generated Action dataset, by conducting a large-scale subjective evaluation from a novel causal reasoning-based perspective, resulting in 971,244 ratings among 9,180 video-action pairs. Based on GAIA, we evaluate a suite of popular text-to-video (T2V) models on their ability to generate visually rational actions, revealing their pros and cons on different categories of actions. We also extend GAIA as a testbed to benchmark the AQA capacity of existing automatic evaluation methods. Results show that traditional AQA methods, action-related metrics in recent T2V benchmarks, and mainstream video quality methods perform poorly with an average SRCC of 0.454, 0.191, and 0.519, respectively, indicating a sizable gap between current models and human action perception patterns in AIGVs. Our findings underscore the significance of action quality as a unique perspective for studying AIGVs and can catalyze progress towards methods with enhanced capacities for AQA in AIGVs.
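For context, SRCC (Spearman rank-order correlation coefficient) measures how well a method's predicted ranking of videos agrees with the human ranking. Below is a minimal sketch of such an evaluation, assuming each method outputs one quality score per video-action pair and the benchmark supplies the corresponding human mean opinion scores; the data is purely illustrative:

```python
# Minimal SRCC evaluation sketch. The score arrays below are
# illustrative placeholders, not data from the GAIA benchmark.
from scipy.stats import spearmanr

predicted_scores = [0.82, 0.34, 0.57, 0.91, 0.12]  # a method's outputs
human_mos = [4.1, 2.0, 3.3, 4.6, 1.2]              # human mean opinion scores

srcc, p_value = spearmanr(predicted_scores, human_mos)
print(f"SRCC = {srcc:.3f} (p = {p_value:.3g})")  # 1.0 = perfect rank agreement
```

An SRCC near 1 means a method ranks videos the way humans do, so the averages of 0.454, 0.191, and 0.519 reported above indicate only weak-to-moderate agreement with human judgments.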
Related papers
- AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM
We first present AIGVQA-DB, a large-scale dataset comprising 36,576 AIGVs generated by 15 advanced text-to-video models using 1,048 prompts.
We then introduce AIGV-Assessor, a novel VQA model that leverages intricate quality attributes to capture precise video quality scores and pair video preferences.
arXiv Detail & Related papers (2024-11-26T08:43:15Z) - Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric [56.73624246192218]
We conduct a pioneering study on human activity AI-generated videos (AGVs).
We focus on visual quality evaluation and the identification of semantic distortions.
We develop an objective evaluation metric, named AI-Generated Human activity Video Quality metric (GHVQ), to automatically analyze the quality of human activity AGVs.
arXiv Detail & Related papers (2024-11-25T17:58:43Z) - Advancing Video Quality Assessment for AIGC [17.23281750562252]
We propose a novel loss function that combines mean absolute error with cross-entropy loss to mitigate inter-frame quality inconsistencies (one plausible form is sketched after this entry).
We also introduce the innovative S2CNet technique to retain critical content, while leveraging adversarial training to enhance the model's generalization capabilities.
arXiv Detail & Related papers (2024-09-23T10:36:22Z)
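The summary does not spell out how the two terms are combined; the sketch below is one plausible reading, assuming a regression output trained with MAE against the continuous MOS plus a classification output trained with cross-entropy over discretized quality levels. The MOS normalization, binning scheme, and weight `alpha` are assumptions, not the paper's exact design:

```python
# Hedged sketch of an MAE + cross-entropy quality loss. The MOS range,
# the discretization into quality classes, and `alpha` are assumptions,
# not the paper's exact formulation.
import torch
import torch.nn.functional as F

def combined_quality_loss(pred_score, pred_logits, target_mos,
                          num_classes=5, alpha=0.5):
    # Regression term: mean absolute error against the continuous MOS.
    mae = F.l1_loss(pred_score, target_mos)
    # Classification term: cross-entropy over the MOS discretized into
    # quality classes (here assuming MOS is normalized to [0, 1]).
    target_class = (target_mos.clamp(0, 1) * (num_classes - 1)).round().long()
    ce = F.cross_entropy(pred_logits, target_class)
    return mae + alpha * ce
```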
arXiv Detail & Related papers (2024-09-23T10:36:22Z) - Revisiting Video Quality Assessment from the Perspective of Generalization [17.695835285573807]
Short video platforms such as YouTube Shorts, TikTok, and Kwai have led to a surge in User-Generated Content (UGC).
The resulting quality assessment challenges affect not only performance on test sets but also the ability to generalize across different datasets.
We show that adversarial weight perturbations can effectively smooth the weight loss landscape, significantly improving generalization performance (a rough sketch of one such step follows this entry).
arXiv Detail & Related papers (2024-09-23T09:24:55Z)
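The summary leaves the perturbation scheme unspecified; the following is a rough sketch of one sharpness-aware (SAM-style) training step consistent with that description, where the radius `rho` and the global-gradient-norm scaling are assumptions rather than the paper's specification:

```python
# SAM-style adversarial weight perturbation step (sketch). `rho` and
# the gradient normalization are assumptions, not the paper's exact design.
import torch

def sharpness_aware_step(model, loss_fn, inputs, targets, optimizer, rho=0.05):
    # 1) Gradient at the current weights w.
    loss_fn(model(inputs), targets).backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params]))

    # 2) Climb to the adversarially perturbed weights w + e(w).
    perturbations = []
    with torch.no_grad():
        for p in params:
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)
    optimizer.zero_grad()

    # 3) Gradient at the perturbed point, then restore the weights.
    loss_fn(model(inputs), targets).backward()
    with torch.no_grad():
        for p, e in zip(params, perturbations):
            p.sub_(e)

    # 4) Descend using the sharpness-aware gradient.
    optimizer.step()
    optimizer.zero_grad()
```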
arXiv Detail & Related papers (2024-09-23T09:24:55Z) - Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model [54.69882562863726]
We systematically investigate the AIGC-VQA problem from both subjective and objective quality assessment perspectives.
We evaluate the perceptual quality of AIGC videos from three dimensions: spatial quality, temporal quality, and text-to-video alignment.
We propose a Unify Generated Video Quality assessment (UGVQ) model to comprehensively and accurately evaluate the quality of AIGC videos.
arXiv Detail & Related papers (2024-07-31T07:54:26Z) - AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale IQA database of AI-generated omnidirectional images, named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z) - Group-aware Contrastive Regression for Action Quality Assessment [85.43203180953076]
We show that the relations among videos can provide important clues for more accurate action quality assessment (a simplified sketch of the contrastive regression idea follows this entry).
Our approach outperforms previous methods by a large margin and establishes new state-of-the-art on all three benchmarks.
arXiv Detail & Related papers (2021-08-17T17:59:39Z)
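As a rough illustration of contrastive regression for AQA, not the paper's architecture: instead of regressing an absolute score, the model predicts the quality difference between a query video and an exemplar video whose score is known. The feature dimension and regression head below are hypothetical stand-ins:

```python
# Simplified contrastive regression for AQA (sketch). The feature
# dimension and head are hypothetical, not the paper's group-aware design.
import torch
import torch.nn as nn

class ContrastiveRegressor(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, query_feat, exemplar_feat, exemplar_score):
        # Predict the relative quality difference, then add it back to
        # the exemplar's known score to obtain the query's score.
        pair = torch.cat([query_feat, exemplar_feat], dim=-1)
        delta = self.head(pair).squeeze(-1)
        return exemplar_score + delta
```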