Related papers: Multi-Stage Contrastive Regression for Action Quality Assessment

URL: http://arxiv.org/abs/2401.02841v1
Date: Fri, 5 Jan 2024 14:48:19 GMT
Title: Multi-Stage Contrastive Regression for Action Quality Assessment
Authors: Qi An, Mengshi Qi, Huadong Ma
Abstract summary: We propose a novel Multi-stage Contrastive Regression (MCoRe) framework for the action quality assessment (AQA) task. Inspired by graph contrastive learning, we propose a new stage-wise contrastive learning loss function to enhance performance. MCoRe achieves state-of-the-art results on the widely adopted fine-grained AQA dataset.
Score: 31.763380011104015
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, there has been growing interest in video-based action quality assessment (AQA). Most existing methods solve the AQA problem by considering the entire video while overlooking the inherent stage-level characteristics of actions. To address this issue, we design a novel Multi-stage Contrastive Regression (MCoRe) framework for the AQA task. This approach allows us to efficiently extract spatial-temporal information while reducing computational costs by segmenting the input video into multiple stages or procedures. Inspired by graph contrastive learning, we propose a new stage-wise contrastive learning loss function to enhance performance. As a result, MCoRe achieves state-of-the-art results on the widely adopted fine-grained AQA dataset.

Related papers

ReasVQA: Advancing VideoQA with Imperfect Reasoning Process [38.4638171723351]
ReasVQA (Reasoning-enhanced Video Question Answering) is a novel approach that leverages reasoning processes generated by Multimodal Large Language Models (MLLMs) to improve the performance of VideoQA models. We evaluate ReasVQA on three popular benchmarks, and our results establish new state-of-the-art performance with significant improvements of +2.9 on NExT-QA, +7.3 on STAR, and +5.9 on IntentQA.
arXiv Detail & Related papers (2025-01-23T10:35:22Z)
Action Quality Assessment via Hierarchical Pose-guided Multi-stage Contrastive Regression [25.657978409890973]
Action Quality Assessment (AQA) aims at automatic and fair evaluation of athletic performance. Current methods focus on segmenting video into fixed frames, which disrupts the temporal continuity of sub-actions. We propose a novel action quality assessment method through hierarchically pose-guided multi-stage contrastive regression.
arXiv Detail & Related papers (2025-01-07T10:20:16Z)
Empowering Large Language Model for Continual Video Question Answering with Collaborative Prompting [15.161997580529075]
This paper explores the novel challenge of VideoQA within a continual learning framework. We propose Collaborative Prompting (ColPro), which integrates specific question constraint prompting, knowledge acquisition prompting, and visual temporal awareness prompting. Experimental results on the NExT-QA and DramaQA datasets show that ColPro achieves superior performance compared to existing approaches.
arXiv Detail & Related papers (2024-10-01T15:07:07Z)
Interpretable Long-term Action Quality Assessment [12.343701556374556]
Long-term Action Quality Assessment (AQA) evaluates the execution of activities in videos. Current AQA methods produce a single score by averaging clip features. Long-term videos pose additional difficulty due to the complexity and diversity of actions.
arXiv Detail & Related papers (2024-08-21T15:09:09Z)
KaPQA: Knowledge-Augmented Product Question-Answering [59.096607961704656]
We introduce two product question-answering (QA) datasets focused on Adobe Acrobat and Photoshop products. We also propose a novel knowledge-driven RAG-QA framework to enhance the performance of the models in the product QA task.
arXiv Detail & Related papers (2024-07-22T22:14:56Z)
GAIA: Rethinking Action Quality Assessment for AI-Generated Videos [56.047773400426486]
Action quality assessment (AQA) algorithms predominantly focus on actions from real specific scenarios and are pre-trained with normative action features. We construct GAIA, a Generic AI-generated Action dataset, by conducting a large-scale subjective evaluation from a novel causal reasoning-based perspective. Results show that traditional AQA methods, action-related metrics in recent T2V benchmarks, and mainstream video quality methods perform poorly with an average SRCC of 0.454, 0.191, and 0.519, respectively.
arXiv Detail & Related papers (2024-06-10T08:18:07Z)
Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling [31.696222064667243]
Action Quality Assessment (AQA) is a task that tries to answer how well an action is carried out. Existing works on AQA assume that all the training data are visible for training at one time, but do not enable continual learning. We propose a unified model to learn AQA tasks sequentially without forgetting.
arXiv Detail & Related papers (2023-09-29T10:06:28Z)
Solving Continuous Control via Q-learning [54.05120662838286]
We show that a simple modification of deep Q-learning largely alleviates issues with actor-critic methods. By combining bang-bang action discretization with value decomposition, framing single-agent control as cooperative multi-agent reinforcement learning (MARL), this simple critic-only approach matches the performance of state-of-the-art continuous actor-critic methods.
arXiv Detail & Related papers (2022-10-22T22:55:50Z)
Action Quality Assessment with Temporal Parsing Transformer [84.1272079121699]
Action Quality Assessment (AQA) is important for action understanding, and resolving the task poses unique challenges due to subtle visual differences. We propose a temporal parsing transformer to decompose the holistic feature into temporal part-level representations. Our proposed method outperforms prior work on three public AQA benchmarks by a considerable margin.
arXiv Detail & Related papers (2022-07-19T13:29:05Z)
Auto-Encoding Score Distribution Regression for Action Quality Assessment [41.45638722765149]
Action quality assessment (AQA) from videos is a challenging vision task. Traditionally, the AQA task is treated as a regression problem to learn the underlying mappings between videos and action scores. We develop the Distribution Auto-Encoder (DAE) to address the problems with this formulation.
arXiv Detail & Related papers (2021-11-22T07:30:04Z)
Group-aware Contrastive Regression for Action Quality Assessment [85.43203180953076]
We show that the relations among videos can provide important clues for more accurate action quality assessment. Our approach outperforms previous methods by a large margin and establishes a new state of the art on all three benchmarks.
arXiv Detail & Related papers (2021-08-17T17:59:39Z)
Temporal Context Aggregation for Video Retrieval with Contrastive Learning [81.12514007044456]
We propose TCA, a video representation learning framework that incorporates long-range temporal information between frame-level features. The proposed method shows a significant performance advantage (17% mAP on FIVR-200K) over state-of-the-art methods with video-level features.
arXiv Detail & Related papers (2020-08-04T05:24:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.