Related papers: FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment

URL: http://arxiv.org/abs/2503.23911v1
Date: Mon, 31 Mar 2025 10:02:29 GMT
Title: FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment
Authors: Ruisheng Han, Kanglei Zhou, Amir Atapour-Abarghouei, Xiaohui Liang, Hubert P. H. Shum
Abstract summary: We introduce FineCausal, a novel causal-based framework that achieves state-of-the-art performance on the FineDiving-HM dataset. Our approach leverages a Graph Attention Network-based causal intervention module to disentangle human-centric cues from background confounders. Our dual-module strategy enables FineCausal to generate detailed spatio-temporal representations that not only achieve state-of-the-art scoring performance but also provide transparent, interpretable feedback on which features drive the assessment.
Score: 13.936546696317617
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Action quality assessment (AQA) is critical for evaluating athletic performance, informing training strategies, and ensuring safety in competitive sports. However, existing deep learning approaches often operate as black boxes and are vulnerable to spurious correlations, limiting both their reliability and interpretability. In this paper, we introduce FineCausal, a novel causal-based framework that achieves state-of-the-art performance on the FineDiving-HM dataset. Our approach leverages a Graph Attention Network-based causal intervention module to disentangle human-centric foreground cues from background confounders, and incorporates a temporal causal attention module to capture fine-grained temporal dependencies across action stages. This dual-module strategy enables FineCausal to generate detailed spatio-temporal representations that not only achieve state-of-the-art scoring performance but also provide transparent, interpretable feedback on which features drive the assessment. Despite its strong performance, FineCausal requires extensive expert knowledge to define causal structures and depends on high-quality annotations, challenges that we discuss and address as future research directions. Code is available at https://github.com/Harrison21/FineCausal.

Related papers

CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering [53.7094431951084]
Knowledge-based visual question answering (KB-VQA) demonstrates significant potential for handling knowledge-intensive tasks. Conflicts arise between static parametric knowledge in vision language models and dynamically retrieved information. We propose CC-VQA as a training-free, conflict- and correlation-aware method for KB-VQA.
arXiv Detail & Related papers (2026-02-27T11:56:26Z)
CoG: Controllable Graph Reasoning via Relational Blueprints and Failure-Aware Refinement over Knowledge Graphs [53.199517625701475]
CoG is a training-free framework inspired by Dual-Process Theory that mimics the interplay between intuition and deliberation. CoG significantly outperforms state-of-the-art approaches in both accuracy and efficiency.
arXiv Detail & Related papers (2026-01-16T07:27:40Z)
CaFlow: Enhancing Long-Term Action Quality Assessment with Causal Counterfactual Flow [25.3923767595433]
Action Quality Assessment (AQA) predicts fine-grained execution scores from action videos. Long-term AQA, as in figure skating or rhythmic gymnastics, is especially challenging since it requires modeling extended temporal dynamics. We propose CaFlow, a unified framework that integrates counterfactual de-confounding with bidirectional time-conditioned flow.
arXiv Detail & Related papers (2025-11-26T18:25:41Z)
Continual Action Quality Assessment via Adaptive Manifold-Aligned Graph Regularization [53.82400605816587]
Action Quality Assessment (AQA) quantifies human actions in videos, supporting applications in sports scoring, rehabilitation, and skill evaluation. A major challenge lies in the non-stationary nature of quality distributions in real-world scenarios. We introduce Continual AQA (CAQA), which equips AQA with Continual Learning capabilities to handle evolving distributions.
arXiv Detail & Related papers (2025-10-08T10:09:47Z)
Temporalizing Confidence: Evaluation of Chain-of-Thought Reasoning with Signal Temporal Logic [0.12499537119440243]
We propose a structured framework that models stepwise confidence as a temporal signal and evaluates it using Signal Temporal Logic (STL). In particular, we define formal STL-based constraints to capture desirable temporal properties and compute scores that serve as structured, interpretable confidence estimates. Our approach consistently improves calibration metrics and provides more reliable uncertainty estimates than conventional confidence aggregation and post-hoc calibration.
arXiv Detail & Related papers (2025-06-09T21:21:12Z)
How Breakable Is Privacy: Probing and Resisting Model Inversion Attacks in Collaborative Inference [9.092229145160763]
Collaborative inference improves computational efficiency for edge devices by transmitting intermediate features to cloud models. There is no established criterion for assessing model inversion attacks (MIAs). We propose SiftFunnel, a privacy-preserving framework to resist MIAs while maintaining usability.
arXiv Detail & Related papers (2025-01-01T13:00:01Z)
FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment [30.601466217201253]
Existing action quality assessment (AQA) methods mainly learn deep representations at the video level for scoring diverse actions. Lacking a fine-grained understanding of actions in videos, they suffer severely from low credibility and interpretability and are thus insufficient for stringent applications, such as Olympic diving events. We argue that a fine-grained understanding of actions requires the model to perceive and parse actions in both time and space, which is also the key to the credibility and interpretability of the AQA technique.
arXiv Detail & Related papers (2024-05-11T02:57:16Z)
Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework. Our importance weights are obtained by optimizing the KL-divergence regularized loss function. Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges. We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability. Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z)
Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness. We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks. Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent. Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
On Feature Learning in the Presence of Spurious Correlations [45.86963293019703]
We show that the quality of learned feature representations is greatly affected by design decisions beyond the method. We significantly improve upon the best results reported in the literature on the popular Waterbirds, Celeb hair color prediction and WILDS-FMOW problems.
arXiv Detail & Related papers (2022-10-20T16:10:28Z)
ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries. We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations. Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable. We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z)
Unsupervised Continual Learning via Self-Adaptive Deep Clustering Approach [20.628084936538055]
Knowledge Retention in Self-Adaptive Deep Continual Learner (KIERA) is proposed in this paper. KIERA is developed from the notion of a flexible deep clustering approach possessing an elastic network structure to cope with changing environments in a timely manner.
arXiv Detail & Related papers (2021-06-28T10:37:14Z)
More Than Just Attention: Learning Cross-Modal Attentions with Contrastive Constraints [63.08768589044052]
We propose Contrastive Content Re-sourcing (CCR) and Contrastive Content Swapping (CCS) constraints to address this limitation. CCR and CCS constraints supervise the training of attention models in a contrastive learning manner without requiring explicit attention annotations. Experiments on both Flickr30k and MS-COCO datasets demonstrate that integrating these attention constraints into two state-of-the-art attention-based models improves the model performance.
arXiv Detail & Related papers (2021-05-20T08:48:10Z)
Spectrum-Guided Adversarial Disparity Learning [52.293230153385124]
We propose a novel end-to-end knowledge-directed adversarial learning framework. It portrays the class-conditioned intraclass disparity using two competitive encoding distributions and learns the purified latent codes by denoising the learned disparity. The experiments on four HAR benchmark datasets demonstrate the robustness and generalization of our proposed methods over a set of state-of-the-art methods.
arXiv Detail & Related papers (2020-07-14T05:46:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.