Related papers: Unlabeled Action Quality Assessment Based on Multi-dimensional Adaptive Constrained Dynamic Time Warping

URL: http://arxiv.org/abs/2410.14161v2
Date: Sun, 27 Oct 2024 09:23:05 GMT
Title: Unlabeled Action Quality Assessment Based on Multi-dimensional Adaptive Constrained Dynamic Time Warping
Authors: Renguang Chen, Guolong Zheng, Xu Yang, Zhide Chen, Jiwu Shu, Wencheng Yang, Kexin Zhu, Chen Feng
Abstract summary: This paper presents an unlabeled Multi-Dimensional Exercise Distance Adaptive Constrained Dynamic Time Warping (MED-ACDTW) method for action quality assessment. Our approach uses both 2D and 3D spatial dimensions, along with multiple human body features, to compare features from template and test videos. The adaptive constraint scheme enhances the discriminability of action quality assessment by approximately 30%.
Score: 12.639728404278255
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The growing popularity of online sports and exercise necessitates effective methods for evaluating the quality of online exercise executions. Previous action quality assessment methods, which relied on labeled scores from motion videos, exhibited slightly lower accuracy and discriminability. This limitation hindered their rapid application to newly added exercises. To address this problem, this paper presents an unlabeled Multi-Dimensional Exercise Distance Adaptive Constrained Dynamic Time Warping (MED-ACDTW) method for action quality assessment. Our approach uses an athletic version of DTW to compare features from template and test videos, eliminating the need for score labels during training. The result shows that utilizing both 2D and 3D spatial dimensions, along with multiple human body features, improves the accuracy by 2-3% compared to using either 2D or 3D pose estimation alone. Additionally, employing MED for score calculation enhances the precision of frame distance matching, which significantly boosts overall discriminability. The adaptive constraint scheme enhances the discriminability of action quality assessment by approximately 30%. Furthermore, to address the absence of a standardized perspective in sports class evaluations, we introduce a new dataset called BGym.
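
To make the comparison step in the abstract concrete, the following is a minimal sketch of classic dynamic time warping with a fixed Sakoe-Chiba band, applied to per-frame feature vectors from a template and a test sequence. It is not the authors' MED-ACDTW: the function name `constrained_dtw`, the fixed `window` parameter, and the plain Euclidean frame distance are assumptions made for illustration, whereas the paper describes an adaptive constraint and its own MED distance.

```python
import math


def constrained_dtw(template, test, window):
    """Classic DTW distance restricted to a Sakoe-Chiba band.

    template, test: sequences of equal-length feature vectors (one per frame).
    window: hypothetical fixed band half-width, in frames (an assumption;
            the paper's constraint is adaptive rather than fixed).
    """
    n, m = len(template), len(test)
    # The band must be at least |n - m| wide, or no path reaches the end.
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            # Euclidean frame distance stands in for the paper's MED.
            d = math.dist(template[i - 1], test[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # template frame advances alone
                cost[i][j - 1],      # test frame advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    template = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    slower = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(constrained_dtw(template, slower, window=1))
```

A smaller accumulated distance means the test sequence follows the template more closely. Mapping that distance to a quality score is a separate step that this sketch does not attempt.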

Related papers

Real-Time Feedback and Benchmark Dataset for Isometric Pose Evaluation [1.6358813089575626]
We present a real-time feedback system for assessing poses. Our contributions include the release of the largest multiclass isometric exercise video dataset to date. Results enhance the feasibility of intelligent and personalized exercise training systems for home workouts.
arXiv Detail & Related papers (2025-06-13T13:33:59Z)
Multi-person Physics-based Pose Estimation for Combat Sports [0.689728655482787]
We propose a novel framework for accurate 3D human pose estimation in combat sports using sparse multi-camera setups. Our method integrates robust multi-view 2D pose tracking via a transformer-based top-down approach. We further enhance pose realism and robustness by introducing a multi-person physics-based trajectory optimization step.
arXiv Detail & Related papers (2025-04-11T00:08:14Z)
Action Quality Assessment via Hierarchical Pose-guided Multi-stage Contrastive Regression [25.657978409890973]
Action Quality Assessment (AQA) aims at automatic and fair evaluation of athletic performance. Current methods focus on segmenting video into fixed frames, which disrupts the temporal continuity of sub-actions. We propose a novel action quality assessment method through hierarchically pose-guided multi-stage contrastive regression.
arXiv Detail & Related papers (2025-01-07T10:20:16Z)
ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks. We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation. Our purely convolutional architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
DynaWeightPnP: Toward global real-time 3D-2D solver in PnP without correspondences [7.191124861153032]
This paper addresses a special Perspective-n-Point (PnP) problem: estimating the optimal pose to align 3D and 2D shapes in real-time without correspondences. Experiments were conducted on a typical case, that is, a 3D-2D centerline registration task within Endovascular Image-Guided Interventions. Results demonstrated that the proposed algorithm achieves registration processing rates of 60 Hz (without post-refinement) and 31 Hz (with post-refinement) with competitive accuracy comparable to existing methods.
arXiv Detail & Related papers (2024-09-27T05:31:33Z)
Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation [68.75387874066647]
We propose an Uncertainty-Aware testing-time optimization framework for 3D human pose estimation. Our approach outperforms the previous best result by a large margin of 4.5% on Human3.6M.
arXiv Detail & Related papers (2024-02-04T04:28:02Z)
FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data. We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC). Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
Geometry-Aware Video Quality Assessment for Dynamic Digital Human [56.17852258306602]
We propose a novel no-reference (NR) geometry-aware video quality assessment method for DDH-QA challenge. The proposed method achieves state-of-the-art performance on the DDH-QA database.
arXiv Detail & Related papers (2023-10-24T16:34:03Z)
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling [38.07637524378327]
Unsupervised domain adaptation (DA) with the aid of pseudo labeling techniques has emerged as a crucial approach for domain-adaptive 3D object detection. Existing DA methods suffer from a substantial drop in performance when applied to a multi-class training setting. We propose a novel ReDB framework tailored for learning to detect all classes at once.
arXiv Detail & Related papers (2023-07-16T04:34:11Z)
LocATe: End-to-end Localization of Actions in 3D with Transformers [91.28982770522329]
LocATe is an end-to-end approach that jointly localizes and recognizes actions in a 3D sequence. Unlike transformer-based object-detection and classification models which consider image or patch features as input, LocATe's transformer model is capable of capturing long-term correlations between actions in a sequence. We introduce a new, challenging, and more realistic benchmark dataset, BABEL-TAL-20 (BT20), where the performance of state-of-the-art methods is significantly worse.
arXiv Detail & Related papers (2022-03-21T03:35:32Z)
Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues. No human annotations are involved in our framework during the whole training process. Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
Reduced Reference Perceptual Quality Model and Application to Rate Control for 3D Point Cloud Compression [61.110938359555895]
In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bit rate. We propose a linear perceptual quality model whose variables are the V-PCC geometry and color quantization parameters. Subjective quality tests with 400 compressed 3D point clouds show that the proposed model correlates well with the mean opinion score. We show that for the same target bit rate, rate-distortion optimization based on the proposed model offers higher perceptual quality than rate-distortion optimization based on exhaustive search with a point-to-point objective quality metric.
arXiv Detail & Related papers (2020-11-25T12:42:02Z)
Multi-Scale Networks for 3D Human Pose Estimation with Inference Stage Optimization [33.02708860641971]
Estimating 3D human poses from a monocular video is still a challenging task. The performance of many existing methods drops when the target person is occluded by other objects, or the motion is too fast/slow relative to the scale and speed of the training data. We introduce a spatio-temporal network for robust 3D human pose estimation.
arXiv Detail & Related papers (2020-10-13T15:24:28Z)
A review of 3D human pose estimation algorithms for markerless motion capture [0.0]
We review the leading human pose estimation methods of the past five years, focusing on metrics, benchmarks and method structures. We propose a taxonomy based on accuracy, speed and robustness that we use to classify the methods and derive directions for future research.
arXiv Detail & Related papers (2020-10-13T15:07:01Z)
3D Human Pose Estimation using Spatio-Temporal Networks with Explicit Occlusion Training [40.933783830017035]
Estimating 3D poses from a monocular video is still a challenging task, despite the significant progress that has been made in recent years. We introduce a spatio-temporal video network for robust 3D human pose estimation. We apply multi-scale spatial features for 2D joints or keypoints prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate 3D joints or keypoints.
arXiv Detail & Related papers (2020-04-07T09:12:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.