A Spatial-Temporal Transformer based Framework For Human Pose Assessment
And Correction in Education Scenarios
- URL: http://arxiv.org/abs/2311.00401v1
- Date: Wed, 1 Nov 2023 09:53:38 GMT
- Title: A Spatial-Temporal Transformer based Framework For Human Pose Assessment
And Correction in Education Scenarios
- Authors: Wenyang Hu, Kai Liu, Libin Liu, Huiliang Shang
- Abstract summary: The framework comprises skeletal tracking, pose estimation, posture assessment, and posture correction modules.
We create a pose correction method to provide corrective feedback in the form of visual aids.
Results show that our model can effectively measure and comment on the quality of students' actions.
- Score: 6.146739983645156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose assessment and correction play a crucial role in applications
across various fields, including computer vision, robotics, sports analysis,
healthcare, and entertainment. In this paper, we propose a Spatial-Temporal
Transformer based Framework (STTF) for human pose assessment and correction in
education scenarios such as physical exercises and science experiments. The
framework comprises skeletal tracking, pose estimation, posture assessment,
and posture correction modules to educate students with professional,
quick-to-fix feedback. We also create a pose correction method to provide
corrective feedback in the form of visual aids. We test the framework with our
own dataset. It comprises (a) new recordings of five exercises, (b) existing
recordings found on the internet of the same exercises, and (c) corrective
feedback on the recordings by professional athletes and teachers. Results show
that our model can effectively measure and comment on the quality of students'
actions. The STTF leverages the power of transformer models to capture spatial
and temporal dependencies in human poses, enabling accurate assessment and
effective correction of students' movements.
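The abstract gives no architectural details, so the following is only a minimal sketch of what "capturing spatial and temporal dependencies" can mean for a skeleton sequence, not the authors' STTF. It assumes a factorized design: self-attention across joints within each frame, followed by self-attention across frames for each joint. The names `self_attention` and `spatial_temporal_block` are hypothetical, and the learned query/key/value projections of a real transformer are replaced by identity maps to keep the example short.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections
    (an assumption for brevity; real models learn these projections).
    tokens: list of equal-length float vectors; returns the same shape."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def spatial_temporal_block(seq):
    """seq[t][j] is the feature vector (e.g. 2D coordinates) of joint j at frame t."""
    # Spatial step: within each frame, every joint attends to all joints.
    spatial = [self_attention(frame) for frame in seq]
    n_frames, n_joints = len(spatial), len(spatial[0])
    # Temporal step: each joint attends to its own trajectory across frames.
    per_joint = [
        self_attention([spatial[t][j] for t in range(n_frames)])
        for j in range(n_joints)
    ]
    # Transpose back to [frame][joint][feature].
    return [[per_joint[j][t] for j in range(n_joints)] for t in range(n_frames)]

# Toy input: 4 frames, 3 joints, 2D coordinates (made-up values).
seq = [
    [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.1], [1.0, 0.2], [1.1, 1.0]],
    [[0.0, 0.2], [1.0, 0.4], [1.2, 1.0]],
    [[0.0, 0.3], [1.0, 0.6], [1.3, 1.0]],
]
out = spatial_temporal_block(seq)
```

The output keeps the input layout (frames, joints, features), so an assessment head or further blocks could be stacked on top of it. Whether STTF factorizes attention this way or attends jointly over space and time is not stated in the abstract.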
Related papers
- TEDRA: Text-based Editing of Dynamic and Photoreal Actors [59.480513384611804]
TEDRA is the first method allowing text-based edits of an avatar.
We train a model to create a controllable and high-fidelity digital replica of the real actor.
We modify the dynamic avatar based on a provided text prompt.
arXiv Detail & Related papers (2024-08-28T17:59:02Z)
- Learning Action and Reasoning-Centric Image Editing from Videos and Simulations [45.637947364341436]
The AURORA dataset is a collection of high-quality training data, human-annotated and curated from videos and simulation engines.
We evaluate an AURORA-finetuned model on a new expert-curated benchmark covering 8 diverse editing tasks.
Our model significantly outperforms previous editing models as judged by human raters.
arXiv Detail & Related papers (2024-07-03T19:36:33Z)
- D-STGCNT: A Dense Spatio-Temporal Graph Conv-GRU Network based on
transformer for assessment of patient physical rehabilitation [0.3626013617212666]
This paper introduces a new graph-based model for assessing rehabilitation exercises.
Dense connections and GRU mechanisms are used to rapidly process large 3D skeleton inputs.
The evaluation of our proposed approach on the KIMORE and UI-PRMD datasets highlighted its potential.
arXiv Detail & Related papers (2023-12-21T00:38:31Z)
- Adaptive Correspondence Scoring for Unsupervised Medical Image Registration [9.294341405888158]
Existing methods rely on image reconstruction as the primary supervision signal.
We propose an adaptive framework that re-weights the error residuals with a correspondence scoring map during training.
Our framework consistently outperforms other methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-12-01T01:11:22Z)
- 3D Pose Based Feedback for Physical Exercises [87.35086507661227]
We introduce a learning-based framework that identifies the mistakes made by a user.
Our framework does not rely on hard-coded rules; instead, it learns them from data.
Our approach yields 90.9% mistake identification accuracy and successfully corrects 94.2% of the mistakes.
arXiv Detail & Related papers (2022-08-05T16:15:02Z)
- Domain Knowledge-Informed Self-Supervised Representations for Workout
Form Assessment [12.040334568268445]
We propose to learn exercise-specific representations from unlabeled samples.
In particular, our domain knowledge-informed self-supervised approaches exploit the harmonic motion of the exercise actions.
We show that our self-supervised representations outperform off-the-shelf 2D- and 3D-pose estimators.
arXiv Detail & Related papers (2022-02-28T18:40:02Z)
- FixMyPose: Pose Correctional Captioning and Retrieval [67.20888060019028]
We introduce a new captioning dataset named FixMyPose to address automated pose correction systems.
We collect descriptions of correcting a "current" pose to look like a "target" pose.
To avoid ML biases, we maintain a balance across characters with diverse demographics.
arXiv Detail & Related papers (2021-04-04T21:45:44Z)
- Learning to Reweight with Deep Interactions [104.68509759134878]
We propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model.
Experiments on image classification with clean/noisy labels and neural machine translation empirically demonstrate that our algorithm makes significant improvements over previous methods.
arXiv Detail & Related papers (2020-07-09T09:06:31Z)
- Motion Pyramid Networks for Accurate and Efficient Cardiac Motion
Estimation [51.72616167073565]
We propose Motion Pyramid Networks, a novel deep learning-based approach for accurate and efficient cardiac motion estimation.
We predict and fuse a pyramid of motion fields from multiple scales of feature representations to generate a more refined motion field.
We then use a novel cyclic teacher-student training strategy to make the inference end-to-end and further improve the tracking performance.
arXiv Detail & Related papers (2020-06-28T21:03:19Z)
- Deformation-aware Unpaired Image Translation for Pose Estimation on
Laboratory Animals [56.65062746564091]
We aim to capture the pose of neuroscience model organisms, without using any manual supervision, to study how neural circuits orchestrate behaviour.
Our key contribution is the explicit and independent modeling of appearance, shape and poses in an unpaired image translation framework.
We demonstrate improved pose estimation accuracy on Drosophila melanogaster (fruit fly), Caenorhabditis elegans (worm) and Danio rerio (zebrafish).
arXiv Detail & Related papers (2020-01-23T15:34:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.