AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model
- URL: http://arxiv.org/abs/2512.03509v1
- Date: Wed, 03 Dec 2025 07:06:06 GMT
- Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model
- Authors: Kwaku Opoku-Ware, Gideon Opoku
- Abstract summary: We propose a proof-of-concept framework that integrates YOLOv8 and v11 for dancer detection with the Segment Anything Model (SAM) for precise segmentation. Our approach identifies dancers within video frames, counts discrete dance steps, calculates spatial coverage patterns, and measures rhythm consistency across performance sequences.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a preliminary investigation into automated dance movement analysis using contemporary computer vision techniques. We propose a proof-of-concept framework that integrates YOLOv8 and v11 for dancer detection with the Segment Anything Model (SAM) for precise segmentation, enabling the tracking and quantification of dancer movements in video recordings without specialized equipment or markers. Our approach identifies dancers within video frames, counts discrete dance steps, calculates spatial coverage patterns, and measures rhythm consistency across performance sequences. Testing this framework on a single 49-second recording of Ghanaian AfroBeats dance demonstrates technical feasibility, with the system achieving approximately 94% detection precision and 89% recall on manually inspected samples. The pixel-level segmentation provided by SAM, achieving approximately 83% intersection-over-union with visual inspection, enables motion quantification that captures body configuration changes beyond what bounding-box approaches can represent. Analysis of this preliminary case study indicates that the dancer classified as primary by our system executed 23% more steps with 37% higher motion intensity and utilized 42% more performance space compared to dancers classified as secondary. However, this work represents an early-stage investigation with substantial limitations including single-video validation, absence of systematic ground truth annotations, and lack of comparison with existing pose estimation methods. We present this framework to demonstrate technical feasibility, identify promising directions for quantitative dance metrics, and establish a foundation for future systematic validation studies.
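As a rough illustration of the kind of pipeline the abstract describes, the sketch below chains an off-the-shelf YOLOv8 person detector with box-prompted SAM segmentation over a video and accumulates simple proxy metrics. The checkpoint and video file names, the centroid-displacement heuristic standing in for "step" counting, and the mask-union measure standing in for "spatial coverage" are all illustrative assumptions; the paper does not publish its implementation, and its reported precision/recall figures come from manual inspection, not from this code.

```python
# Illustrative sketch only -- not the authors' released code. Assumes the
# ultralytics and segment-anything packages, a locally downloaded SAM
# checkpoint, and a hypothetical input clip "afrobeats_clip.mp4".
import cv2
import numpy as np
from ultralytics import YOLO
from segment_anything import sam_model_registry, SamPredictor

detector = YOLO("yolov8n.pt")  # any YOLOv8/v11 detector with a "person" class
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed local path
predictor = SamPredictor(sam)

cap = cv2.VideoCapture("afrobeats_clip.mp4")  # hypothetical input video
coverage = None      # running union of the dancer's mask pixels
centroids = []       # per-frame mask centroid of the first detected dancer

while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # 1) Detect people with YOLO (COCO class 0 = person).
    det = detector(rgb, verbose=False)[0]
    person = det.boxes.cls.cpu().numpy() == 0
    boxes = det.boxes.xyxy.cpu().numpy()[person]
    if len(boxes) == 0:
        continue

    # 2) Prompt SAM with one person box to get a pixel-level mask.
    predictor.set_image(rgb)
    masks, _, _ = predictor.predict(box=boxes[0], multimask_output=False)
    mask = masks[0]

    # 3) Accumulate proxy metrics (illustrative, not the paper's definitions).
    coverage = mask if coverage is None else (coverage | mask)
    ys, xs = np.nonzero(mask)
    if len(xs):
        centroids.append((xs.mean(), ys.mean()))

cap.release()

# Spatial coverage proxy: fraction of the frame the mask ever occupied.
print("coverage:", coverage.mean() if coverage is not None else 0.0)

# Step-count proxy: peaks in frame-to-frame centroid displacement.
c = np.array(centroids)
if len(c) > 2:
    speed = np.linalg.norm(np.diff(c, axis=0), axis=1)
    peaks = (speed[1:-1] > speed[:-2]) & (speed[1:-1] > speed[2:]) & (speed[1:-1] > speed.mean())
    print("step-like peaks:", int(peaks.sum()))
```

Running SAM on every frame dominates the cost of a pipeline like this; a practical variant would segment only keyframes or reuse masks across neighboring frames, which fits the offline, proof-of-concept analysis setting the abstract describes.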
Related papers
- Emotion Recognition in Contemporary Dance Performances Using Laban Movement Analysis [0.562479170374811]
Our approach extracts expressive characteristics from 3D keypoint data of professional dancers performing contemporary dance under various emotional states. We train multiple classifiers, including Random Forests and Support Vector Machines. Overall, our study improves emotion recognition in contemporary dance and offers promising applications in performance analysis, dance training, and human-computer interaction, with a highest accuracy of 96.85%.
arXiv Detail & Related papers (2025-04-29T20:17:27Z) - SmurfCat at SemEval-2024 Task 6: Leveraging Synthetic Data for Hallucination Detection [51.99159169107426]
We present our novel systems developed for the SemEval-2024 hallucination detection task.
Our investigation spans a range of strategies to compare model predictions with reference standards.
We introduce three distinct methods that exhibit strong performance metrics.
arXiv Detail & Related papers (2024-04-09T09:03:44Z) - Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment [87.20240797625648]
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
arXiv Detail & Related papers (2024-03-27T17:57:02Z) - Component attention network for multimodal dance improvisation recognition [4.706373333495905]
This paper explores the application and performance of multimodal fusion methods for human motion recognition in the context of dance improvisation.
We propose an attention-based model, component attention network (CANet), for multimodal fusion on three levels: 1) feature fusion with CANet, 2) model fusion with CANet and graph convolutional network (GCN), and 3) late fusion with a voting strategy.
arXiv Detail & Related papers (2023-08-24T15:04:30Z) - Dance with You: The Diversity Controllable Dancer Generation via Diffusion Models [27.82646255903689]
We introduce a novel multi-dancer synthesis task called partner dancer generation.
The core of this task is to ensure the controllable diversity of the generated partner dancer.
To address the lack of multi-person datasets, we introduce AIST-M, a new dataset for partner dancer generation.
arXiv Detail & Related papers (2023-08-23T15:54:42Z) - PMI Sampler: Patch Similarity Guided Frame Selection for Aerial Action Recognition [52.78234467516168]
We introduce the concept of patch mutual information (PMI) score to quantify the motion bias between adjacent frames.
We present an adaptive frame selection strategy using shifted leaky ReLu and cumulative distribution function.
Our method achieves a relative improvement of 2.2 - 13.8% in top-1 accuracy on UAV-Human, 6.8% on NEC Drone, and 9.0% on Diving48 datasets.
arXiv Detail & Related papers (2023-04-14T00:01:11Z) - GaitMM: Multi-Granularity Motion Sequence Learning for Gait Recognition [6.877671230651998]
Gait recognition aims to identify individual-specific walking patterns by observing the different periodic movements of each body part.
Most existing methods treat each part equally and fail to account for the data redundancy caused by the different step frequencies and sampling rates of gait.
In this study, we propose a multi-granularity motion representation (GaitMM) for gait sequence learning.
arXiv Detail & Related papers (2022-09-18T04:07:33Z) - BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z) - Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field.
It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations.
Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
arXiv Detail & Related papers (2021-01-11T04:20:30Z) - Ballroom Dance Movement Recognition Using a Smart Watch [0.0]
We present a whole body movement detection study using a single smart watch in the context of ballroom dancing.
Deep learning representations are used to classify well-defined sequences of movements.
The classification accuracy of 85.95% was improved to 92.31% by modeling a dance as a first-order Markov chain of figures.
arXiv Detail & Related papers (2020-08-23T22:36:28Z) - Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation [57.68890534164427]
In this work, we ask if we may leverage semi-supervised learning in unlabeled video sequences and extra images to improve the performance on urban scene segmentation.
We simply predict pseudo-labels for the unlabeled data and train subsequent models with both human-annotated and pseudo-labeled data.
Our Naive-Student model, trained with such simple yet effective iterative semi-supervised learning, attains state-of-the-art results at all three Cityscapes benchmarks.
arXiv Detail & Related papers (2020-05-20T18:00:05Z)