BoxMAC -- A Boxing Dataset for Multi-label Action Classification
- URL: http://arxiv.org/abs/2412.18204v2
- Date: Mon, 17 Feb 2025 10:01:36 GMT
- Title: BoxMAC -- A Boxing Dataset for Multi-label Action Classification
- Authors: Shashikanta Sahoo,
- Abstract summary: BoxMAC is a real-world boxing dataset featuring 15 professional boxers and 13 distinct action labels. We propose a novel architecture for jointly recognizing multiple actions in both individual images and videos. BoxMAC can serve as a valuable resource for the advancement of boxing as a sport.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In competitive combat sports like boxing, analyzing a boxer's performance statistics is crucial for evaluating the quantity and variety of punches delivered during bouts. These statistics provide valuable data and feedback, which are routinely used for coaching and performance enhancement. We introduce BoxMAC, a real-world boxing dataset featuring 15 professional boxers and encompassing 13 distinct action labels. Comprising over 60,000 frames, our dataset has been meticulously annotated for multiple actions per frame with input from a boxing coach. Since two boxers can execute different punches within a single timestamp, this problem falls under the domain of multi-label action classification. We propose a novel architecture for jointly recognizing multiple actions in both individual images and videos. We investigate baselines using deep neural network architectures to address both tasks. We believe that BoxMAC will enable researchers and practitioners to develop and evaluate more efficient models for performance analysis. With its realistic and diverse nature, BoxMAC can serve as a valuable resource for the advancement of boxing as a sport.
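Because several punches can occur in the same frame, the problem is multi-label rather than multi-class: each action label gets an independent sigmoid output scored with binary cross-entropy, instead of a single softmax over mutually exclusive classes. A minimal sketch of that setup in plain Python (the label names and logit values below are illustrative only, not taken from BoxMAC's 13 labels):

```python
import math

# Hypothetical subset of punch labels, for illustration only.
PUNCH_LABELS = ["jab", "cross", "hook", "uppercut"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_bce(logits, targets):
    """Binary cross-entropy averaged over independent label outputs.

    Unlike softmax cross-entropy, each label is scored on its own,
    so several actions can be active in the same frame.
    """
    loss = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss / len(logits)

def predict(logits, threshold=0.5):
    """Threshold each sigmoid output independently."""
    return [name for name, z in zip(PUNCH_LABELS, logits)
            if sigmoid(z) >= threshold]

# A frame where one boxer throws a jab while the other throws a hook:
logits = [2.0, -1.5, 1.2, -3.0]   # one logit per label, from some model
targets = [1, 0, 1, 0]            # two labels active simultaneously
print(predict(logits))            # → ['jab', 'hook']
print(multilabel_bce(logits, targets))
```

In a real pipeline the logits would come from a CNN or video backbone, and frameworks provide this loss directly (e.g. a sigmoid-plus-BCE loss computed from raw logits for numerical stability).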
Related papers
- BoxMind: Closed-loop AI strategy optimization for elite boxing validated in the 2024 Olympics [25.895403161230515]
BoxMind is a closed-loop AI expert system for strategy optimization in elite boxing. It was validated through a closed-loop deployment during the 2024 Paris Olympics.
arXiv Detail & Related papers (2026-01-16T18:14:46Z) - BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization [1.623267727687624]
We present a comprehensive, well-annotated video dataset tailored for punch detection and classification in boxing. The dataset comprises 6,915 high-quality punch clips categorized into six distinct punch types. This contribution aims to accelerate progress in movement analysis, automated coaching, and performance assessment within boxing and related domains.
arXiv Detail & Related papers (2025-11-20T16:37:07Z) - Scope Meets Screen: Lessons Learned in Designing Composite Visualizations for Marksmanship Training Across Skill Levels [3.345437353879255]
We present a shooting visualization system and evaluate its perceived effectiveness for both novice and expert shooters. The insights gained from this design study point to the broader value of integrating first-person video with visual analytics for coaching.
arXiv Detail & Related papers (2025-07-01T00:16:41Z) - Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection [57.26265276035267]
Wholly-WOOD is a weakly-supervised OOD framework capable of wholly leveraging various labeling forms.
By only using HBox for training, our Wholly-WOOD achieves performance very close to that of the RBox-trained counterpart on remote sensing.
arXiv Detail & Related papers (2025-02-13T16:34:59Z) - FACTS: Fine-Grained Action Classification for Tactical Sports [4.810476621219244]
Classifying fine-grained actions in fast-paced, close-combat sports such as fencing and boxing presents unique challenges. We introduce FACTS, a novel approach for fine-grained action recognition that processes raw video data directly. Our findings enhance training, performance analysis, and spectator engagement, setting a new benchmark for action classification in tactical sports.
arXiv Detail & Related papers (2024-12-21T03:00:25Z) - Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset [16.407837909069073]
We introduce the VideoBadminton dataset derived from high-quality badminton footage.
The introduction of VideoBadminton could not only serve for badminton action recognition but also provide a dataset for recognizing fine-grained actions.
arXiv Detail & Related papers (2024-03-19T02:52:06Z) - SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes [44.46768991505495]
We present a new large-scale multi-object tracking dataset in diverse sports scenes, coined as SportsMOT.
It consists of 240 video sequences, over 150K frames and over 1.6M bounding boxes collected from 3 sports categories, including basketball, volleyball and football.
We propose a new multi-object tracking framework, termed MixSort, introducing a MixFormer-like structure as an auxiliary association model to prevailing tracking-by-detection trackers.
arXiv Detail & Related papers (2023-04-11T12:07:31Z) - H2RBox: Horizonal Box Annotation is All You Need for Oriented Object
Detection [63.66553556240689]
Oriented object detection emerges in many applications from aerial images to autonomous driving.
Many existing detection benchmarks are annotated with horizontal bounding boxes only, which are also less costly than fine-grained rotated boxes.
This paper proposes a simple yet effective oriented object detection approach called H2RBox.
arXiv Detail & Related papers (2022-10-13T05:12:45Z) - P2ANet: A Dataset and Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos [64.57435509822416]
This work consists of 2,721 video clips collected from the broadcasting videos of professional table tennis matches in World Table Tennis Championships and Olympiads.
We formulate two sets of action detection problems -- action localization and action recognition.
The results confirm that P2ANet is still a challenging task and can be used as a special benchmark for dense action detection from videos.
arXiv Detail & Related papers (2022-07-26T08:34:17Z) - A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications [60.3327085463545]
We present a survey on video action recognition for sports analytics.
We introduce more than ten types of sports, including team sports such as football, basketball, volleyball, and hockey, and individual sports such as figure skating, gymnastics, table tennis, diving, and badminton.
We develop a toolbox using PaddlePaddle, which supports football, basketball, table tennis and figure skating action recognition.
arXiv Detail & Related papers (2022-06-02T13:19:36Z) - MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions [39.27858380391081]
This paper aims to present a new multi-person dataset of spatio-temporally localized atomic actions, coined as MultiSports.
We build the MultiSports v1.0 dataset by selecting 4 sports classes, collecting around 3,200 video clips, and annotating around 37,790 action instances with 907k bounding boxes.
arXiv Detail & Related papers (2021-05-16T10:40:30Z) - Semi-Supervised Action Recognition with Temporal Contrastive Learning [50.08957096801457]
We learn a two-pathway temporal contrastive model using unlabeled videos at two different speeds.
We considerably outperform video extensions of sophisticated state-of-the-art semi-supervised image recognition methods.
arXiv Detail & Related papers (2021-02-04T17:28:35Z) - Generating Masks from Boxes by Mining Spatio-Temporal Consistencies in Videos [159.02703673838639]
We introduce a method for generating segmentation masks from per-frame bounding box annotations in videos.
We use our resulting accurate masks for weakly supervised training of video object segmentation (VOS) networks.
The additional data provides substantially better generalization performance leading to state-of-the-art results in both the VOS and more challenging tracking domain.
arXiv Detail & Related papers (2021-01-06T18:56:24Z) - ScribbleBox: Interactive Annotation Framework for Video Object Segmentation [62.86341611684222]
We introduce ScribbleBox, a novel interactive framework for annotating object instances with masks in videos.
Box tracks are annotated efficiently by approximating the trajectory using a parametric curve.
We show that our ScribbleBox approach reaches 88.92% J&F on DAVIS 2017 with 9.14 clicks per box track, and 4 frames of annotation.
arXiv Detail & Related papers (2020-08-22T00:33:10Z) - Event detection in coarsely annotated sports videos via parallel multi-receptive-field 1D convolutions [14.30009544149561]
In problems such as sports video analytics, it is difficult to obtain accurate frame level annotations and exact event duration.
We propose the task of event detection in coarsely annotated videos.
We introduce a multi-tower temporal convolutional network architecture for the proposed task.
arXiv Detail & Related papers (2020-04-13T19:51:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.