BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation
- URL: http://arxiv.org/abs/2503.20781v1
- Date: Wed, 26 Mar 2025 17:59:02 GMT
- Title: BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation
- Authors: Yulu Pan, Ce Zhang, Gedas Bertasius
- Abstract summary: BASKET contains 4,477 hours of video capturing 32,232 basketball players from all over the world. Our dataset includes a massive number of skilled participants with unprecedented diversity in terms of gender, age, skill level, and geographical location.
- Score: 23.088610274138848
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present BASKET, a large-scale basketball video dataset for fine-grained skill estimation. BASKET contains 4,477 hours of video capturing 32,232 basketball players from all over the world. Compared to prior skill estimation datasets, our dataset includes a massive number of skilled participants with unprecedented diversity in terms of gender, age, skill level, geographical location, etc. BASKET includes 20 fine-grained basketball skills, challenging modern video recognition models to capture the intricate nuances of player skill through in-depth video analysis. Given a long highlight video (8-10 minutes) of a particular player, the model needs to predict the skill level (e.g., excellent, good, average, fair, poor) for each of the 20 basketball skills. Our empirical analysis reveals that the current state-of-the-art video models struggle with this task, significantly lagging behind the human baseline. We believe that BASKET could be a useful resource for developing new video models with advanced long-range, fine-grained recognition capabilities. In addition, we hope that our dataset will be useful for domain-specific applications such as fair basketball scouting, personalized player development, and many others. Dataset and code are available at https://github.com/yulupan00/BASKET.
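To make the task format concrete, below is a minimal PyTorch sketch of the prediction interface the benchmark implies: pooled features from clips of a long highlight video feed 20 independent 5-way classification heads, one per skill. The backbone stand-in, feature dimension, and pooling strategy are illustrative assumptions, not BASKET's prescribed model.

```python
import torch
import torch.nn as nn

NUM_SKILLS = 20   # fine-grained basketball skills in BASKET
NUM_LEVELS = 5    # excellent, good, average, fair, poor

class SkillEstimator(nn.Module):
    """Illustrative multi-head classifier: one 5-way head per skill.

    The feature extractor is a placeholder; BASKET benchmarks existing
    video backbones rather than prescribing a particular one.
    """
    def __init__(self, feat_dim=768):
        super().__init__()
        self.proj = nn.Linear(feat_dim, feat_dim)  # stand-in for a video backbone
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, NUM_LEVELS) for _ in range(NUM_SKILLS)
        )

    def forward(self, clip_feats):
        # clip_feats: (batch, num_clips, feat_dim), features from clips
        # sampled across the 8-10 minute highlight video.
        x = self.proj(clip_feats).mean(dim=1)  # temporal average pooling
        return torch.stack([h(x) for h in self.heads], dim=1)  # (B, 20, 5)

model = SkillEstimator()
feats = torch.randn(2, 64, 768)        # 2 players, 64 sampled clips each
logits = model(feats)                  # (2, 20, 5)
pred = logits.argmax(dim=-1)           # predicted level for each skill
```

Training such a model would simply sum a cross-entropy loss over the 20 heads.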
Related papers
- SkillMimic: Learning Reusable Basketball Skills from Demonstrations [85.23012579911378]
We propose SkillMimic, a data-driven approach that mimics both human and ball motions to learn a wide variety of basketball skills.
SkillMimic employs a unified configuration to learn diverse skills from human-ball motion datasets.
The skills acquired by SkillMimic can be easily reused by a high-level controller to accomplish complex basketball tasks.
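As a rough sketch of the reuse idea, one network can serve many skills when conditioned on a skill embedding that a high-level controller selects at each step. Everything below (dimensions, names, architecture) is hypothetical; SkillMimic's actual physics-based imitation pipeline is considerably more involved.

```python
import torch
import torch.nn as nn

class SkillConditionedPolicy(nn.Module):
    """Hypothetical unified policy: one network, many skills."""
    def __init__(self, obs_dim=128, act_dim=32, num_skills=10, emb_dim=16):
        super().__init__()
        self.skill_emb = nn.Embedding(num_skills, emb_dim)  # one vector per skill
        self.net = nn.Sequential(
            nn.Linear(obs_dim + emb_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs, skill_id):
        z = self.skill_emb(skill_id)                   # which skill to execute
        return self.net(torch.cat([obs, z], dim=-1))   # low-level action

policy = SkillConditionedPolicy()
obs = torch.randn(1, 128)
# A high-level controller would sequence skill ids (e.g., dribble -> shot)
action = policy(obs, torch.tensor([3]))
```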
arXiv Detail & Related papers (2024-08-12T15:19:04Z)
- ExpertAF: Expert Actionable Feedback from Video [81.46431188306397]
Current methods for skill assessment from video only provide scores or compare demonstrations.
We introduce a novel method to generate actionable feedback from video of a person performing a physical activity.
Our method can reason across multi-modal input combinations to output full-spectrum, actionable coaching.
arXiv Detail & Related papers (2024-08-01T16:13:07Z)
- Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding [49.88140766026886]
The state space model Mamba shows promising traits for extending its success in long-sequence modeling to video modeling.
We conduct a comprehensive set of studies, probing different roles Mamba can play in modeling videos, while investigating diverse tasks where Mamba could exhibit superiority.
Our experiments reveal the strong potential of Mamba on both video-only and video-language tasks while showing promising efficiency-performance trade-offs.
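For a sense of how a Mamba block slots into video understanding, here is a minimal sketch assuming the mamba_ssm package from the original Mamba release (a CUDA GPU is required); treating the video as one long sequence of patch tokens and mean-pooling at the end are illustrative choices, not the suite's actual designs.

```python
import torch
from mamba_ssm import Mamba  # pip install mamba-ssm (requires CUDA)

# Treat a video as one long sequence of patch tokens so a single Mamba
# block can mix information across frames in linear time.
B, T, P, D = 2, 32, 196, 256              # batch, frames, patches/frame, dim
tokens = torch.randn(B, T * P, D, device="cuda")

block = Mamba(d_model=D, d_state=16, d_conv=4, expand=2).to("cuda")
out = block(tokens)                        # (B, T*P, D), shape-preserving

clip_feat = out.mean(dim=1)                # (B, D) clip-level representation
```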
arXiv Detail & Related papers (2024-03-14T17:57:07Z)
- Knowledge Guided Entity-aware Video Captioning and A Basketball Benchmark [49.54265459763042]
We construct a multimodal basketball game knowledge graph (KG_NBA_2022) to provide additional knowledge beyond videos.
A dataset containing 9 types of fine-grained shooting events and knowledge of 286 players is then constructed based on KG_NBA_2022.
We develop a knowledge-guided entity-aware video captioning network (KEANet), built in encoder-decoder form on a candidate player list, for basketball live text broadcast.
arXiv Detail & Related papers (2024-01-25T02:08:37Z)
- GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation [75.60413443783953]
We present GOAL, a benchmark of over 8.9k soccer video clips, 22k sentences, and 42k knowledge triples, supporting a challenging new task setting: Knowledge-grounded Video Captioning (KGVC).
Our data and code are available at https://github.com/THU-KEG/goal.
arXiv Detail & Related papers (2023-03-26T08:43:36Z)
- NBA2Vec: Dense feature representations of NBA players [0.0]
We present NBA2Vec, a neural network model based on Word2Vec that extracts dense feature representations of each player.
NBA2Vec accurately predicts the outcomes of various 2017 NBA Playoffs series.
Future applications of NBA2Vec embeddings to characterize players' style may revolutionize predictive models for player acquisition and coaching decisions.
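One lightweight way to reproduce the flavor of this idea is to train Word2Vec (here via gensim) on sequences of player IDs, treating shared lineups or possessions as "sentences". This framing and the toy data are assumptions for illustration; NBA2Vec's actual training corpus and objective may differ.

```python
from gensim.models import Word2Vec

# Toy corpus: each "sentence" is a lineup of players who shared the floor.
possessions = [
    ["curry", "thompson", "green", "durant", "looney"],
    ["james", "davis", "rondo", "caruso", "howard"],
    ["curry", "green", "iguodala", "thompson", "looney"],
]

model = Word2Vec(
    sentences=possessions,
    vector_size=32,   # dimensionality of the player embeddings
    window=4,
    min_count=1,
    sg=1,             # skip-gram objective
    epochs=50,
)

vec = model.wv["curry"]                   # dense 32-d player vector
similar = model.wv.most_similar("curry")  # players appearing in similar contexts
```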
arXiv Detail & Related papers (2023-02-26T19:05:57Z)
- Can a face tell us anything about an NBA prospect? -- A Deep Learning approach [0.0]
We deploy image analysis and Convolutional Neural Networks in an attempt to predict the career trajectory of newly drafted players from each draft class.
We created a database of about 1,500 face images of players from every draft since 1990.
We fine-tuned popular pre-trained image classification models on our data and conducted a series of tests in an attempt to create models that give reliable predictions of rookie players' careers.
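A typical version of this transfer-learning recipe, sketched with torchvision; the three-bucket label scheme, the ResNet-50 choice, and the frozen backbone are assumptions for illustration rather than the paper's exact setup.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 3  # hypothetical career buckets, e.g. bust / role player / starter

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():         # freeze the pre-trained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

faces = torch.randn(8, 3, 224, 224)                # a batch of face crops
labels = torch.randint(0, NUM_CLASSES, (8,))
loss = criterion(model(faces), labels)
loss.backward()
optimizer.step()
```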
arXiv Detail & Related papers (2022-12-13T18:36:29Z)
- A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications [60.3327085463545]
We present a survey on video action recognition for sports analytics.
We cover more than ten sports, including team sports such as football, basketball, volleyball, and hockey, and individual sports such as figure skating, gymnastics, table tennis, diving, and badminton.
We develop a toolbox using PaddlePaddle, which supports football, basketball, table tennis and figure skating action recognition.
arXiv Detail & Related papers (2022-06-02T13:19:36Z)
- CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning [4.168157981135698]
We propose a search method that accepts any English text query as input to retrieve relevant gameplay videos.
Our approach does not rely on any external information (such as video metadata).
An example application of our approach is as a gameplay video search engine to aid in reproducing video game bugs.
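A minimal sketch of this style of zero-shot retrieval with OpenAI's CLIP: embed sampled gameplay frames and the English query into the shared space, then rank frames by cosine similarity. The frame paths are placeholders, and scoring individual frames (rather than aggregating per video) is a simplification of the paper's pipeline.

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

frame_paths = ["clip1.png", "clip2.png", "clip3.png"]  # placeholder frames
images = torch.stack([preprocess(Image.open(p)) for p in frame_paths]).to(device)
query = clip.tokenize(["a car flying in the air"]).to(device)

with torch.no_grad():
    img_feats = model.encode_image(images)
    txt_feats = model.encode_text(query)
    img_feats = img_feats / img_feats.norm(dim=-1, keepdim=True)
    txt_feats = txt_feats / txt_feats.norm(dim=-1, keepdim=True)
    scores = (img_feats @ txt_feats.T).squeeze(-1)   # cosine similarities

ranking = scores.argsort(descending=True)  # most relevant frames first
```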
arXiv Detail & Related papers (2022-03-21T16:23:02Z)