RacketVision: A Multiple Racket Sports Benchmark for Unified Ball and Racket Analysis
- URL: http://arxiv.org/abs/2511.17045v2
- Date: Thu, 27 Nov 2025 05:13:43 GMT
- Title: RacketVision: A Multiple Racket Sports Benchmark for Unified Ball and Racket Analysis
- Authors: Linfeng Dong, Yuchen Yang, Hao Wu, Wei Wang, Yuenan Hou, Zhihang Zhong, Xiao Sun,
- Abstract summary: RacketVision is a novel dataset for advancing computer vision in sports analytics. It provides large-scale, fine-grained annotations for racket pose alongside traditional ball positions. It is designed to tackle three interconnected tasks: fine-grained ball tracking, articulated racket pose estimation, and predictive ball trajectory forecasting.
- Score: 27.850804936572896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce RacketVision, a novel dataset and benchmark for advancing computer vision in sports analytics, covering table tennis, tennis, and badminton. The dataset is the first to provide large-scale, fine-grained annotations for racket pose alongside traditional ball positions, enabling research into complex human-object interactions. It is designed to tackle three interconnected tasks: fine-grained ball tracking, articulated racket pose estimation, and predictive ball trajectory forecasting. Our evaluation of established baselines reveals a critical insight for multi-modal fusion: while naively concatenating racket pose features degrades performance, a CrossAttention mechanism is essential to unlock their value, leading to trajectory prediction results that surpass strong unimodal baselines. RacketVision provides a versatile resource and a strong starting point for future research in dynamic object tracking, conditional motion forecasting, and multimodal analysis in sports. Project page at https://github.com/OrcustD/RacketVision
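The abstract contrasts two ways of fusing racket pose with ball trajectory features: naive concatenation and cross-attention. The sketch below illustrates the general difference only; it is not the authors' architecture. The token counts, feature sizes, single attention head, residual connection, and random matrices standing in for learned projections are all assumptions made for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def concat_fusion(ball, racket):
    """Naive fusion: append the mean racket vector to every ball token."""
    n = len(racket)
    pooled = [sum(col) / n for col in zip(*racket)]
    return [tok + pooled for tok in ball]

def cross_attention_fusion(ball, racket, w_q, w_k, w_v):
    """Ball tokens act as queries; racket tokens supply keys and values."""
    q = matmul(ball, w_q)                    # (T, d)
    k = matmul(racket, w_k)                  # (S, d)
    v = matmul(racket, w_v)                  # (S, d_ball)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]     # (d, S)
    scores = matmul(q, k_t)                  # (T, S)
    attn = [softmax([s / scale for s in row]) for row in scores]
    context = matmul(attn, v)                # (T, d_ball)
    # Residual connection (an assumption): add the racket context to each ball token.
    fused = [[b + c for b, c in zip(brow, crow)] for brow, crow in zip(ball, context)]
    return fused, attn

def rand_matrix(rows, cols, rng):
    """Random Gaussian matrix; a stand-in for features or learned weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
ball = rand_matrix(5, 4, rng)     # 5 ball tokens, 4 features each (hypothetical sizes)
racket = rand_matrix(3, 6, rng)   # 3 racket tokens, 6 features each (hypothetical sizes)
w_q = rand_matrix(4, 8, rng)
w_k = rand_matrix(6, 8, rng)
w_v = rand_matrix(6, 4, rng)

fused, attn = cross_attention_fusion(ball, racket, w_q, w_k, w_v)
naive = concat_fusion(ball, racket)
print(len(fused), len(fused[0]))  # cross-attention keeps the ball feature size
print(len(naive), len(naive[0]))  # concatenation widens every ball token
```

With concatenation, every ball token receives the same pooled racket vector. With cross-attention, each ball token gets its own weighting over the racket tokens, so the racket information can be used selectively.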
Related papers
- SoccerMaster: A Vision Foundation Model for Soccer Understanding [50.88251190999469]
Soccer understanding has recently garnered growing research interest due to its domain-specific complexity and unique challenges. This work aims to propose a unified model to handle diverse soccer visual understanding tasks, ranging from fine-grained perception to semantic reasoning. We present SoccerMaster, the first soccer-specific vision foundation model that unifies diverse understanding tasks within a single framework.
arXiv Detail & Related papers (2025-12-11T18:03:30Z)
- SportR: A Benchmark for Multimodal Large Language Model Reasoning in Sports [21.410115837645318]
SportR is the first multi-sports large-scale benchmark designed to train and evaluate MLLMs on the fundamental reasoning required for sports intelligence. Our benchmark provides a dataset of 5,017 images and 2,101 videos. For the most advanced tasks requiring multi-step reasoning, such as determining penalties or explaining tactics, we provide 7,118 high-quality, human-authored Chain of Thought annotations.
arXiv Detail & Related papers (2025-11-09T18:55:20Z)
- Action Anticipation from SoccerNet Football Video Broadcasts [84.87912817065506]
We introduce the task of action anticipation for football broadcast videos. We predict future actions in unobserved future frames within a five- or ten-second anticipation window. Our work will enable applications in automated broadcasting, tactical analysis, and player decision-making.
arXiv Detail & Related papers (2025-04-16T12:24:33Z)
- TrackID3x3: A Dataset and Algorithm for Multi-Player Tracking with Identification and Pose Estimation in 3x3 Basketball Full-court Videos [8.70594963462731]
We propose the first dataset specifically designed for multi-player tracking, player identification, and pose estimation in 3x3 basketball scenarios. The dataset comprises three distinct subsets (Indoor fixed-camera, Outdoor fixed-camera, and Drone camera footage), capturing diverse full-court camera perspectives and environments. To evaluate performance, we propose a baseline algorithm called Track-ID algorithm, tailored to assess tracking and identification quality.
arXiv Detail & Related papers (2025-03-24T01:55:46Z)
- Towards long-term player tracking with graph hierarchies and domain-specific features [5.985204759362746]
We introduce SportsSUSHI, a hierarchical graph-based approach that leverages domain-specific features, including jersey numbers, team IDs, and field coordinates, to enhance tracking accuracy. SportsSUSHI achieves high performance on the SoccerNet dataset and a newly proposed hockey tracking dataset.
arXiv Detail & Related papers (2025-02-28T17:12:40Z)
- ShuttleSHAP: A Turn-Based Feature Attribution Approach for Analyzing Forecasting Models in Badminton [52.21869064818728]
Deep learning approaches for player tactic forecasting in badminton show promising performance partially attributed to effective reasoning about rally-player interactions.
We propose a turn-based feature attribution approach, ShuttleSHAP, for analyzing forecasting models in badminton based on variants of Shapley values.
arXiv Detail & Related papers (2023-12-18T05:37:51Z)
- Estimation of control area in badminton doubles with pose information from top and back view drone videos [11.679451300997016]
We present the first annotated drone dataset from top and back views in badminton doubles.
We propose a framework to estimate the control area probability map, which can be used to evaluate teamwork performance.
arXiv Detail & Related papers (2023-05-07T11:18:39Z)
- SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos [62.686484228479095]
We propose a novel dataset for multiple object tracking composed of 200 sequences of 30s each.
The dataset is fully annotated with bounding boxes and tracklet IDs.
Our analysis shows that multiple player, referee and ball tracking in soccer videos is far from being solved.
arXiv Detail & Related papers (2022-04-14T12:22:12Z)
- ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton [18.524164548051417]
This paper focuses on objectively judging what and where to return strokes in turn-based sports.
We propose a novel Position-aware Fusion of Rally Progress and Player Styles framework (ShuttleNet) that incorporates rally progress and information of the players.
arXiv Detail & Related papers (2021-12-02T08:14:23Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
Tracking Any Object dataset consists of 2,907 high resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a framework on graph to uncover relationships in different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.