Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer
- URL: http://arxiv.org/abs/2504.19863v1
- Date: Mon, 28 Apr 2025 14:55:12 GMT
- Title: Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer
- Authors: Daniel Kienzle, Robin Schön, Rainer Lienhart, Shin'ichi Satoh
- Abstract summary: We present a novel method to infer the initial spin and 3D trajectory from the corresponding 2D trajectory in a video. We are the first to present a method for spin and trajectory prediction in simple monocular broadcast videos.
- Score: 19.022628890838792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing a player's technique in table tennis requires knowledge of the ball's 3D trajectory and spin. While the spin is not directly observable in standard broadcast videos, we show that it can be inferred from the ball's trajectory in the video. We present a novel method to infer the initial spin and 3D trajectory from the corresponding 2D trajectory in a video. Without ground truth labels for broadcast videos, we train a neural network solely on synthetic data. Due to our choice of input data representation, physically correct synthetic training data, and targeted augmentations, the network naturally generalizes to real data. Notably, these simple techniques are sufficient to achieve generalization; no real data at all is required for training. To the best of our knowledge, we are the first to present a method for spin and trajectory prediction in simple monocular broadcast videos, achieving an accuracy of 92.0% in spin classification and a 2D reprojection error of 0.19% of the image diagonal.
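The paper's synthetic-data pipeline is not reproduced here, but the core idea of physically grounded trajectory generation can be sketched. The toy simulation below (all constants, function names, and the camera model are illustrative assumptions, not the authors' code) integrates a ball under gravity, air drag, and the spin-induced Magnus force, then projects the 3D path to 2D as a stand-in for the broadcast-video observations the network would be trained on:

```python
import numpy as np

# Illustrative constants (assumed, not from the paper)
G = np.array([0.0, 0.0, -9.81])  # gravity (m/s^2)
RHO = 1.204                      # air density (kg/m^3)
R = 0.02                         # ball radius (m)
M = 0.0027                       # ball mass (kg)
A = np.pi * R**2                 # cross-sectional area (m^2)
CD = 0.40                        # drag coefficient (assumed)
CM = 0.60                        # Magnus coefficient (assumed)

def simulate(p0, v0, omega, dt=0.002, steps=400):
    """Euler-integrate the ball's 3D flight under gravity, drag, and Magnus force."""
    p, v = p0.astype(float), v0.astype(float)
    traj = [p.copy()]
    for _ in range(steps):
        speed = np.linalg.norm(v)
        drag = -0.5 * RHO * CD * A * speed * v / M          # opposes velocity
        magnus = 0.5 * RHO * CM * A * R * np.cross(omega, v) / M  # spin-dependent lift
        v = v + (G + drag + magnus) * dt
        p = p + v * dt
        traj.append(p.copy())
        if p[2] < 0.0:  # stop at table/floor height
            break
    return np.array(traj)

def project(traj, focal=1000.0, cam_z=1.5, cam_dist=8.0):
    """Toy pinhole projection of 3D points to 2D pixel offsets."""
    depth = cam_dist - traj[:, 1]
    u = focal * traj[:, 0] / depth
    v = focal * (traj[:, 2] - cam_z) / depth
    return np.stack([u, v], axis=1)

p0 = np.array([0.0, 0.0, 0.3])
v0 = np.array([0.0, 5.0, 1.0])
# Topspin for motion along +y is a rotation about -x (right-hand rule).
topspin = simulate(p0, v0, np.array([-100.0, 0.0, 0.0]))
no_spin = simulate(p0, v0, np.zeros(3))
# The Magnus force pushes a topspin ball downward, so its apex is lower.
print(topspin[:, 2].max() < no_spin[:, 2].max())  # → True
```

This is the signal the paper exploits in reverse: because spin changes the shape of the (projected) 2D trajectory in a physically determined way, a network trained on such synthetic pairs can recover the initial spin and 3D path from the 2D observations alone.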
Related papers
- TT3D: Table Tennis 3D Reconstruction [11.84899291358663]
We propose a novel approach for reconstructing precise 3D ball trajectories from online table tennis match recordings. Our method leverages the underlying physics of the ball's motion to identify the bounce state that minimizes the reprojection error of the ball's flying trajectory. A key advantage of our approach is its ability to infer ball spin without relying on human pose estimation or racket tracking.
arXiv Detail & Related papers (2025-04-14T09:37:47Z) - Pose2Trajectory: Using Transformers on Body Pose to Predict Tennis Player's Trajectory [6.349503549199403]
We propose Pose2Trajectory, which predicts a tennis player's future trajectory as a sequence derived from their body joints' data and ball position.
We use encoder-decoder Transformer architecture trained on the joints and trajectory information of the players with ball positions.
We generate a high-quality dataset from multiple videos to assist tennis player movement prediction using object detection and human pose estimation methods.
arXiv Detail & Related papers (2024-11-07T07:50:58Z) - Neural Network-Based Tracking and 3D Reconstruction of Baseball Pitch Trajectories from Single-View 2D Video [0.0]
We present a neural network-based approach for tracking and reconstructing the trajectories of baseball pitches from 2D video footage to 3D coordinates.
Our experimental results demonstrate that this approach achieves high accuracy in reconstructing 3D trajectories from 2D inputs.
arXiv Detail & Related papers (2024-05-25T16:17:10Z) - CNN-based Game State Detection for a Foosball Table [1.612440288407791]
In the game of Foosball, a compact and comprehensive game state description consists of the positional shifts and rotations of the figures and the position of the ball over time.
In this paper, a figure detection system to determine the game state in Foosball is presented.
This dataset is utilized to train Convolutional Neural Network (CNN) based end-to-end regression models to predict the rotations and shifts of each rod.
arXiv Detail & Related papers (2024-04-08T09:48:02Z) - Refining Pre-Trained Motion Models [56.18044168821188]
We take on the challenge of improving state-of-the-art supervised models with self-supervised training.
We focus on obtaining a "clean" training signal from real-world unlabelled video.
We show that our method yields reliable gains over fully-supervised methods in real videos.
arXiv Detail & Related papers (2024-01-01T18:59:33Z) - Any-point Trajectory Modeling for Policy Learning [64.23861308947852]
We introduce Any-point Trajectory Modeling (ATM) to predict future trajectories of arbitrary points within a video frame.
ATM outperforms strong video pre-training baselines by 80% on average.
We show effective transfer learning of manipulation skills from human videos and videos from a different robot morphology.
arXiv Detail & Related papers (2023-12-28T23:34:43Z) - Transferring Learning Trajectories of Neural Networks [2.2299983745857896]
Training deep neural networks (DNNs) is computationally expensive.
We formulate the problem of "transferring" a given learning trajectory from one initial parameter to another one.
We empirically show that the transferred parameters achieve non-trivial accuracy before any direct training, and can be trained significantly faster than training from scratch.
arXiv Detail & Related papers (2023-05-23T14:46:32Z) - TAP-Vid: A Benchmark for Tracking Any Point in a Video [84.94877216665793]
We formalize the problem of tracking arbitrary physical points on surfaces over longer video clips, naming it tracking any point (TAP).
We introduce a companion benchmark, TAP-Vid, which is composed of both real-world videos with accurate human annotations of point tracks, and synthetic videos with perfect ground-truth point tracks.
We propose a simple end-to-end point tracking model TAP-Net, showing that it outperforms all prior methods on our benchmark when trained on synthetic data.
arXiv Detail & Related papers (2022-11-07T17:57:02Z) - What Stops Learning-based 3D Registration from Working in the Real World? [53.68326201131434]
This work identifies the sources of 3D point cloud registration failures, analyzes the reasons behind them, and proposes solutions.
Ultimately, this translates to a best-practice 3D registration network (BPNet), constituting the first learning-based method able to handle previously-unseen objects in real-world data.
Our model generalizes to real data without any fine-tuning, reaching an accuracy of up to 67% on point clouds of unseen objects obtained with a commercial sensor.
arXiv Detail & Related papers (2021-11-19T19:24:27Z) - Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z) - AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points [92.91569287889203]
We present a novel, label-free algorithm, AutoTrajectory, for trajectory extraction and prediction.
To better capture the moving objects in videos, we introduce dynamic points.
We aggregate dynamic points to instance points, which stand for moving objects such as pedestrians in videos.
arXiv Detail & Related papers (2020-07-11T08:43:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.