Egocentric Event-Based Vision for Ping Pong Ball Trajectory Prediction
- URL: http://arxiv.org/abs/2506.07860v1
- Date: Mon, 09 Jun 2025 15:22:55 GMT
- Title: Egocentric Event-Based Vision for Ping Pong Ball Trajectory Prediction
- Authors: Ivan Alberico, Marco Cannici, Giovanni Cioffi, Davide Scaramuzza
- Abstract summary: We present a real-time egocentric trajectory prediction system for table tennis using event cameras. We collect a dataset of ping-pong game sequences, including 3D ground-truth trajectories of the ball, synchronized with sensor data from the Meta Project Aria glasses. Our detection pipeline has a worst-case total latency of 4.5 ms, including computation and perception.
- Score: 17.147140984254655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a real-time egocentric trajectory prediction system for table tennis using event cameras. Unlike standard cameras, which suffer from high latency and motion blur at fast ball speeds, event cameras provide higher temporal resolution, allowing more frequent state updates, greater robustness to outliers, and accurate trajectory predictions using just a short time window after the opponent's impact. We collect a dataset of ping-pong game sequences, including 3D ground-truth trajectories of the ball, synchronized with sensor data from the Meta Project Aria glasses and event streams. Our system leverages foveated vision, using eye-gaze data from the glasses to process only events in the viewer's fovea. This biologically inspired approach improves ball detection performance and significantly reduces computational latency, as it efficiently allocates resources to the most perceptually relevant regions, achieving a reduction factor of 10.81 on the collected trajectories. Our detection pipeline has a worst-case total latency of 4.5 ms, including computation and perception - significantly lower than a frame-based 30 FPS system, which, in the worst case, takes 66 ms solely for perception. Finally, we fit a trajectory prediction model to the estimated states of the ball, enabling 3D trajectory forecasting in the future. To the best of our knowledge, this is the first approach to predict table tennis trajectories from an egocentric perspective using event cameras.
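The abstract describes two concrete mechanisms that are easy to illustrate: gaze-driven foveation of the event stream, and a trajectory model fit to the estimated ball states shortly after the opponent's impact. The Python sketch below is illustrative only; the event array layout, the foveal radius, and the drag-free ballistic model are all assumptions on our part, not details taken from the paper.

```python
import numpy as np

def foveate_events(events, gaze_xy, radius_px=60.0):
    """Keep only events near the current gaze point (foveated gating).

    events    : (N, 4) array of [x, y, t, polarity] -- assumed layout
    gaze_xy   : (2,) gaze position in pixel coordinates from the glasses
    radius_px : hypothetical foveal radius; the paper credits this kind of
                gating with a 10.81x reduction in computational latency
    """
    d2 = (events[:, 0] - gaze_xy[0]) ** 2 + (events[:, 1] - gaze_xy[1]) ** 2
    return events[d2 <= radius_px ** 2]

def fit_ballistic(ts, xyz):
    """Least-squares fit of a drag-free ballistic model to 3D ball states.

    ts  : (M,) timestamps in seconds
    xyz : (M, 3) estimated 3D ball positions
    Returns p0, v0 such that p(t) = p0 + v0*t + 0.5*g*t^2,
    with g = (0, 0, -9.81), i.e. assuming a z-up world frame.
    """
    g = np.array([0.0, 0.0, -9.81])
    t = ts - ts[0]
    # Move the known gravity term to the left-hand side, then solve a
    # linear system whose unknowns are the initial position and velocity.
    b = xyz - 0.5 * g * t[:, None] ** 2
    A = np.stack([np.ones_like(t), t], axis=1)      # (M, 2)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)    # (2, 3)
    return coef[0], coef[1]                         # p0, v0

def predict(p0, v0, t):
    """Extrapolate the fitted trajectory t seconds past the first sample."""
    g = np.array([0.0, 0.0, -9.81])
    return p0 + v0 * t + 0.5 * g * t ** 2
```

Because an event camera delivers state updates at sub-millisecond intervals, a fit like this can be run within a short window after impact, which is what the abstract exploits. The paper's actual detector and dynamics model are not specified in the abstract and will differ from this sketch.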
Related papers
- TT3D: Table Tennis 3D Reconstruction [11.84899291358663]
We propose a novel approach for reconstructing precise 3D ball trajectories from online table tennis match recordings. Our method leverages the underlying physics of the ball's motion to identify the bounce state that minimizes the reprojection error of the ball's flying trajectory. A key advantage of our approach is its ability to infer ball spin without relying on human pose estimation or racket tracking.
arXiv Detail & Related papers (2025-04-14T09:37:47Z)
- An Event-Based Perception Pipeline for a Table Tennis Robot [12.101426862186072]
We present the first real-time perception pipeline for a table tennis robot that uses only event-based cameras. We show that compared to a frame-based pipeline, event-based perception pipelines have an update rate which is an order of magnitude higher.
arXiv Detail & Related papers (2025-02-02T10:56:37Z)
- Event-based Structure-from-Orbit [23.97673114572094]
Certain applications in robotics and vision-based navigation require 3D perception of an object undergoing circular or spinning motion in front of a static camera.
We propose event-based structure-from-orbit (eSfO), where the aim is to reconstruct the 3D structure of a fast-spinning object observed from a static event camera.
arXiv Detail & Related papers (2024-05-10T03:02:03Z)
- Table tennis ball spin estimation with an event camera [11.735290341808064]
In table tennis, the combination of high velocity and spin renders traditional low frame rate cameras inadequate.
We present the first method for table tennis spin estimation using an event camera.
We achieve a spin magnitude mean error of $10.7 \pm 17.3$ rps and a spin axis mean error of $32.9 \pm 38.2$ deg in real time for a flying ball.
arXiv Detail & Related papers (2024-04-15T15:36:38Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- EV-Catcher: High-Speed Object Catching Using Low-latency Event-based Neural Networks [107.62975594230687]
We demonstrate an application where event cameras excel: accurately estimating the impact location of fast-moving objects.
We introduce a lightweight event representation called Binary Event History Image (BEHI) to encode event data at low latency; see the sketch after this list.
We show that the system achieves a success rate of 81% in catching balls aimed at different locations and traveling at up to 13 m/s, even on compute-constrained embedded platforms.
arXiv Detail & Related papers (2023-04-14T15:23:28Z)
- Fast Trajectory End-Point Prediction with Event Cameras for Reactive Robot Control [4.110120522045467]
In this paper, we propose to exploit the low latency, motion-driven sampling, and data compression properties of event cameras to overcome the limitations of frame-based sensing.
As a use-case, we use a Panda robotic arm to intercept a ball bouncing on a table.
We train the network in simulation to speed up the dataset acquisition and then fine-tune the models on real trajectories.
arXiv Detail & Related papers (2023-02-27T14:14:52Z)
- Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy with a single metric for online video perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on the Argoverse-HD dataset and improves AP by 4.9% over a strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z)
- Asynchronous Optimisation for Event-based Visual Odometry [53.59879499700895]
Event cameras open up new possibilities for robotic perception due to their low latency and high dynamic range.
We focus on event-based visual odometry (VO).
We propose an asynchronous structure-from-motion optimisation back-end.
arXiv Detail & Related papers (2022-03-02T11:28:47Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras output brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data for tasks such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
- Tracking Objects as Points [83.9217787335878]
We present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art.
Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame.
CenterTrack is simple, online (no peeking into the future), and real-time.
arXiv Detail & Related papers (2020-04-02T17:58:40Z)
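The Binary Event History Image (BEHI) named in the EV-Catcher entry above is not specified beyond its name in that summary. The sketch below, in the same hedged spirit as the earlier one, shows one plausible reading: a binary image that marks every pixel at which at least one event fired during the history window.

```python
import numpy as np

def binary_event_history_image(events, height, width):
    """Rasterize one history window of events into a binary image.

    events : (N, 4) array of [x, y, t, polarity] -- assumed layout,
             with coordinates assumed to lie inside the sensor bounds.
    A pixel is set to 1 if any event fired there during the window,
    regardless of polarity or how many events occurred.
    """
    img = np.zeros((height, width), dtype=np.uint8)
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    img[ys, xs] = 1
    return img
```

An encoding like this costs a single write per event and produces a fixed-size input for a small network, which is consistent with the low-latency, embedded-platform emphasis of that entry.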