Deep-Learning-Based Computer Vision Approach For The Segmentation Of
Ball Deliveries And Tracking In Cricket
- URL: http://arxiv.org/abs/2211.12009v1
- Date: Tue, 22 Nov 2022 04:55:58 GMT
- Authors: Kumail Abbas, Muhammad Saeed, M. Imad Khan, Khandakar Ahmed, Hua Wang
- Abstract summary: This paper presents an approach to segment and extract video shots in which only the ball is being delivered.
Object detection models are applied to reach a high level of accuracy in terms of correctly extracting video shots.
Ball tracking in these video shots is also performed using a separate RetinaNet model as a demonstration of the proposed dataset's usefulness.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The adoption of technology in cricket has increased significantly in
recent years. This trend has led to duplicated effort across similar computer
vision-based research works. Our research addresses one of these problems by
segmenting ball deliveries in cricket broadcasts using the deep learning
models MobileNet and YOLO, enabling other researchers to use our work as a
dataset for their own research. The output of our research can also be used by
cricket coaches and players to analyze the deliveries bowled during a match.
This paper presents an approach to segmenting and extracting the video shots
in which the ball is being delivered. A video shot is a series of continuous
frames that make up one scene of the video. Object detection models are
applied to achieve a high level of accuracy in extracting these video shots.
We propose a proof of concept for building large datasets of ball-delivery
video shots, paving the way for further processing of those shots to extract
semantics. As a demonstration of the dataset's usefulness, the ball is also
tracked in these shots using a separate RetinaNet model. The position on the
cricket pitch where the ball lands is extracted by tracking the ball along the
y-axis, and each shot is then classified as a full-pitched, good-length, or
short-pitched delivery.
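A minimal sketch of the length classification described above, assuming the ball's tracked y-coordinate (growing downward in image space) peaks at the bounce, and that the pitch's pixel extent is known. The threshold values here are illustrative assumptions, not taken from the paper:

```python
def classify_delivery(y_positions, pitch_top, pitch_bottom,
                      full_threshold=0.75, short_threshold=0.45):
    """Classify a delivery from per-frame ball y-coordinates (pixels,
    y grows downward in image coordinates).

    pitch_top / pitch_bottom: y-pixels of the far (bowler) and near
    (batter) pitch ends. Thresholds are illustrative fractions of the
    pitch length, not values from the paper.
    """
    # Approximate the bounce as the frame where the ball is lowest on
    # screen (largest y) along its tracked trajectory.
    bounce_y = max(y_positions)
    # Normalise the landing position: 0 = bowler end, 1 = batter end.
    pos = (bounce_y - pitch_top) / (pitch_bottom - pitch_top)
    if pos >= full_threshold:
        return "full-pitched"
    if pos <= short_threshold:
        return "short-pitched"
    return "good-length"
```

A trajectory bouncing close to the batter (high normalised position) is labelled full-pitched; one bouncing early, short-pitched.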
Related papers
- SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap [102.5232204867158]
We formalize the task of Game State Reconstruction and introduce SoccerNet-GSR, a novel Game State Reconstruction dataset focusing on football videos.
SoccerNet-GSR is composed of 200 video sequences of 30 seconds, annotated with 9.37 million line points for pitch localization and camera calibration.
Our experiments show that GSR is a challenging novel task, which opens the field for future research.
arXiv Detail & Related papers (2024-04-17T12:53:45Z)
- Dense Video Object Captioning from Disjoint Supervision [77.47084982558101]
We propose a new task and model for dense video object captioning.
This task unifies spatial and temporal localization in video.
We show how our model improves upon a number of strong baselines for this new task.
arXiv Detail & Related papers (2023-06-20T17:57:23Z)
- Event Detection in Football using Graph Convolutional Networks [0.0]
We show how to model the players and the ball in each frame of the video sequence as a graph.
We present the results for graph convolutional layers and pooling methods that can be used to model the temporal context present around each action.
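The per-frame graph described above can be sketched as follows, assuming a k-nearest-neighbour edge rule over detected player and ball positions (the paper's exact edge definition may differ):

```python
import numpy as np

def frame_to_graph(positions, k=3):
    """Build a graph over detected players and the ball in one frame.

    positions: (N, 2) array of (x, y) pitch coordinates; each detection
    becomes a node, connected to its k nearest neighbours. This k-NN
    construction is a common but here assumed choice.
    Returns an (N, N) symmetric adjacency matrix.
    """
    n = len(positions)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)    # pairwise distances
    np.fill_diagonal(dist, np.inf)          # exclude self-loops
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in np.argsort(dist[i])[:k]:   # k nearest neighbours of i
            adj[i, j] = adj[j, i] = 1       # symmetrise the edge
    return adj
```

The resulting adjacency matrix (with per-node features such as position and velocity) is the typical input to graph convolutional layers.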
arXiv Detail & Related papers (2023-01-24T14:52:54Z)
- Sports Video Analysis on Large-Scale Data [10.24207108909385]
This paper investigates the modeling of automated machine description on sports video.
We propose a novel large-scale NBA dataset for Sports Video Analysis (NSVA) with a focus on captioning.
arXiv Detail & Related papers (2022-08-09T16:59:24Z)
- SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos [62.686484228479095]
We propose a novel dataset for multiple object tracking composed of 200 sequences of 30s each.
The dataset is fully annotated with bounding boxes and tracklet IDs.
Our analysis shows that multiple player, referee and ball tracking in soccer videos is far from being solved.
arXiv Detail & Related papers (2022-04-14T12:22:12Z)
- Ball 3D localization from a single calibrated image [1.2891210250935146]
We propose to address the task on a single image by estimating the ball diameter in pixels and using knowledge of the real ball diameter in meters.
This approach is suitable for any game situation where the ball is (even partly) visible.
Validations on 3 basketball datasets reveal that our model gives remarkable predictions on ball 3D localization.
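The core idea here, recovering depth from apparent size under a pinhole camera model, can be sketched as below; the function name is illustrative, and the paper's full pipeline also recovers the x and y coordinates:

```python
def ball_distance(focal_px, real_diameter_m, pixel_diameter):
    """Estimate the ball's depth from its apparent size.

    Pinhole camera model: Z = f * D / d, where f is the focal length in
    pixels, D the real ball diameter in metres, and d the detected ball
    diameter in pixels. Returns the depth Z in metres.
    """
    return focal_px * real_diameter_m / pixel_diameter
```

For example, a 0.24 m ball imaged at 24 px by a camera with a 1000 px focal length lies about 10 m from the camera.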
arXiv Detail & Related papers (2022-03-30T19:38:14Z)
- Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting [61.92132798351982]
We distill a powerful commercial calibration tool in a recent neural network architecture on the large-scale SoccerNet dataset.
We leverage it to provide 3 ways of representing the calibration results along with player localization.
We exploit those representations within the current best architecture for the action spotting task of SoccerNet-v2.
arXiv Detail & Related papers (2021-04-19T14:21:05Z)
- SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos [71.72665910128975]
SoccerNet-v2 is a novel large-scale corpus of manual annotations for the SoccerNet video dataset.
We release around 300k annotations within SoccerNet's 500 untrimmed broadcast soccer videos.
We extend current tasks in the realm of soccer to include action spotting, camera shot segmentation with boundary detection.
arXiv Detail & Related papers (2020-11-26T16:10:16Z)
- A Unified Framework for Shot Type Classification Based on Subject Centric Lens [89.26211834443558]
We propose a learning framework for shot type recognition using Subject Guidance Network (SGNet).
SGNet separates the subject and background of a shot into two streams, serving as separate guidance maps for scale and movement type classification respectively.
We build a large-scale dataset MovieShots, which contains 46K shots from 7K movie trailers with annotations of their scale and movement types.
arXiv Detail & Related papers (2020-08-08T15:49:40Z)
- Learning to Play Cup-and-Ball with Noisy Camera Observations [2.6931502677545947]
We present a learning model based control strategy for the cup-and-ball game.
A Universal Robots UR5e manipulator arm learns to catch a ball in one of the cups on a Kendama.
arXiv Detail & Related papers (2020-07-19T02:22:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.