Jersey Number Recognition using Keyframe Identification from
Low-Resolution Broadcast Videos
- URL: http://arxiv.org/abs/2309.06285v1
- Date: Tue, 12 Sep 2023 14:43:50 GMT
- Title: Jersey Number Recognition using Keyframe Identification from
Low-Resolution Broadcast Videos
- Authors: Bavesh Balaji, Jerrin Bright, Harish Prakash, Yuhao Chen, David A
Clausi and John Zelek
- Abstract summary: Player identification is a crucial component in vision-driven soccer analytics, enabling various tasks such as player assessment, in-game analysis, and broadcast evaluations.
Previous methods have shown success in image data but struggle with real-world video data, where jersey numbers are not visible in most frames.
We propose a robust keyframe identification module that extracts frames containing essential high-level information about the jersey number.
- Score: 7.776923607006088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Player identification is a crucial component in vision-driven soccer
analytics, enabling various downstream tasks such as player assessment, in-game
analysis, and broadcast production. However, automatically detecting jersey
numbers from player tracklets in videos presents challenges due to motion blur,
low resolution, distortions, and occlusions. Existing methods, utilizing
Spatial Transformer Networks, CNNs, and Vision Transformers, have shown success
in image data but struggle with real-world video data, where jersey numbers are
not visible in most of the frames. Hence, identifying frames that contain the
jersey number is a key sub-problem to tackle. To address these issues, we
propose a robust keyframe identification module that extracts frames containing
essential high-level information about the jersey number. A spatio-temporal
network is then employed to model spatial and temporal context and predict the
probabilities of jersey numbers in the video. Additionally, we adopt a
multi-task loss function to predict the probability distribution of each digit
separately. Extensive evaluations on the SoccerNet dataset demonstrate that
incorporating our proposed keyframe identification module results in a
significant 37.81% and 37.70% increase in the accuracies of 2 different test
sets with domain gaps. These results highlight the effectiveness and importance
of our approach in tackling the challenges of automatic jersey number detection
in sports videos.
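The abstract's multi-task loss, which predicts a probability distribution for each digit separately, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-head layout (tens and units digits), the extra "absent" class for single-digit numbers, and the equal weighting of the two cross-entropy terms are all assumptions.

```python
import math

# Hypothetical head size: digits 0-9 plus an "absent" class,
# so single-digit jersey numbers can mark the tens position as empty.
NUM_CLASSES = 11
ABSENT = 10

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-likelihood of the target class under softmax(logits).
    return -math.log(softmax(logits)[target])

def multitask_digit_loss(tens_logits, units_logits, tens_target, units_target):
    # Sum the per-digit losses; equal weighting is an assumption here,
    # the paper may weight or combine the terms differently.
    return (cross_entropy(tens_logits, tens_target)
            + cross_entropy(units_logits, units_target))

# Example: jersey number "7" -> tens position is absent, units digit is 7.
confident_tens = [0.0] * NUM_CLASSES
confident_tens[ABSENT] = 5.0
confident_units = [0.0] * NUM_CLASSES
confident_units[7] = 5.0

correct_loss = multitask_digit_loss(confident_tens, confident_units, ABSENT, 7)
wrong_loss = multitask_digit_loss(confident_tens, confident_units, 3, 2)
```

A confident, correct prediction yields a much smaller loss than the same logits scored against the wrong digit targets, which is the signal that trains each digit head independently.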
Related papers
- A General Framework for Jersey Number Recognition in Sports Video [5.985204759362746]
Jersey number recognition is an important task in sports video analysis, partly due to its importance for long-term player tracking.
Here we introduce a novel public jersey number recognition dataset for hockey and study how scene text recognition methods can be adapted to this problem.
We demonstrate high performance on image- and tracklet-level tasks, achieving 91.4% accuracy for hockey images and 87.4% for soccer tracklets.
arXiv Detail & Related papers (2024-05-22T18:08:26Z) - Domain-Guided Masked Autoencoders for Unique Player Identification [62.87054782745536]
Masked autoencoders (MAEs) have emerged as a superior alternative to conventional feature extractors.
Motivated by human vision, we devise a novel domain-guided masking policy for MAEs termed d-MAE.
We conduct experiments on three large-scale sports datasets.
arXiv Detail & Related papers (2024-03-17T20:14:57Z) - A Graph-Based Method for Soccer Action Spotting Using Unsupervised
Player Classification [75.93186954061943]
Action spotting involves understanding the dynamics of the game, the complexity of events, and the variation of video sequences.
In this work, we focus on the former by (a) identifying and representing the players, referees, and goalkeepers as nodes in a graph, and by (b) modeling their temporal interactions as sequences of graphs.
For the player identification task, our method obtains an overall performance of 57.83% average-mAP by combining it with other modalities.
arXiv Detail & Related papers (2022-11-22T15:23:53Z) - Anomaly Detection in Aerial Videos with Transformers [49.011385492802674]
We create a new dataset, named DroneAnomaly, for anomaly detection in aerial videos.
There are 87,488 color video frames (51,635 for training and 35,853 for testing) with the size of $640 \times 640$ at 30 frames per second.
We present a new baseline model, ANomaly Detection with Transformers (ANDT), which treats consecutive video frames as a sequence of tubelets.
arXiv Detail & Related papers (2022-09-25T21:24:18Z) - Automated player identification and indexing using two-stage deep
learning network [0.23610495849936355]
We propose a deep learning-based player tracking system to automatically track players and index their participation per play in American football games.
It is a two-stage network design to highlight areas of interest and identify jersey number information with high accuracy.
We demonstrate the effectiveness and reliability of player tracking system by analyzing the qualitative and quantitative results on football videos.
arXiv Detail & Related papers (2022-04-26T02:59:03Z) - SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in
Soccer Videos [62.686484228479095]
We propose a novel dataset for multiple object tracking composed of 200 sequences of 30s each.
The dataset is fully annotated with bounding boxes and tracklet IDs.
Our analysis shows that multiple player, referee and ball tracking in soccer videos is far from being solved.
arXiv Detail & Related papers (2022-04-14T12:22:12Z) - Knock, knock. Who's there? -- Identifying football player jersey numbers
with synthetic data [0.0]
We present a novel approach for jersey number identification in a small, highly imbalanced dataset from the Seattle Seahawks practice videos.
Our results indicate that simple models can achieve an acceptable performance on the jersey number detection task and that synthetic data can improve the performance dramatically.
arXiv Detail & Related papers (2022-03-01T20:44:34Z) - HighlightMe: Detecting Highlights from Human-Centric Videos [62.265410865423]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos.
We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions.
We observe a 4-12% improvement in the mean average precision of matching the human-annotated highlights over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-05T01:18:15Z) - Player Identification in Hockey Broadcast Videos [18.616544581429835]
We present a deep convolutional neural network approach to solve the problem of hockey player identification in NHL broadcast videos.
We employ a secondary 1-dimensional convolutional neural network as a late score-level fusion method to classify the output of the ResNet+LSTM network.
This achieves an overall player identification accuracy score over 87% on the test split of our new dataset.
arXiv Detail & Related papers (2020-09-05T01:30:15Z) - Dense-Caption Matching and Frame-Selection Gating for Temporal
Localization in VideoQA [96.10612095576333]
We propose a video question answering model which effectively integrates multi-modal input sources and finds the temporally relevant information to answer questions.
Our model also comprises dual-level attention (word/object and frame level), multi-head self-cross-integration for different sources (video and dense captions), and gates that pass more relevant information.
We evaluate our model on the challenging TVQA dataset, where each of our model components provides significant gains, and our overall model outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2020-05-13T16:35:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.