Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges
- URL: http://arxiv.org/abs/2505.03991v2
- Date: Thu, 19 Jun 2025 05:05:54 GMT
- Title: Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges
- Authors: Hao Xu, Arbind Agrahari Baniya, Sam Well, Mohamed Reda Bouadjenek, Richard Dazeley, Sunil Aryal
- Abstract summary: Video event detection is central to modern sports analytics, enabling automated understanding of key moments for performance evaluation, content creation, and tactical feedback. While deep learning has significantly advanced these tasks, existing surveys often overlook the fine-grained temporal demands and domain-specific challenges posed by sports. This survey first provides a clear conceptual distinction between TAL, AS, and PES, then introduces a methods-based taxonomy covering recent deep learning approaches for AS and PES. We outline open challenges and future directions toward more temporally precise, generalizable, and practical event spotting in sports video analysis.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Video event detection is central to modern sports analytics, enabling automated understanding of key moments for performance evaluation, content creation, and tactical feedback. While deep learning has significantly advanced tasks like Temporal Action Localization (TAL), Action Spotting (AS), and Precise Event Spotting (PES), existing surveys often overlook the fine-grained temporal demands and domain-specific challenges posed by sports. This survey first provides a clear conceptual distinction between TAL, AS, and PES, then introduces a methods-based taxonomy covering recent deep learning approaches for AS and PES, including feature-based pipelines, end-to-end architectures, and multimodal strategies. We further review benchmark datasets and evaluation protocols, identifying critical limitations such as reliance on broadcast-quality footage and lenient multi-label metrics that hinder real-world deployment. Finally, we outline open challenges and future directions toward more temporally precise, generalizable, and practical event spotting in sports video analysis.
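The abstract's criticism of "lenient multi-label metrics" can be made concrete: action spotting evaluation typically counts a prediction as correct only if it falls within a temporal tolerance of a ground-truth event of the same class, and how wide that tolerance is set determines how lenient the protocol is. Below is a minimal, hypothetical sketch of such a tolerance-window evaluation; the function name, tuple format, and matching strategy are illustrative assumptions, not taken from any of the papers listed here.

```python
# Hypothetical sketch: tolerance-window evaluation for action spotting.
# A prediction counts as a true positive if it lies within +/- delta
# seconds of a not-yet-matched ground-truth event of the same class.

def spotting_precision_recall(predictions, ground_truth, delta):
    """predictions / ground_truth: lists of (timestamp_seconds, class_label)."""
    matched = set()  # indices of ground-truth events already claimed
    true_positives = 0
    for t_pred, c_pred in predictions:
        for i, (t_gt, c_gt) in enumerate(ground_truth):
            if i in matched:
                continue
            if c_gt == c_pred and abs(t_pred - t_gt) <= delta:
                matched.add(i)
                true_positives += 1
                break
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Widening the tolerance turns near-misses into hits, which is why loose-tolerance protocols report flattering scores while the tight tolerances demanded by Precise Event Spotting expose much larger gaps between methods.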
Related papers
- Velocity Completion Task and Method for Event-based Player Positional Data in Soccer [0.9002260638342727]
Event-based positional data lacks continuous temporal information needed to calculate crucial properties such as velocity. We propose a new method to simultaneously complete the velocity of all agents using only the event-based positional data from team sports.
arXiv Detail & Related papers (2025-05-22T04:01:49Z)
- Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection [67.70328796057466]
Grounding-MD is an innovative, grounded video-language pre-training framework tailored for open-world moment detection. Our framework incorporates an arbitrary number of open-ended natural language queries through a structured prompt mechanism. Grounding-MD demonstrates exceptional semantic representation learning capabilities, effectively handling diverse and complex query conditions.
arXiv Detail & Related papers (2025-04-20T09:54:25Z)
- OpenSTARLab: Open Approach for Spatio-Temporal Agent Data Analysis in Soccer [0.9207076627649226]
Sports analytics has become more professional and sophisticated, driven by the growing availability of detailed performance data. In soccer, the effective utilization of event and tracking data is fundamental for capturing and analyzing the dynamics of the game. Here we propose OpenSTARLab, an open-source framework designed to democratize spatio-temporal agent data analysis in sports.
arXiv Detail & Related papers (2025-02-05T00:14:18Z)
- About Time: Advances, Challenges, and Outlooks of Action Understanding [57.76390141287026]
This survey comprehensively reviews advances in uni- and multi-modal action understanding across a range of tasks. We focus on prevalent challenges, overview widely adopted datasets, and survey seminal works with an emphasis on recent advances.
arXiv Detail & Related papers (2024-11-22T18:09:27Z)
- WearableMil: An End-to-End Framework for Military Activity Recognition and Performance Monitoring [7.130450173185638]
This paper introduces an end-to-end framework for preprocessing, analyzing, and recognizing activities from wearable data in military training contexts. We use data from 135 soldiers wearing Garmin-55 smartwatches over six months, comprising over 15 million minutes. Our framework addresses missing data through physiologically-informed methods, reducing unknown sleep states from 40.38% to 3.66%.
arXiv Detail & Related papers (2024-10-07T19:35:15Z)
- Grounding Partially-Defined Events in Multimodal Data [61.0063273919745]
We introduce a multimodal formulation for partially-defined events and cast the extraction of these events as a three-stage span retrieval task.
We propose a benchmark for this task, MultiVENT-G, that consists of 14.5 hours of densely annotated current event videos and 1,168 text documents, containing 22.8K labeled event-centric entities.
Results illustrate the challenges that abstract event understanding poses and demonstrate promise in event-centric video-language systems.
arXiv Detail & Related papers (2024-10-07T17:59:48Z)
- Deep learning for action spotting in association football videos [64.10841325879996]
The SoccerNet initiative organizes yearly challenges, during which participants from all around the world compete to achieve state-of-the-art performances.
This paper traces the history of action spotting in sports, from the creation of the task back in 2018, to the role it plays today in research and the sports industry.
arXiv Detail & Related papers (2024-10-02T07:56:15Z)
- A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities [2.916558661202724]
Human Activity Recognition (HAR) systems aim to understand human behaviour and assign a label to each action.
HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals.
This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2024.
arXiv Detail & Related papers (2024-09-15T10:04:44Z)
- OSL-ActionSpotting: A Unified Library for Action Spotting in Sports Videos [56.393522913188704]
We introduce OSL-ActionSpotting, a Python library that unifies different action spotting algorithms to streamline research and applications in sports video analytics.
We successfully integrated three cornerstone action spotting methods into OSL-ActionSpotting, achieving performance metrics that match those of the original, disparate implementations.
arXiv Detail & Related papers (2024-07-01T13:17:37Z)
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
- Event-based Simultaneous Localization and Mapping: A Comprehensive Survey [52.73728442921428]
This paper reviews event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks. It categorizes event-based vSLAM methods into four main categories: feature-based, direct, motion-compensation, and deep learning methods.
arXiv Detail & Related papers (2023-04-19T16:21:14Z)
- Towards Active Learning for Action Spotting in Association Football Videos [59.84375958757395]
Analyzing football videos is challenging and requires identifying subtle and diverse spatio-temporal patterns.
Current algorithms face significant challenges when learning from limited annotated data.
We propose an active learning framework that selects the most informative video samples to be annotated next.
arXiv Detail & Related papers (2023-04-09T11:50:41Z)
- Reliable Shot Identification for Complex Event Detection via Visual-Semantic Embedding [72.9370352430965]
We propose a visual-semantic guided loss method for event detection in videos.
Motivated by curriculum learning, we introduce a negative elastic regularization term to start training the classifier with instances of high reliability.
An alternating optimization algorithm is developed to solve the proposed challenging non-convex regularization problem.
arXiv Detail & Related papers (2021-10-12T11:46:56Z)
- Toyota Smarthome Untrimmed: Real-World Untrimmed Videos for Activity Detection [6.682959425576476]
We introduce a new untrimmed daily-living dataset that features several real-world challenges: Toyota Smarthome Untrimmed.
The dataset contains dense annotations including elementary, composite activities and activities involving interactions with objects.
We show that current state-of-the-art methods fail to achieve satisfactory performance on the TSU dataset.
We propose a new baseline method for activity detection to tackle the novel challenges provided by our dataset.
arXiv Detail & Related papers (2020-10-28T13:47:16Z)
- ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
arXiv Detail & Related papers (2020-03-12T02:40:36Z)
- Unsupervised and Interpretable Domain Adaptation to Rapidly Filter Tweets for Emergency Services [18.57009530004948]
We present a novel method to classify relevant tweets during an ongoing crisis using the publicly available dataset of TREC incident streams.
We use dedicated attention layers for each task to provide model interpretability, which is critical for real-world applications.
We show a practical implication of our work by providing a use-case for the COVID-19 pandemic.
arXiv Detail & Related papers (2020-03-04T06:40:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.