Feature Combination Meets Attention: Baidu Soccer Embeddings and
Transformer based Temporal Detection
- URL: http://arxiv.org/abs/2106.14447v1
- Date: Mon, 28 Jun 2021 08:00:21 GMT
- Authors: Xin Zhou, Le Kang, Zhiyu Cheng, Bo He, Jingyu Xin
- Abstract summary: We present a two-stage paradigm to detect what and when events happen in soccer broadcast videos.
Specifically, we fine-tune multiple action recognition models on soccer data to extract high-level semantic features.
This approach achieved state-of-the-art performance in both tasks, i.e., action spotting and replay grounding, in the SoccerNet-v2 Challenge.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With rapidly evolving internet technologies and emerging tools,
sports-related videos generated online are increasing at an unprecedented pace.
To automate the sports video editing and highlight generation process, a key
task is to precisely recognize and locate the events in long untrimmed videos.
In this tech report, we present a two-stage paradigm to detect what events
happen, and when, in soccer broadcast videos. Specifically, we fine-tune
multiple action recognition models on soccer data to extract high-level semantic
features, and design a transformer-based temporal detection module to locate the
target events. This approach achieved state-of-the-art performance in both
tasks, i.e., action spotting and replay grounding, in the SoccerNet-v2
Challenge, held under the CVPR 2021 ActivityNet workshop. Our soccer embedding features
are released at https://github.com/baidu-research/vidpress-sports. By sharing
these features with the broader community, we hope to accelerate the research
into soccer video understanding.
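The two-stage idea in the abstract (combine features from several fine-tuned recognition models, then apply attention over time) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the feature dimensions, the single attention head, the random weights, and the helper names `combine_features`, `attention_weights`, and `detect` are all assumptions made for illustration. The class count of 17 is only meant to mirror the number of action-spotting classes in SoccerNet-v2.

```python
import numpy as np


def combine_features(per_model_features):
    """Concatenate per-clip embeddings from several backbones along the channel axis."""
    return np.concatenate(per_model_features, axis=-1)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_weights(x, w_q, w_k):
    """Scaled dot-product attention weights between all pairs of time steps."""
    q, k = x @ w_q, x @ w_k
    return softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)


def detect(x, params):
    """Single-head self-attention over time, then per-step sigmoid class scores."""
    weights = attention_weights(x, params["w_q"], params["w_k"])
    context = weights @ (x @ params["w_v"])
    logits = context @ params["w_out"]
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(0)
num_steps = 8  # clips in one temporal window (hypothetical)

# Stage 1 stand-in: random arrays play the role of embeddings from two
# fine-tuned action recognition models (dimensions are assumptions).
feats_a = rng.normal(size=(num_steps, 16))
feats_b = rng.normal(size=(num_steps, 24))
x = combine_features([feats_a, feats_b])  # shape (8, 40)

# Stage 2 stand-in: untrained attention weights, for shape checking only.
d_model, num_classes = 32, 17
params = {
    "w_q": rng.normal(scale=0.1, size=(x.shape[1], d_model)),
    "w_k": rng.normal(scale=0.1, size=(x.shape[1], d_model)),
    "w_v": rng.normal(scale=0.1, size=(x.shape[1], d_model)),
    "w_out": rng.normal(scale=0.1, size=(d_model, num_classes)),
}
probs = detect(x, params)
print(probs.shape)  # (8, 17)
```

With trained weights, peaks in `probs` along the time axis would be the candidate event locations; here the weights are random, so only the shapes and data flow are meaningful.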
Related papers
- Deep learning for action spotting in association football videos [64.10841325879996]
The SoccerNet initiative organizes yearly challenges, during which participants from all around the world compete to achieve state-of-the-art performances.
This paper traces the history of action spotting in sports, from the creation of the task back in 2018, to the role it plays today in research and the sports industry.
(2024-10-02T07:56:15Z)
- Investigating Event-Based Cameras for Video Frame Interpolation in Sports [59.755469098797406]
We present a first investigation of event-based Video Frame Interpolation (VFI) models for generating sports slow-motion videos.
Particularly, we design and implement a bi-camera recording setup, including an RGB and an event-based camera to capture sports videos, to temporally align and spatially register both cameras.
Our experimental validation demonstrates that TimeLens, an off-the-shelf event-based VFI model, can effectively generate slow-motion footage for sports videos.
(2024-07-02T15:39:08Z)
- Towards Active Learning for Action Spotting in Association Football Videos [59.84375958757395]
Analyzing football videos is challenging and requires identifying subtle and diverse spatio-temporal patterns.
Current algorithms face significant challenges when learning from limited annotated data.
We propose an active learning framework that selects the most informative video samples to be annotated next.
(2023-04-09T11:50:41Z)
- A Graph-Based Method for Soccer Action Spotting Using Unsupervised Player Classification [75.93186954061943]
Action spotting involves understanding the dynamics of the game, the complexity of events, and the variation of video sequences.
In this work, we focus on the former by (a) identifying and representing the players, referees, and goalkeepers as nodes in a graph, and by (b) modeling their temporal interactions as sequences of graphs.
For the action spotting task, our method obtains an overall performance of 57.83% average-mAP by combining it with other modalities.
(2022-11-22T15:23:53Z)
- A Multi-stage deep architecture for summary generation of soccer videos [11.41978608521222]
We propose a method to generate the summary of a soccer match exploiting both the audio and the event metadata.
The results show that our method can detect the actions of the match, identify which of these actions should belong to the summary and then propose multiple candidate summaries.
(2022-05-02T07:26:35Z)
- MMSys'22 Grand Challenge on AI-based Video Production for Soccer [2.14475390920102]
This challenge aims to assist the automation of such a production pipeline using AI.
In particular, we focus on the enhancement operations that take place after an event has been detected.
(2022-02-02T13:53:42Z)
- Smart Director: An Event-Driven Directing System for Live Broadcasting [110.30675947733167]
Smart Director aims at mimicking the typical human-in-the-loop broadcasting process to automatically create near-professional broadcasting programs in real-time.
Our system is the first end-to-end automated directing system for multi-camera sports broadcasting.
(2022-01-11T16:14:41Z)
- SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos [71.72665910128975]
SoccerNet-v2 is a novel large-scale corpus of manual annotations for the SoccerNet video dataset.
We release around 300k annotations within SoccerNet's 500 untrimmed broadcast soccer videos.
We extend current tasks in the realm of soccer to include action spotting and camera shot segmentation with boundary detection.
(2020-11-26T16:10:16Z)