Spotting Temporally Precise, Fine-Grained Events in Video
- URL: http://arxiv.org/abs/2207.10213v1
- Date: Wed, 20 Jul 2022 22:15:07 GMT
- Title: Spotting Temporally Precise, Fine-Grained Events in Video
- Authors: James Hong, Haotian Zhang, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian
- Abstract summary: We introduce the task of spotting temporally precise, fine-grained events in video.
Models must reason globally about the full-time scale of actions and locally to identify subtle frame-to-frame appearance and motion differences.
We propose E2E-Spot, a compact, end-to-end model that performs well on the precise spotting task and can be trained quickly on a single GPU.
- Score: 23.731838969934206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the task of spotting temporally precise, fine-grained events in
video (detecting the precise moment in time events occur). Precise spotting
requires models to reason globally about the full-time scale of actions and
locally to identify subtle frame-to-frame appearance and motion differences
that identify events during these actions. Surprisingly, we find that top
performing solutions to prior video understanding tasks such as action
detection and segmentation do not simultaneously meet both requirements. In
response, we propose E2E-Spot, a compact, end-to-end model that performs well
on the precise spotting task and can be trained quickly on a single GPU. We
demonstrate that E2E-Spot significantly outperforms recent baselines adapted
from the video action detection, segmentation, and spotting literature to the
precise spotting task. Finally, we contribute new annotations and splits to
several fine-grained sports action datasets to make these datasets suitable for
future work on precise spotting.
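The abstract states the design constraints (global temporal reasoning plus local frame-level discrimination) but not the architecture. Below is a minimal, hypothetical PyTorch sketch of that pattern: a compact per-frame 2D CNN supplies local appearance features, a recurrent layer reasons over the whole clip, and a linear head classifies every frame. The ResNet-18 backbone, hidden size, and bidirectional GRU are illustrative assumptions, not E2E-Spot's exact configuration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class PerFrameSpotter(nn.Module):
    """Local per-frame features + global recurrent reasoning + per-frame logits."""

    def __init__(self, num_classes: int, hidden_dim: int = 256):
        super().__init__()
        backbone = models.resnet18(weights=None)   # compact 2D CNN (assumed)
        feat_dim = backbone.fc.in_features         # 512 for ResNet-18
        backbone.fc = nn.Identity()                # keep pooled frame features
        self.backbone = backbone
        self.gru = nn.GRU(feat_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        # num_classes event classes plus one "no event" background class
        self.classifier = nn.Linear(2 * hidden_dim, num_classes + 1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.backbone(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        temporal, _ = self.gru(feats)              # global temporal reasoning
        return self.classifier(temporal)           # (batch, time, classes + 1)
```

Training such a model with per-frame cross-entropy against frame-level event labels keeps the pipeline end-to-end; at inference, per-frame argmax plus simple peak selection yields candidate event frames.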
Related papers
- Unifying Global and Local Scene Entities Modelling for Precise Action Spotting [5.474440128682843]
We introduce a novel approach that analyzes and models scene entities using an adaptive attention mechanism.
Our model has demonstrated outstanding performance, securing 1st place in the SoccerNet-v2 Action Spotting, FineDiving, and FineGym challenges.
arXiv Detail & Related papers (2024-04-15T17:24:57Z)
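The entry above names an adaptive attention mechanism over scene entities without detail. The snippet below is a loose approximation using standard multi-head cross-attention from a global frame query to per-entity features; the paper's actual mechanism may well differ, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class EntityAttention(nn.Module):
    """Global frame query attends over per-entity features (players, ball, ...)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, frame_feat: torch.Tensor, entity_feats: torch.Tensor):
        # frame_feat: (batch, 1, dim); entity_feats: (batch, num_entities, dim)
        fused, weights = self.attn(frame_feat, entity_feats, entity_feats)
        return fused, weights  # weights indicate which entities drive the event
```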
- Towards Active Learning for Action Spotting in Association Football Videos [59.84375958757395]
Analyzing football videos is challenging and requires identifying subtle and diverse spatio-temporal patterns.
Current algorithms face significant challenges when learning from limited annotated data.
We propose an active learning framework that selects the most informative video samples to be annotated next.
arXiv Detail & Related papers (2023-04-09T11:50:41Z)
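As a hedged illustration of the selection step in the entry above, the sketch below ranks unlabeled clips by mean per-frame predictive entropy and keeps the most uncertain ones within the labeling budget; the paper's actual informativeness criterion is likely more sophisticated.

```python
import torch

def entropy_acquisition(per_frame_probs: torch.Tensor) -> torch.Tensor:
    # per_frame_probs: (num_clips, time, num_classes) softmax outputs
    log_p = per_frame_probs.clamp_min(1e-8).log()
    entropy = -(per_frame_probs * log_p).sum(dim=-1)  # (num_clips, time)
    return entropy.mean(dim=1)                        # one score per clip

def select_for_annotation(per_frame_probs: torch.Tensor, budget: int):
    scores = entropy_acquisition(per_frame_probs)
    return torch.topk(scores, k=budget).indices       # clips to label next
```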
- A Graph-Based Method for Soccer Action Spotting Using Unsupervised Player Classification [75.93186954061943]
Action spotting involves understanding the dynamics of the game, the complexity of events, and the variation of video sequences.
In this work, we focus on the former by (a) identifying and representing the players, referees, and goalkeepers as nodes in a graph, and by (b) modeling their temporal interactions as sequences of graphs.
For the player identification task, our method obtains an overall performance of 57.83% average-mAP by combining it with other modalities.
arXiv Detail & Related papers (2022-11-22T15:23:53Z)
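To make the entry's data structure concrete, here is a hypothetical construction of one per-frame graph: detected people become nodes, and edges link people within a distance threshold. The Detection type, field coordinates, and the 5-meter radius are all assumptions for illustration, not the paper's formulation.

```python
from dataclasses import dataclass

@dataclass
class Detection:                  # hypothetical per-person detection
    x: float                      # field coordinates in meters (assumed)
    y: float
    role: str                     # "player", "referee", or "goalkeeper"

def frame_to_graph(dets: list[Detection], radius: float = 5.0):
    nodes = [(d.x, d.y, d.role) for d in dets]
    edges = [(i, j)
             for i in range(len(dets)) for j in range(i + 1, len(dets))
             if (dets[i].x - dets[j].x) ** 2 + (dets[i].y - dets[j].y) ** 2
             <= radius ** 2]
    return nodes, edges           # one graph per frame; stack over time
```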
- Temporally Precise Action Spotting in Soccer Videos Using Dense Detection Anchors [1.6114012813668934]
We present a model for temporally precise action spotting in videos, which uses a dense set of detection anchors, predicting a detection confidence and corresponding fine-grained temporal displacement for each anchor.
We achieve a new state-of-the-art on SoccerNet-v2, the largest soccer video dataset of its kind, with marked improvements in temporal localization.
arXiv Detail & Related papers (2022-05-20T22:14:02Z)
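The entry above describes its head concretely enough to sketch: every temporal position acts as an anchor that emits a per-class detection confidence and a fine-grained temporal displacement refining the event time. The two linear heads and all dimensions below are illustrative, not the paper's exact layers.

```python
import torch
import torch.nn as nn

class DenseAnchorHead(nn.Module):
    """One anchor per temporal position: class confidence + time displacement."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.confidence = nn.Linear(feat_dim, num_classes)    # per-class score
        self.displacement = nn.Linear(feat_dim, num_classes)  # offset in frames

    def forward(self, feats: torch.Tensor):
        # feats: (batch, time, feat_dim) temporal features
        conf = self.confidence(feats)     # (batch, time, classes) logits
        disp = self.displacement(feats)   # signed fine-grained offsets
        return conf, disp
```

At inference, a detection's time is its anchor index plus the predicted displacement, keeping anchors whose confidence passes a threshold (typically followed by non-maximum suppression).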
- Video Action Detection: Analysing Limitations and Challenges [70.01260415234127]
We analyze existing datasets on video action detection and discuss their limitations.
We perform a bias study which analyzes a key property differentiating videos from static images: the temporal aspect.
Such extreme experiments show the existence of biases that have crept into existing methods in spite of careful modeling.
arXiv Detail & Related papers (2022-04-17T00:42:14Z)
- E^2TAD: An Energy-Efficient Tracking-based Action Detector [78.90585878925545]
This paper presents a tracking-based solution to accurately and efficiently localize predefined key actions.
It won first place in the UAV-Video Track of the 2021 Low-Power Computer Vision Challenge (LPCVC).
arXiv Detail & Related papers (2022-04-09T07:52:11Z)
- SegTAD: Precise Temporal Action Detection via Semantic Segmentation [65.01826091117746]
We formulate the task of temporal action detection in a novel perspective of semantic segmentation.
Owing to the 1-dimensional property of TAD, we are able to convert the coarse-grained detection annotations to fine-grained semantic segmentation annotations for free.
We propose an end-to-end framework, SegTAD, composed of a 1D semantic segmentation network (1D-SSN) and a proposal detection network (PDN).
arXiv Detail & Related papers (2022-03-03T06:52:13Z)
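The "for free" conversion noted in the SegTAD entry follows from temporal action detection being one-dimensional: each (start, end, class) segment expands into dense per-frame labels. A minimal sketch, assuming inclusive frame bounds:

```python
import numpy as np

def segments_to_frame_labels(segments, num_frames, background=0):
    """segments: iterable of (start_frame, end_frame, class_id) tuples."""
    labels = np.full(num_frames, background, dtype=np.int64)
    for start, end, cls in segments:
        labels[start:end + 1] = cls   # inclusive bounds (an assumption)
    return labels

# e.g. segments_to_frame_labels([(10, 25, 3)], 100) labels frames 10..25 as
# class 3 and all remaining frames as background.
```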
- RMS-Net: Regression and Masking for Soccer Event Spotting [52.742046866220484]
We devise a lightweight and modular network for action spotting, which can simultaneously predict the event label and its temporal offset.
When tested on the SoccerNet dataset and using standard features, our full proposal exceeds the current state of the art by 3 Average-mAP points.
arXiv Detail & Related papers (2021-02-15T16:04:18Z)
- Joint Detection and Tracking in Videos with Identification Features [36.55599286568541]
We propose the first joint optimization of detection, tracking and re-identification features for videos.
Our method reaches the state of the art on MOT; it ranks 1st in the UA-DETRAC'18 tracking challenge among online trackers and 3rd overall.
arXiv Detail & Related papers (2020-05-21T21:06:40Z)