Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex
and Professional Sports
- URL: http://arxiv.org/abs/2401.01505v3
- Date: Wed, 14 Feb 2024 23:58:27 GMT
- Title: Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex
and Professional Sports
- Authors: Haopeng Li, Andong Deng, Qiuhong Ke, Jun Liu, Hossein Rahmani, Yulan
Guo, Bernt Schiele, Chen Chen
- Abstract summary: We introduce the first dataset, named Sports-QA, specifically designed for the sports VideoQA task.
Sports-QA dataset includes various types of questions, such as descriptions, chronologies, causalities, and counterfactual conditions.
We propose a new Auto-Focus Transformer (AFT) capable of automatically focusing on particular scales of temporal information for question answering.
- Score: 90.79212954022218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reasoning over sports videos for question answering is an important task with
numerous applications, such as player training and information retrieval.
However, this task has remained largely unexplored due to the lack of relevant
datasets and its challenging nature. Most datasets for video question
answering (VideoQA) focus mainly on general and coarse-grained understanding of
daily-life videos, which is not applicable to sports scenarios requiring
professional action understanding and fine-grained motion analysis. In this
paper, we introduce the first dataset, named Sports-QA, specifically designed
for the sports VideoQA task. The Sports-QA dataset includes various types of
questions, such as descriptions, chronologies, causalities, and counterfactual
conditions, covering multiple sports. Furthermore, to address the
characteristics of the sports VideoQA task, we propose a new Auto-Focus
Transformer (AFT) capable of automatically focusing on particular scales of
temporal information for question answering. We conduct extensive experiments
on Sports-QA, including baseline studies and the evaluation of different
methods. The results demonstrate that our AFT achieves state-of-the-art
performance.
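The abstract describes the Auto-Focus Transformer only at a high level, as a model that attends to particular temporal scales when answering a question. As a rough illustration only, and not the authors' architecture, the sketch below shows one plausible reading of that idea: pool per-frame features at several temporal scales, then let the question vector attend over those scales. All names (`multi_scale_pool`, `question_guided_scale_attention`), the pooling choices, and the scale set are assumptions made for this sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_scale_pool(frames, scales=(1, 2, 4)):
    # frames: (T, D) per-frame features.
    # For each scale s, split the video into s windows, max-pool within each
    # window, then average the window features -> one (D,) vector per scale.
    T, _ = frames.shape
    pooled = []
    for s in scales:
        win = max(T // s, 1)
        segs = [frames[i:i + win].max(axis=0) for i in range(0, T, win)]
        pooled.append(np.mean(segs, axis=0))
    return np.stack(pooled)  # (num_scales, D)

def question_guided_scale_attention(frames, question_vec, scales=(1, 2, 4)):
    # Use the question embedding as a query to weight the temporal scales,
    # returning a fused video feature and the attention weights.
    scale_feats = multi_scale_pool(frames, scales)          # (S, D)
    scores = scale_feats @ question_vec                     # (S,)
    weights = softmax(scores / np.sqrt(len(question_vec)))  # (S,)
    return weights @ scale_feats, weights                   # (D,), (S,)
```

Under this (hypothetical) formulation, a question about a brief action would up-weight fine scales, while a question about overall chronology would up-weight coarse ones; the actual AFT mechanism may differ substantially.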
Related papers
- VQA$^2$:Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment originally focused on quantitative video quality scoring.
It is now evolving towards more comprehensive visual quality understanding tasks.
We introduce the first visual question answering instruction dataset that focuses entirely on video quality assessment.
We conduct extensive experiments on both video quality scoring and video quality understanding tasks.
arXiv Detail & Related papers (2024-11-06T09:39:52Z) - Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video [5.885902974241053]
Reasoning over complex sports scenarios has posed significant challenges to current NLP technologies.
Our evaluation spans from simple queries on basic rules and historical facts to complex, context-specific reasoning.
We propose a new benchmark based on a comprehensive overview of existing sports datasets and provide extensive error analysis.
arXiv Detail & Related papers (2024-06-21T05:57:50Z) - FunQA: Towards Surprising Video Comprehension [64.58663825184958]
We introduce FunQA, a challenging video question-answering dataset.
FunQA covers three previously unexplored types of surprising videos: HumorQA, CreativeQA, and MagicQA.
In total, the FunQA benchmark consists of 312K free-text QA pairs derived from 4.3K video clips.
arXiv Detail & Related papers (2023-06-26T17:59:55Z) - TG-VQA: Ternary Game of Video Question Answering [33.180788803602084]
Video question answering aims to answer a question about video content by reasoning over the alignment semantics between the question and the video.
In this work, we innovatively resort to game theory, which can simulate complicated relationships among multiple players with specific interaction strategies.
Specifically, we carefully design an interaction strategy tailored to the characteristics of VideoQA, which can mathematically generate fine-grained visual-linguistic alignment labels without label-intensive annotation effort.
arXiv Detail & Related papers (2023-05-17T08:42:53Z) - A Survey on Video Action Recognition in Sports: Datasets, Methods and
Applications [60.3327085463545]
We present a survey on video action recognition for sports analytics.
We introduce more than ten types of sports, including team sports, such as football, basketball, volleyball, and hockey, and individual sports, such as figure skating, gymnastics, table tennis, diving, and badminton.
We develop a toolbox using PaddlePaddle, which supports football, basketball, table tennis and figure skating action recognition.
arXiv Detail & Related papers (2022-06-02T13:19:36Z) - Video Question Answering: Datasets, Algorithms and Challenges [99.9179674610955]
Video Question Answering (VideoQA) aims to answer natural language questions according to the given videos.
This paper provides a clear taxonomy and comprehensive analysis of VideoQA, focusing on the datasets, algorithms, and unique challenges.
arXiv Detail & Related papers (2022-03-02T16:34:09Z) - Sports Video: Fine-Grained Action Detection and Classification of Table
Tennis Strokes from Videos for MediaEval 2021 [0.0]
This task tackles fine-grained action detection and classification from videos.
The focus is on recordings of table tennis games.
This work aims at creating tools for sports coaches and players in order to analyze sports performance.
arXiv Detail & Related papers (2021-12-16T10:17:59Z) - NExT-QA:Next Phase of Question-Answering to Explaining Temporal Actions [80.60423934589515]
We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark.
We set up multi-choice and open-ended QA tasks targeting causal action reasoning, temporal action reasoning, and common scene comprehension.
We find that top-performing methods excel at shallow scene descriptions but are weak in causal and temporal action reasoning.
arXiv Detail & Related papers (2021-05-18T04:56:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.