Understanding Complexity in VideoQA via Visual Program Generation
- URL: http://arxiv.org/abs/2505.13429v1
- Date: Mon, 19 May 2025 17:55:14 GMT
- Title: Understanding Complexity in VideoQA via Visual Program Generation
- Authors: Cristobal Eyzaguirre, Igor Vasiljevic, Achal Dave, Jiajun Wu, Rares Andrei Ambrus, Thomas Kollar, Juan Carlos Niebles, Pavel Tokmakov
- Abstract summary: We propose a data-driven approach to analyzing query complexity in Video Question Answering (VideoQA). We experimentally show that humans struggle to predict which questions are difficult for machine learning models. We extend our approach to automatically generate complex questions, constructing a new benchmark that is 1.9 times harder than the popular NExT-QA.
- Score: 31.207902042321006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a data-driven approach to analyzing query complexity in Video Question Answering (VideoQA). Previous efforts in benchmark design have relied on human expertise to design challenging questions, yet we experimentally show that humans struggle to predict which questions are difficult for machine learning models. Our automatic approach leverages recent advances in code generation for visual question answering, using the complexity of generated code as a proxy for question difficulty. We demonstrate that this measure correlates significantly better with model performance than human estimates. To operationalize this insight, we propose an algorithm for estimating question complexity from code. It identifies fine-grained primitives that correlate with the hardest questions for any given set of models, making it easy to scale to new approaches in the future. Finally, to further illustrate the utility of our method, we extend it to automatically generate complex questions, constructing a new benchmark that is 1.9 times harder than the popular NExT-QA.
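The abstract's core proxy is the complexity of the program that a code-generation model emits for a given question. As a minimal illustrative sketch (not the paper's actual algorithm; the `video.query`-style visual API below is invented for illustration), one could score a generated visual program by counting its AST nodes:

```python
import ast

def program_complexity(code: str) -> int:
    """Count AST nodes as a crude complexity score for a generated program."""
    tree = ast.parse(code)
    return sum(1 for _ in ast.walk(tree))

# Hypothetical visual programs for an easy and a hard question.
simple = "answer = video.query('what color is the car?')"
complex_q = (
    "clips = video.find('person enters room')\n"
    "for clip in clips:\n"
    "    if clip.query('is the light on?') == 'yes':\n"
    "        answer = clip.after().query('what happens next?')\n"
)
assert program_complexity(complex_q) > program_complexity(simple)
```

The paper's fine-grained analysis presumably goes further, identifying which specific primitives (e.g. temporal operations) correlate with model failures, rather than using a single scalar count.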
Related papers
- Efficient and Scalable Neural Symbolic Search for Knowledge Graph Complex Query Answering [50.1887329564127]
We propose an efficient and scalable symbolic search framework for complex queries. Our framework reduces the computational load of symbolic methods by 90% while maintaining nearly the same performance.
arXiv Detail & Related papers (2025-05-13T01:24:09Z)
- DEXTER: A Benchmark for open-domain Complex Question Answering using LLMs [3.24692739098077]
Open-domain complex Question Answering (QA) is a difficult task with challenges in evidence retrieval and reasoning.
We evaluate state-of-the-art pre-trained dense and sparse retrieval models in an open-domain setting.
We observe that late-interaction models and, surprisingly, lexical models like BM25 perform well compared to other pre-trained dense retrieval models.
arXiv Detail & Related papers (2024-06-24T22:09:50Z)
- Answering from Sure to Uncertain: Uncertainty-Aware Curriculum Learning for Video Question Answering [63.12469700986452]
We introduce the concept of uncertainty-aware curriculum learning (CL)
Here, uncertainty serves as the guiding principle for dynamically adjusting the difficulty.
In practice, we seamlessly integrate the VideoQA model into our framework and conduct comprehensive experiments.
arXiv Detail & Related papers (2024-01-03T02:29:34Z)
- A Lightweight Method to Generate Unanswerable Questions in English [18.323248259867356]
We examine a simpler data augmentation method for unanswerable question generation in English.
We perform antonym and entity swaps on answerable questions.
Compared to the prior state-of-the-art, data generated with our training-free and lightweight strategy results in better models.
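The swap idea above can be sketched in a few lines. This is an illustrative toy version only: the swap table below is hypothetical, and a real pipeline would rely on antonym lexicons and named-entity recognition rather than a fixed dictionary.

```python
# Hypothetical swap table for illustration; a real pipeline would use
# antonym lexicons and named-entity recognition instead.
ENTITY_SWAPS = {"Paris": "Madrid", "1905": "1923"}

def make_unanswerable(question: str) -> str:
    """Swap the first matching entity so the question no longer
    matches its supporting context (and thus has no answer there)."""
    for entity, replacement in ENTITY_SWAPS.items():
        if entity in question:
            return question.replace(entity, replacement)
    return question

swapped = make_unanswerable("When did Einstein move to Paris?")
# `swapped` now asks about Madrid, which the original context never mentions,
# so a QA model trained on such pairs can learn to abstain.
```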
arXiv Detail & Related papers (2023-10-30T10:14:52Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- Successive Prompting for Decomposing Complex Questions [50.00659445976735]
Recent works leverage the capabilities of large language models (LMs) to perform complex question answering in a few-shot setting.
We introduce "Successive Prompting", where we iteratively break down a complex task into a simple task, solve it, and then repeat the process until we reach the final solution.
Our best model (with successive prompting) achieves an improvement of 5% absolute F1 on a few-shot version of the DROP dataset.
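The iterative break-down-and-solve loop can be sketched as follows. The prompt wording and the `llm` callable are assumptions for illustration, not the paper's actual prompts or model interface:

```python
def decompose_and_solve(question, llm, max_steps=5):
    """Successive-prompting sketch: repeatedly ask the model for the next
    simple sub-question, answer it, and fold the Q/A pair back into the
    context until the model signals it is done."""
    context = question
    for _ in range(max_steps):
        sub_q = llm(f"Context: {context}\nNext simple sub-question (or DONE):")
        if sub_q.strip() == "DONE":
            break
        sub_a = llm(f"Answer briefly: {sub_q}")
        context += f"\nQ: {sub_q}\nA: {sub_a}"
    return llm(f"{context}\nFinal answer:")
```

Because each sub-question is answered in isolation, the approach trades extra model calls for simpler individual reasoning steps.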
arXiv Detail & Related papers (2022-12-08T06:03:38Z)
- Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model [15.411214563867548]
We propose an end-to-end neural complexity-controllable question generation model, which incorporates a mixture of experts (MoE) as the selector of soft templates.
Our method introduces a novel, cross-domain complexity estimator to assess the complexity of a question.
The experimental results on two benchmark QA datasets demonstrate that our QG model is superior to state-of-the-art methods in both automatic and manual evaluation.
arXiv Detail & Related papers (2021-10-13T08:16:52Z)
- Asking Questions Like Educational Experts: Automatically Generating Question-Answer Pairs on Real-World Examination Data [10.353009081072992]
This paper addresses the question-answer pair generation task on real-world examination data and proposes a new unified framework on RACE.
We propose a multi-agent communication model to generate and optimize the question and keyphrases iteratively, and then apply the generated question and keyphrases to guide the generation of answers.
Experimental results show that our model achieves substantial improvements on the question-answer pair generation task.
arXiv Detail & Related papers (2021-09-11T04:10:57Z)
- Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision.
Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
- Understanding Unnatural Questions Improves Reasoning over Text [54.235828149899625]
Complex question answering (CQA) over raw text is a challenging task.
Learning an effective CQA model requires large amounts of human-annotated data.
We address the challenge of learning a high-quality programmer (parser) by projecting natural human-generated questions into unnatural machine-generated questions.
arXiv Detail & Related papers (2020-10-19T10:22:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.