In Defense of Structural Symbolic Representation for Video Event-Relation Prediction
- URL: http://arxiv.org/abs/2301.03410v2
- Date: Wed, 12 Apr 2023 15:19:16 GMT
- Title: In Defense of Structural Symbolic Representation for Video Event-Relation Prediction
- Authors: Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang
- Abstract summary: We conduct an empirical analysis to answer the following questions: 1) why SSR-based methods failed; 2) how to understand the evaluation setting of video event-relation prediction properly; and 3) how to uncover the potential of SSR-based methods.
We propose to further contextualize the SSR-based model to an Event-Sequence Model and equip it with more factual knowledge through a simple yet effective way of reformulating external visual commonsense knowledge bases into an event-relation prediction pretraining dataset.
- Score: 44.528350052251334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding event relationships in videos requires a model to understand
the underlying structures of events (i.e. the event type, the associated
argument roles, and corresponding entities) and factual knowledge for
reasoning. Structural symbolic representation (SSR) based methods directly take
event types and associated argument roles/entities as inputs to perform
reasoning. However, the state-of-the-art video event-relation prediction system
shows the necessity of using continuous feature vectors from input videos;
existing methods based solely on SSR inputs fail completely, even when given
oracle event types and argument roles. In this paper, we conduct an extensive
empirical analysis to answer the following questions: 1) why SSR-based methods
failed; 2) how to understand the evaluation setting of video event-relation
prediction properly; 3) how to uncover the potential of SSR-based methods. We
first identify suboptimal training settings as the cause of the failure of
previous SSR-based video event prediction models. Then, through qualitative and
quantitative analysis, we show that evaluation using only video as input is
currently infeasible, and that oracle event information is required to obtain
an accurate evaluation. Based on these findings, we propose to further
contextualize the SSR-based model to an Event-Sequence Model and equip it with
more factual knowledge through a simple yet effective way of reformulating
external visual commonsense knowledge bases into an event-relation prediction
pretraining dataset. The resulting model establishes a new state of the art
with a 25% Macro-accuracy improvement.
Related papers
- Event-aware Video Corpus Moment Retrieval [79.48249428428802]
Video Corpus Moment Retrieval (VCMR) is a practical video retrieval task focused on identifying a specific moment within a vast corpus of untrimmed videos.
Existing methods for VCMR typically rely on frame-aware video retrieval, calculating similarities between the query and video frames to rank videos.
We propose EventFormer, a model that explicitly utilizes events within videos as fundamental units for video retrieval.
arXiv Detail & Related papers (2024-02-21T06:55:20Z)
- Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering [14.659023742381777]
Event-Level Video Question Answering (EVQA) requires complex reasoning across video events to provide optimal answers.
We propose a semantic-aware dynamic retrospective-prospective reasoning approach for video-based question answering.
Our proposed approach achieves superior performance compared to previous state-of-the-art models.
arXiv Detail & Related papers (2023-05-14T03:57:11Z)
- Event Knowledge Incorporation with Posterior Regularization for Event-Centric Question Answering [32.03893317439898]
We propose a strategy to incorporate event knowledge extracted from event trigger annotations via posterior regularization.
In particular, we define event-related knowledge constraints based on the event trigger annotations in the QA datasets.
We conduct experiments on two event-centric QA datasets, TORQUE and ESTER.
arXiv Detail & Related papers (2023-05-08T07:45:12Z)
- Event-Centric Question Answering via Contrastive Learning and Invertible Event Transformation [29.60817278635999]
We propose a novel QA model with contrastive learning and invertible event transformation, called TranCLR.
Our proposed model utilizes an invertible transformation matrix to project semantic vectors of events into a common event embedding space, trained with contrastive learning, and thus naturally injects event semantic knowledge into mainstream QA pipelines.
arXiv Detail & Related papers (2022-10-24T01:15:06Z)
- Accessing and Interpreting OPC UA Event Traces based on Semantic Process Descriptions [69.9674326582747]
This paper proposes an approach to access a production system's event data based on the event data's context.
The approach extracts filtered event logs from a database system by combining: 1) a semantic model of a production system's hierarchical structure, 2) a formalized process description and 3) an OPC UA information model.
arXiv Detail & Related papers (2022-07-25T15:13:44Z)
- Improve Event Extraction via Self-Training with Gradient Guidance [10.618929821822892]
We propose a Self-Training with Feedback (STF) framework to overcome the main factor that hinders the progress of event extraction.
STF consists of (1) a base event extraction model trained on existing event annotations and then applied to large-scale unlabeled corpora to predict new event mentions as pseudo training samples, and (2) a novel scoring model that takes in each new predicted event trigger, an argument, its argument role, as well as their paths in the AMR graph to estimate a compatibility score.
Experimental results are reported on three benchmark datasets: ACE05-E, ACE05-E+, and ERE.
arXiv Detail & Related papers (2022-05-25T04:40:17Z)
- Event Data Association via Robust Model Fitting for Event-based Object Tracking [66.05728523166755]
We propose a novel Event Data Association (called EDA) approach to explicitly address the event association and fusion problem.
The proposed EDA seeks event trajectories that best fit the event data, in order to perform unified data association and information fusion.
The experimental results show the effectiveness of EDA under challenging scenarios, such as high speed, motion blur, and high dynamic range conditions.
arXiv Detail & Related papers (2021-10-25T13:56:00Z)
- Deconfounded Video Moment Retrieval with Causal Intervention [80.90604360072831]
We tackle the task of video moment retrieval (VMR), which aims to localize a specific moment in a video according to a textual query.
Existing methods primarily model the matching relationship between query and moment by complex cross-modal interactions.
We propose a causality-inspired VMR framework that builds a structural causal model to capture the true effect of query and video content on the prediction.
arXiv Detail & Related papers (2021-06-03T01:33:26Z)
- Online Learning Probabilistic Event Calculus Theories in Answer Set Programming [70.06301658267125]
Complex Event Recognition (CER) systems detect event occurrences in streaming, time-stamped datasets using predefined event patterns.
We present a system based on Answer Set Programming (ASP), capable of probabilistic reasoning with complex event patterns in the form of rules weighted in the Event Calculus.
Our results demonstrate the superiority of our novel approach, in terms of both efficiency and predictive accuracy.
arXiv Detail & Related papers (2021-03-31T23:16:29Z)
- Self-supervised pre-training and contrastive representation learning for multiple-choice video QA [39.78914328623504]
Video Question Answering (Video QA) requires fine-grained understanding of both video and language modalities to answer the given questions.
We propose novel training schemes for multiple-choice video question answering, with a self-supervised pre-training stage and supervised contrastive learning as an auxiliary task in the main training stage.
We evaluate our proposed model on highly competitive benchmark datasets related to multiple-choice video QA: TVQA, TVQA+, and DramaQA.
arXiv Detail & Related papers (2020-09-17T03:37:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.