MRSN: Multi-Relation Support Network for Video Action Detection
- URL: http://arxiv.org/abs/2304.11975v1
- Date: Mon, 24 Apr 2023 10:15:31 GMT
- Title: MRSN: Multi-Relation Support Network for Video Action Detection
- Authors: Yin-Dong Zheng, Guo Chen, Minglei Yuan, Tong Lu
- Abstract summary: Action detection is a challenging video understanding task requiring modeling relations.
We propose a novel network called Multi-Relation Support Network (MRSN).
Our experiments demonstrate that modeling relations separately and performing relation-level interactions can achieve state-of-the-art results.
- Score: 15.82531313330869
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Action detection is a challenging video understanding task, requiring
modeling spatio-temporal and interaction relations. Current methods usually
model actor-actor and actor-context relations separately, ignoring their
complementarity and mutual support. To solve this problem, we propose a novel
network called Multi-Relation Support Network (MRSN). In MRSN, Actor-Context
Relation Encoder (ACRE) and Actor-Actor Relation Encoder (AARE) model the
actor-context and actor-actor relations separately. Then Relation Support
Encoder (RSE) computes the supports between the two relations and performs
relation-level interactions. Finally, Relation Consensus Module (RCM) enhances
two relations with the long-term relations from the Long-term Relation Bank
(LRB) and yields a consensus. Our experiments demonstrate that modeling
relations separately and performing relation-level interactions can match or
outperform state-of-the-art results on two challenging video datasets: AVA
and UCF101-24.
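The data flow described in the abstract (ACRE and AARE in parallel, then RSE, then RCM with the LRB) can be sketched roughly as below. This is only an illustrative toy under strong assumptions, not the authors' implementation: the helper names `attend`, `add`, and `mrsn_sketch` are hypothetical, every module is replaced by a plain scaled dot-product attention over Python lists, and the "consensus" is assumed to be a simple average.

```python
import math


def attend(queries, keys):
    """Scaled dot-product attention; the keys also serve as the values."""
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * k[i] for w, k in zip(weights, keys))
                    for i in range(dim)])
    return out


def add(xs, ys):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def mrsn_sketch(actors, context, long_term_bank):
    # Assumed stand-ins for ACRE and AARE: the two relations are modeled separately.
    actor_context = attend(actors, context)
    actor_actor = attend(actors, actors)
    # Assumed stand-in for RSE: each relation is supported by the other one.
    ac_supported = add(actor_context, attend(actor_context, actor_actor))
    aa_supported = add(actor_actor, attend(actor_actor, actor_context))
    # Assumed stand-in for RCM: enhance both relations with the long-term bank,
    # then average them into one consensus feature per actor.
    ac_long = add(ac_supported, attend(ac_supported, long_term_bank))
    aa_long = add(aa_supported, attend(aa_supported, long_term_bank))
    return [[(a + b) / 2 for a, b in zip(x, y)] for x, y in zip(ac_long, aa_long)]


# Toy inputs: 2 actors, 3 context features, 1 long-term feature, all 2-dimensional.
actors = [[1.0, 0.0], [0.0, 1.0]]
context = [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]]
long_term_bank = [[0.2, 0.1]]
consensus = mrsn_sketch(actors, context, long_term_bank)
```

The result holds one consensus vector per actor, which a real detector would pass to an action classifier; the learned projections, normalization, and training objective are all omitted here.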
Related papers
- CycleACR: Cycle Modeling of Actor-Context Relations for Video Action
Detection [67.90338302559672]
We propose to select actor-related scene context, rather than directly leverage raw video scenario, to improve relation modeling.
We develop a Cycle Actor-Context Relation network (CycleACR) where there is a symmetric graph that models the actor and context relations in a bidirectional form.
Compared to existing designs that focus on C2A-E, our CycleACR introduces A2C-R for more effective relation modeling.
arXiv Detail & Related papers (2023-03-28T16:40:47Z)
- Modelling Multi-relations for Convolutional-based Knowledge Graph
Embedding [0.2752817022620644]
It is considered that such approaches disconnect the semantic connection of multi-relations between an entity pair.
We propose a convolutional and multi-relational learning model, ConvMR.
We show that ConvMR is efficient at handling less frequent entities.
arXiv Detail & Related papers (2022-10-21T03:43:06Z)
- Modeling Multi-Label Action Dependencies for Temporal Action
Localization [53.53490517832068]
Real-world videos contain many complex actions with inherent relationships between action classes.
We propose an attention-based architecture that models these action relationships for the task of temporal action localization in untrimmed videos.
We show improved performance over state-of-the-art methods on multi-label action localization benchmarks.
arXiv Detail & Related papers (2021-03-04T13:37:28Z)
- DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act
Recognition and Sentiment Classification [77.59549450705384]
In dialog systems, dialog act recognition and sentiment classification are two correlative tasks.
Most of the existing systems either treat them as separate tasks or just jointly model the two tasks.
We propose a Deep Co-Interactive Relation Network (DCR-Net) to explicitly consider the cross-impact and model the interaction between the two tasks.
arXiv Detail & Related papers (2020-08-16T14:13:32Z)
- Actor-Context-Actor Relation Network for Spatio-Temporal Action
Localization [47.61419011906561]
ACAR-Net builds upon a novel High-order Relation Reasoning Operator to enable indirect reasoning for spatio-temporal action localization.
Our method ranks first in the AVA-Kinetics action localization task of ActivityNet Challenge 2020.
arXiv Detail & Related papers (2020-06-14T18:51:49Z)
- Relation of the Relations: A New Paradigm of the Relation Extraction
Problem [52.21210549224131]
We propose a new paradigm of Relation Extraction (RE) that considers as a whole the predictions of all relations in the same context.
We develop a data-driven approach that does not require hand-crafted rules but learns by itself the relation of relations (RoR) using Graph Neural Networks and a relation matrix transformer.
Experiments show that our model outperforms the state-of-the-art approaches by +1.12% on the ACE05 dataset and +2.55% on SemEval 2018 Task 7.2.
arXiv Detail & Related papers (2020-06-05T22:25:27Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)