Spatio-Temporal Dynamic Inference Network for Group Activity Recognition
- URL: http://arxiv.org/abs/2108.11743v1
- Date: Thu, 26 Aug 2021 12:40:20 GMT
- Title: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition
- Authors: Hangjie Yuan, Dong Ni, Mang Wang
- Abstract summary: Group activity recognition aims to understand the activity performed by a group of people.
Previous methods are limited in reasoning on a predefined graph, which ignores the person-specific context.
We propose the Dynamic Inference Network (DIN), which is composed of a Dynamic Relation (DR) module and a Dynamic Walk (DW) module.
- Score: 7.007702816885332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Group activity recognition aims to understand the activity performed by a
group of people. In order to solve it, modeling complex spatio-temporal
interactions is the key. Previous methods are limited in reasoning on a
predefined graph, which ignores the inherent person-specific interaction
context. Moreover, they adopt inference schemes that are computationally
expensive and easily result in the over-smoothing problem. In this paper, we
manage to achieve spatio-temporal person-specific inferences by proposing
Dynamic Inference Network (DIN), which is composed of a Dynamic Relation (DR) module
and a Dynamic Walk (DW) module. We first propose to initialize interaction
fields on a primary spatio-temporal graph. Within each interaction field, we
apply DR to predict the relation matrix and DW to predict the dynamic walk
offsets in a joint-processing manner, thus forming a person-specific
interaction graph. By updating features on the specific graph, a person can
possess a global-level interaction field with a local initialization.
Experiments indicate both modules' effectiveness. Moreover, DIN achieves
significant improvement compared to previous state-of-the-art methods on two
popular datasets under the same setting, while incurring much less computational
overhead in the reasoning module.
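The update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the grid layout of the T x N graph, the square interaction field, the linear predictors `w_rel` and `w_walk`, the residual update, and the nearest-neighbour rounding of offsets are all assumptions made here for illustration. In particular, rounding is not differentiable, so a trainable version would need a differentiable sampling scheme.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Multiply a (rows x len(vec)) weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def interaction_field(t, n, T, N, radius=1):
    """Assumed initial field: grid neighbours of (t, n), clamped to the graph."""
    return [(min(max(t + dt, 0), T - 1), min(max(n + dn, 0), N - 1))
            for dt in range(-radius, radius + 1)
            for dn in range(-radius, radius + 1)]


def din_update(feats, w_rel, w_walk, radius=1):
    """One person-specific update over a T x N grid of D-dim features.

    feats:  T x N x D nested lists (frames x persons x channels)
    w_rel:  K x (K*D) weights predicting one relation score per field member
    w_walk: 2K x (K*D) weights predicting a (dt, dn) offset per field member
    where K = (2*radius + 1) ** 2 is the field size.
    """
    T, N = len(feats), len(feats[0])
    out = []
    for t in range(T):
        row = []
        for n in range(N):
            field = interaction_field(t, n, T, N, radius)
            stacked = [v for (ft, fn) in field for v in feats[ft][fn]]
            # DR-like step: relation weights over the field members.
            relation = softmax(linear(w_rel, stacked))
            # DW-like step: a walk offset for each field member.
            offsets = linear(w_walk, stacked)
            updated = list(feats[t][n])  # residual connection (assumption)
            for k, (ft, fn) in enumerate(field):
                # Walk to the shifted node; rounding and clamping are
                # simplifications chosen for this sketch.
                wt = min(max(ft + round(offsets[2 * k]), 0), T - 1)
                wn = min(max(fn + round(offsets[2 * k + 1]), 0), N - 1)
                for d, v in enumerate(feats[wt][wn]):
                    updated[d] += relation[k] * v
            row.append(updated)
        out.append(row)
    return out


# Small demo with random features and random (untrained) predictors.
random.seed(0)
T, N, D, K = 3, 4, 2, 9
demo_feats = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
              for _ in range(T)]
demo_w_rel = [[random.gauss(0, 0.1) for _ in range(K * D)] for _ in range(K)]
demo_w_walk = [[random.gauss(0, 0.5) for _ in range(K * D)]
               for _ in range(2 * K)]
demo_out = din_update(demo_feats, demo_w_rel, demo_w_walk)
```

Because both the relation weights and the offsets are predicted from the features inside each person's own field, every person ends up aggregating over a different set of nodes, which is the sense in which the resulting graph is person-specific. The field stays small (K members), yet the offsets let it reach nodes outside the initial neighbourhood.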
Related papers
- Relation Learning and Aggregate-attention for Multi-person Motion Prediction [13.052342503276936]
Multi-person motion prediction considers not just the skeleton structures or human trajectories but also the interactions between others.
Previous methods often overlook that the joint relations within an individual (intra-relation) and interactions among groups (inter-relation) are distinct types of representations.
We introduce a new collaborative framework for multi-person motion prediction that explicitly models these relations.
arXiv Detail & Related papers (2024-11-06T07:48:30Z)
- Interaction Event Forecasting in Multi-Relational Recursive HyperGraphs: A Temporal Point Process Approach [12.142292322071299]
This work addresses the problem of forecasting higher-order interaction events in multi-relational recursive hypergraphs.
The proposed model, Relational Recursive Hyperedge Temporal Point Process (RRHyperTPP), uses an encoder that learns a dynamic node representation based on the historical interaction patterns.
We have experimentally shown that our models perform better than previous state-of-the-art methods for interaction forecasting.
arXiv Detail & Related papers (2024-04-27T15:46:54Z)
- TimeGraphs: Graph-based Temporal Reasoning [64.18083371645956]
TimeGraphs is a novel approach that characterizes dynamic interactions as a hierarchical temporal graph.
Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales.
We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset.
arXiv Detail & Related papers (2024-01-06T06:26:49Z)
- Temporal Aggregation and Propagation Graph Neural Networks for Dynamic Representation [67.26422477327179]
Temporal graphs exhibit dynamic interactions between nodes over continuous time.
We propose a novel method of temporal graph convolution with the whole neighborhood.
Our proposed TAP-GNN outperforms existing temporal graph methods by a large margin in terms of both predictive performance and online inference latency.
arXiv Detail & Related papers (2023-04-15T08:17:18Z)
- Spatial Parsing and Dynamic Temporal Pooling networks for Human-Object Interaction detection [30.896749712316222]
This paper introduces the Spatial Parsing and Dynamic Temporal Pooling (SPDTP) network, which takes the entire video as a temporal graph with human and object nodes as input.
We achieve state-of-the-art performance on the CAD-120 and Something-Else datasets.
arXiv Detail & Related papers (2022-06-07T07:26:06Z)
- Dynamic Relation Discovery and Utilization in Multi-Entity Time Series Forecasting [92.32415130188046]
In many real-world scenarios, there may exist crucial yet implicit relations between entities.
We propose an attentional multi-graph neural network with automatic graph learning (A2GNN) in this work.
arXiv Detail & Related papers (2022-02-18T11:37:04Z)
- Dynamic Representation Learning with Temporal Point Processes for Higher-Order Interaction Forecasting [8.680676599607123]
This paper proposes a temporal point process model for hyperedge prediction to address these problems.
To the best of our knowledge, this is the first work that uses the temporal point process to forecast hyperedges in dynamic networks.
arXiv Detail & Related papers (2021-12-19T14:24:37Z)
- ConTIG: Continuous Representation Learning on Temporal Interaction Graphs [32.25218861788686]
ConTIG is a continuous representation method that captures the continuous dynamic evolution of node embedding trajectories.
Our model exploits three factors in dynamic networks: the latest interaction, neighbor features, and inherent characteristics.
Experimental results demonstrate the superiority of ConTIG on temporal link prediction, temporal node recommendation, and dynamic node classification tasks.
arXiv Detail & Related papers (2021-09-27T12:11:24Z)
- Learning Dual Dynamic Representations on Time-Sliced User-Item Interaction Graphs for Sequential Recommendation [62.30552176649873]
We devise a novel Dynamic Representation Learning model for Sequential Recommendation (DRL-SRe).
To better model the user-item interactions for characterizing the dynamics from both sides, the proposed model builds a global user-item interaction graph for each time slice.
To enable the model to capture fine-grained temporal information, we propose an auxiliary temporal prediction task over consecutive time slices.
arXiv Detail & Related papers (2021-09-24T07:44:27Z)
- Modeling long-term interactions to enhance action recognition [81.09859029964323]
We propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels.
We use a region-based approach that takes as input a primary region roughly corresponding to the user hands and a set of secondary regions potentially corresponding to the interacting objects.
The proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks.
arXiv Detail & Related papers (2021-04-23T10:08:15Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.