Adaptive Interaction Modeling via Graph Operations Search
- URL: http://arxiv.org/abs/2005.02113v1
- Date: Tue, 5 May 2020 13:01:09 GMT
- Title: Adaptive Interaction Modeling via Graph Operations Search
- Authors: Haoxin Li, Wei-Shi Zheng, Yu Tao, Haifeng Hu, Jian-Huang Lai
- Abstract summary: We automate the process of structure design to learn adaptive structures for interaction modeling.
We experimentally demonstrate that our architecture search framework learns to construct adaptive interaction modeling structures.
Our method achieves performance competitive with the state of the art.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interaction modeling is important for video action analysis. Recently,
several works have designed specific structures to model interactions in videos.
However, these structures are manually designed and non-adaptive, which requires
structure design effort and, more importantly, cannot model interactions
adaptively. In this paper, we automate the process of structure design to
learn adaptive structures for interaction modeling. We propose to search
network structures with a differentiable architecture search mechanism, which
learns to construct adaptive structures for different videos to facilitate
adaptive interaction modeling. To this end, we first design the search space
with several basic graph operations that explicitly capture different relations
in videos. We experimentally demonstrate that our architecture search framework
learns to construct adaptive interaction modeling structures, which provides
more insight into the relations between the structures and certain
interaction characteristics, and also removes the need for manual structure
design. Additionally, we show that the designed basic graph operations
in the search space are able to model different interactions in videos.
Experiments on two interaction datasets show that our method achieves
performance competitive with the state of the art.
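The abstract does not spell out the search mechanism, but the core idea of a differentiable (DARTS-style) search over a set of basic graph operations can be sketched roughly as follows. The specific operations, names, and parameters below are illustrative assumptions for exposition, not the authors' actual implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical "basic graph operations": each maps node features X of
# shape (N, D) to updated features via message passing over a relation graph.
def identity_op(X):
    # no interaction: nodes keep their own features
    return X

def similarity_op(X):
    # graph weighted by pairwise feature similarity (appearance relations)
    A = X @ X.T
    A = np.exp(A - A.max(axis=1, keepdims=True))
    A = A / A.sum(axis=1, keepdims=True)   # row-normalized attention graph
    return A @ X

def temporal_op(X):
    # chain graph linking temporally adjacent nodes
    N = X.shape[0]
    A = np.eye(N, k=1) + np.eye(N, k=-1)
    A = A / np.maximum(A.sum(axis=1, keepdims=True), 1)
    return A @ X

OPS = [identity_op, similarity_op, temporal_op]

def mixed_op(X, alpha):
    """Differentiable mixture: the output is the softmax-weighted sum over
    all candidate operations; alpha is the learnable architecture parameter.
    Conditioning alpha on the input video would make the structure adaptive."""
    w = softmax(alpha)
    return sum(wi * op(X) for wi, op in zip(w, OPS))

X = np.random.default_rng(0).normal(size=(5, 8))  # 5 nodes, 8-dim features
alpha = np.zeros(len(OPS))                        # uniform mixture at start
Y = mixed_op(X, alpha)                            # shape (5, 8)
```

In a full framework, alpha would be trained jointly with the network weights, and the learned weighting selects (or blends) the graph operations best suited to each video.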
Related papers
- Learning Correlation Structures for Vision Transformers [93.22434535223587]
We introduce a new attention mechanism, dubbed structural self-attention (StructSA).
We generate attention maps by recognizing space-time structures of key-query correlations via convolution.
This effectively leverages rich structural patterns in images and videos such as scene layouts, object motion, and inter-object relations.
arXiv Detail & Related papers (2024-04-05T07:13:28Z)
- Learning Hierarchical Relational Representations through Relational Convolutions [2.5322020135765464]
We introduce "relational convolutional networks", a neural architecture equipped with computational mechanisms that capture progressively more complex relational features.
A key component of this framework is a novel operation that captures relational patterns in groups of objects by convolving graphlet filters.
We present the motivation and details of the architecture, together with a set of experiments to demonstrate how relational convolutional networks can provide an effective framework for modeling relational tasks that have hierarchical structure.
arXiv Detail & Related papers (2023-10-05T01:22:50Z)
- Inferring Local Structure from Pairwise Correlations [0.0]
We show that pairwise correlations provide enough information to recover local relations.
This proves to be successful even though higher order interaction structures are present in our data.
arXiv Detail & Related papers (2023-05-07T22:38:29Z)
- Triple-level Model Inferred Collaborative Network Architecture for Video Deraining [43.06607185181434]
We develop a model-guided triple-level optimization framework to deduce network architecture with cooperating optimization and auto-searching mechanism.
Our model shows significant improvements in fidelity and temporal consistency over the state-of-the-art works.
arXiv Detail & Related papers (2021-11-08T13:09:00Z)
- Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects.
We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model.
This work takes a step on dynamics modeling in hand-object interactions from dense tactile sensing.
arXiv Detail & Related papers (2021-09-09T16:04:14Z)
- Unified Graph Structured Models for Video Understanding [93.72081456202672]
We propose a message passing graph neural network that explicitly models spatio-temporal relations.
We show how our method is able to more effectively model relationships between relevant entities in the scene.
arXiv Detail & Related papers (2021-03-29T14:37:35Z)
- Coordination Among Neural Modules Through a Shared Global Workspace [78.08062292790109]
In cognitive science, a global workspace architecture has been proposed in which functionally specialized components share information.
We show that capacity limitations have a rational basis in that they encourage specialization and compositionality.
arXiv Detail & Related papers (2021-03-01T18:43:48Z)
- S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic structures that are capable of simultaneously exploiting both modular and temporal structure.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.