A Universal Model for Cross Modality Mapping by Relational Reasoning
- URL: http://arxiv.org/abs/2102.13360v1
- Date: Fri, 26 Feb 2021 08:56:24 GMT
- Title: A Universal Model for Cross Modality Mapping by Relational Reasoning
- Authors: Zun Li, Congyan Lang, Liqian Liang, Tao Wang, Songhe Feng, Jun Wu, and
Yidong Li
- Abstract summary: Cross modality mapping has attracted growing attention in the computer vision community.
We propose a GCN-based Relational Reasoning Network (RR-Net) in which inter and intra relations are efficiently computed.
Experiments on three example tasks, i.e., image classification, social recommendation and sound recognition, clearly demonstrate the superiority and universality of our proposed model.
- Score: 29.081989993636338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the aim of matching a pair of instances from two different modalities,
cross modality mapping has attracted growing attention in the computer vision
community. Existing methods usually formulate the mapping function as the
similarity measure between the pair of instance features, which are embedded into
a common space. However, we observe that the relationships among the instances
within a single modality (intra relations) and those between the pair of
heterogeneous instances (inter relations) are insufficiently explored in
previous approaches. Motivated by this, we redefine the mapping function with
relational reasoning via graph modeling, and further propose a GCN-based
Relational Reasoning Network (RR-Net) in which inter and intra relations are
efficiently computed to universally resolve the cross modality mapping problem.
Concretely, we first construct two kinds of graph, i.e., Intra Graph and Inter
Graph, to respectively model intra relations and inter relations. Then RR-Net
updates all the node features and edge features in an iterative manner for
learning intra and inter relations simultaneously. Last, RR-Net outputs the
probabilities over the edges which link a pair of heterogeneous instances to
estimate the mapping results. Extensive experiments on three example tasks,
i.e., image classification, social recommendation and sound recognition,
clearly demonstrate the superiority and universality of our proposed model.
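The abstract describes RR-Net at a high level: build an Intra Graph within each modality and an Inter Graph across modalities, update node and edge features iteratively, and read the mapping off probabilities on the inter-modality edges. Below is a minimal PyTorch sketch of that idea under stated assumptions; the cosine-similarity graph construction, the GRU node update, the sigmoid edge scoring, and all layer sizes are illustrative choices, not the authors' released implementation.

```python
# Minimal sketch of intra/inter relational reasoning across two modalities.
import torch
import torch.nn as nn
import torch.nn.functional as F


def intra_adjacency(x: torch.Tensor) -> torch.Tensor:
    """Dense intra-modality adjacency from pairwise cosine similarity (assumed design)."""
    sim = F.cosine_similarity(x.unsqueeze(1), x.unsqueeze(0), dim=-1)
    return F.softmax(sim, dim=-1)  # row-normalised edge weights


class RelationalReasoningSketch(nn.Module):
    def __init__(self, dim_a: int, dim_b: int, hidden: int = 128, steps: int = 3):
        super().__init__()
        self.embed_a = nn.Linear(dim_a, hidden)   # modality A features (e.g. images)
        self.embed_b = nn.Linear(dim_b, hidden)   # modality B features (e.g. labels/sounds)
        self.node_update = nn.GRUCell(hidden, hidden)
        self.edge_mlp = nn.Sequential(            # inter-edge feature -> scalar score
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        self.steps = steps

    def _edge_scores(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Pair every node across modalities to form |A| x |B| inter-edge features.
        pair = torch.cat(
            [a.unsqueeze(1).expand(-1, b.size(0), -1),
             b.unsqueeze(0).expand(a.size(0), -1, -1)], dim=-1)
        return torch.sigmoid(self.edge_mlp(pair)).squeeze(-1)

    def forward(self, feats_a: torch.Tensor, feats_b: torch.Tensor) -> torch.Tensor:
        a, b = self.embed_a(feats_a), self.embed_b(feats_b)
        for _ in range(self.steps):
            inter = self._edge_scores(a, b)     # inter relations (Inter Graph)
            a_msg = intra_adjacency(a) @ a      # intra relations (Intra Graph, modality A)
            b_msg = intra_adjacency(b) @ b      # intra relations (Intra Graph, modality B)
            # Nodes absorb intra messages plus edge-weighted cross-modality messages.
            a = self.node_update(a_msg + inter @ b, a)
            b = self.node_update(b_msg + inter.t() @ a, b)
        # Final probabilities over inter edges estimate the cross-modality mapping.
        return self._edge_scores(a, b)


# Toy usage: 5 instances from modality A, 4 candidate instances from modality B.
probs = RelationalReasoningSketch(dim_a=512, dim_b=128)(
    torch.randn(5, 512), torch.randn(4, 128))
print(probs.shape)  # torch.Size([5, 4])
```

In this toy setup the returned matrix holds one probability per (modality-A instance, modality-B instance) edge, mirroring how the abstract says the mapping is estimated from edge probabilities.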
Related papers
- Multi-Relational Graph Neural Network for Out-of-Domain Link Prediction [12.475382123139024]
We introduce a novel Graph Neural Network model, named GOOD, to tackle the out-of-domain generalization problem.
GOOD can effectively generalize predictions beyond known relationship types and achieves state-of-the-art results.
arXiv Detail & Related papers (2024-03-17T18:08:22Z)
- Learning Complete Topology-Aware Correlations Between Relations for Inductive Link Prediction [121.65152276851619]
We show that semantic correlations between relations are inherently edge-level and entity-independent.
We propose a novel subgraph-based method, namely TACO, to model Topology-Aware COrrelations between relations.
To further exploit the potential of RCN, we propose the Complete Common Neighbor induced subgraph.
arXiv Detail & Related papers (2023-09-20T08:11:58Z)
- Modeling Instance Interactions for Joint Information Extraction with Neural High-Order Conditional Random Field [39.055053720433435]
We introduce a joint IE framework (CRFIE) that formulates joint IE as a high-order Conditional Random Field.
Specifically, we design binary factors and ternary factors to directly model interactions between not only a pair of instances but also triplets.
We incorporate a high-order neural decoder that is unfolded from a mean-field variational inference method.
arXiv Detail & Related papers (2022-12-17T18:45:23Z)
- Relation Matters: Foreground-aware Graph-based Relational Reasoning for Domain Adaptive Object Detection [81.07378219410182]
We propose a new and general framework for domain adaptive object detection, named Foreground-aware Graph-based Relational Reasoning (FGRR).
FGRR incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations.
Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art on four domain adaptive object detection benchmarks.
arXiv Detail & Related papers (2022-06-06T05:12:48Z)
- Modelling Neighbor Relation in Joint Space-Time Graph for Video Correspondence Learning [53.74240452117145]
This paper presents a self-supervised method for learning reliable visual correspondence from unlabeled videos.
We formulate the correspondence as finding paths in a joint space-time graph, where nodes are grid patches sampled from frames and are linked by two types of edges.
Our learned representation outperforms the state-of-the-art self-supervised methods on a variety of visual tasks.
arXiv Detail & Related papers (2021-09-28T05:40:01Z)
- Homogeneous and Heterogeneous Relational Graph for Visible-infrared Person Re-identification [20.30508026932434]
Visible-infrared person re-identification (VI Re-ID) aims to match person images between the visible and infrared modalities.
Existing VI Re-ID methods mainly focus on extracting homogeneous structural relationships from a single image.
In this paper, we model the homogeneous structural relationship with a modality-specific graph within each modality.
We then mine the heterogeneous structural correlation across these two modality-specific graphs.
arXiv Detail & Related papers (2021-09-18T02:51:16Z)
- Explicit Pairwise Factorized Graph Neural Network for Semi-Supervised Node Classification [59.06717774425588]
We propose the Explicit Pairwise Factorized Graph Neural Network (EPFGNN), which models the whole graph as a partially observed Markov Random Field.
It contains explicit pairwise factors to model output-output relations and uses a GNN backbone to model input-output relations.
We conduct experiments on various datasets, which show that our model effectively improves performance for semi-supervised node classification on graphs.
arXiv Detail & Related papers (2021-07-27T19:47:53Z)
- Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding (a minimal sketch of this cascading pattern appears after the list).
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
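As a companion to the last entry above, here is a minimal sketch of the cascaded, coarse-to-fine pattern it summarizes: each stage's instance localization step refines the box proposals and an interaction recognition head scores the refined human-object pair. The residual box refinement, the stubbed pair features, and every layer size are assumptions made for illustration, not the paper's implementation.

```python
# Minimal sketch of a multi-stage, coarse-to-fine HOI cascade.
import torch
import torch.nn as nn


class CascadedHOISketch(nn.Module):
    def __init__(self, feat_dim: int = 256, num_actions: int = 10, stages: int = 3):
        super().__init__()
        # Instance localization: each stage predicts a residual box refinement.
        self.refiners = nn.ModuleList(
            [nn.Linear(feat_dim + 8, 8) for _ in range(stages)])
        # Interaction recognition: each stage classifies the refined human-object pair.
        self.recognizers = nn.ModuleList(
            [nn.Linear(feat_dim + 8, num_actions) for _ in range(stages)])

    def forward(self, pair_feats: torch.Tensor, boxes: torch.Tensor):
        # pair_feats: (N, feat_dim) pooled human-centric relation features (stubbed here)
        # boxes: (N, 8) concatenated human and object boxes, (x1, y1, x2, y2) each
        logits = None
        for refine, recognize in zip(self.refiners, self.recognizers):
            stage_in = torch.cat([pair_feats, boxes], dim=-1)
            boxes = boxes + refine(stage_in)   # coarse-to-fine proposal refinement
            logits = recognize(torch.cat([pair_feats, boxes], dim=-1))
        return boxes, logits                   # final proposals and action scores


# Toy usage: 4 human-object pairs with random features and boxes.
boxes, logits = CascadedHOISketch()(torch.randn(4, 256), torch.randn(4, 8))
print(boxes.shape, logits.shape)  # torch.Size([4, 8]) torch.Size([4, 10])
```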
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.