Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet
Convolutional Network
- URL: http://arxiv.org/abs/2111.13361v1
- Date: Fri, 26 Nov 2021 08:41:51 GMT
- Title: Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet
Convolutional Network
- Authors: Maysam Behmanesh, Peyman Adibi, Mohammad Saeed Ehsani, Jocelyn
Chanussot
- Abstract summary: Multimodal data provide information about a natural phenomenon by integrating data from various domains with very different statistical properties.
Capturing the intra-modality and cross-modality information of multimodal data is the essential capability of multimodal learning methods.
Generalizing deep learning methods to non-Euclidean domains is an emerging research field.
- Score: 21.06669693699965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal data provide complementary information about a natural phenomenon by
integrating data from various domains with very different statistical
properties. Capturing the intra-modality and cross-modality information of
multimodal data is the essential capability of multimodal learning methods.
Geometry-aware data analysis approaches provide these capabilities by
implicitly representing data in various modalities based on their underlying
geometric structures. Moreover, in many applications, data are explicitly
defined on an intrinsic geometric structure. Generalizing deep learning methods
to non-Euclidean domains is an emerging research field that has recently been
investigated in many studies, but most of the popular methods are developed for
unimodal data. In this paper, a multimodal multi-scaled graph wavelet
convolutional network (M-GWCN) is proposed as an end-to-end network. M-GWCN
simultaneously finds intra-modality representations by applying the multiscale
graph wavelet transform, which provides helpful localization properties in the
graph domain of each modality, and cross-modality representations by learning
permutations that encode correlations among the modalities. M-GWCN is limited
neither to homogeneous modalities with the same number of samples nor to prior
knowledge of correspondences between modalities. Several semi-supervised node
classification experiments were conducted on three popular unimodal explicit
graph-based datasets and five multimodal implicit ones. The experimental
results indicate the superiority and effectiveness of the proposed method
compared with both spectral graph-domain convolutional neural networks and
state-of-the-art multimodal methods.
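The multiscale graph wavelet transform at the core of M-GWCN can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the kernel choice g(x) = x·e^{-x}, the function names, and the simple scale-averaging in `wavelet_conv` are illustrative assumptions; the general idea, applying spectrally defined band-pass filters at several scales to obtain localized node representations, is what the abstract describes.

```python
import numpy as np

def graph_wavelet_basis(adj, scales, kernel=lambda x: x * np.exp(-x)):
    """Spectral graph wavelet operators at several scales.

    For each scale s, psi_s = U g(s * Lambda) U^T, where
    L = U Lambda U^T is the eigendecomposition of the normalized
    graph Laplacian and g is a band-pass kernel (here g(x) = x e^{-x},
    a common illustrative choice).
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam, u = np.linalg.eigh(lap)  # Laplacian is symmetric, so eigh applies
    return [u @ np.diag(kernel(s * lam)) @ u.T for s in scales]

def wavelet_conv(adj, x, w, scales):
    """One multiscale wavelet filtering step: filter node features x
    with each scale's wavelet operator, apply a shared weight matrix w,
    and average the scales (a simplification of learned scale mixing)."""
    psis = graph_wavelet_basis(adj, scales)
    return sum(psi @ x @ w for psi in psis) / len(psis)
```

Because each wavelet operator is a polynomial-in-Laplacian filter, it is symmetric and acts locally around each node, which is the localization property the abstract highlights; a full model would learn the scale-mixing weights per modality rather than averaging.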
Related papers
- Enhancing Deep Learning Models through Tensorization: A Comprehensive
Survey and Framework [0.0]
This paper explores the steps involved in handling multidimensional data sources, the various multiway analysis methods employed, and the benefits of these approaches.
A small example of Blind Source Separation (BSS) is presented comparing 2-dimensional algorithms and a multiway algorithm in Python.
Results indicate that multiway analysis is more expressive.
arXiv Detail & Related papers (2023-09-05T17:56:22Z) - Generalized Product-of-Experts for Learning Multimodal Representations
in Noisy Environments [18.14974353615421]
We propose a novel method for multimodal representation learning in a noisy environment via the generalized product of experts technique.
In the proposed method, we train a separate network for each modality to assess the credibility of information coming from that modality.
We attain state-of-the-art performance on two challenging benchmarks: multimodal 3D hand-pose estimation and multimodal surgical video segmentation.
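The fusion rule behind a generalized product of experts can be sketched numerically. This is a minimal sketch under the assumption of Gaussian per-modality experts; `gpoe_fuse` and the credibility weights are illustrative names, not the paper's API, with the per-modality credibility network replaced by fixed scalar weights.

```python
import numpy as np

def gpoe_fuse(means, variances, credibilities):
    """Fuse per-modality Gaussian estimates with a generalized
    product of experts: each expert's precision (inverse variance)
    is scaled by a credibility weight before the standard
    precision-weighted PoE combination."""
    precisions = [c / v for c, v in zip(credibilities, variances)]
    fused_prec = sum(precisions)
    fused_mean = sum(p * m for p, m in zip(precisions, means)) / fused_prec
    return fused_mean, 1.0 / fused_prec
```

With equal credibilities this reduces to the ordinary product of experts; driving one modality's credibility toward zero (e.g. when its input is noisy) makes the fused estimate fall back on the remaining modalities, which is the noise-robustness the summary describes.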
arXiv Detail & Related papers (2022-11-07T14:27:38Z) - Convolutional Learning on Multigraphs [153.20329791008095]
We develop convolutional information processing on multigraphs and introduce convolutional multigraph neural networks (MGNNs).
To capture the complex dynamics of information diffusion within and across each of the multigraph's classes of edges, we formalize a convolutional signal processing model.
We develop a multigraph learning architecture, including a sampling procedure to reduce computational complexity.
The introduced architecture is applied towards optimal wireless resource allocation and a hate speech localization task, offering improved performance over traditional graph neural networks.
arXiv Detail & Related papers (2022-09-23T00:33:04Z) - Geometric multimodal representation learning [13.159512679346687]
Multimodal learning methods fuse multiple data modalities while leveraging cross-modal dependencies.
We put forward an algorithmic blueprint for multimodal graph learning based on a categorization of existing methods.
This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.
arXiv Detail & Related papers (2022-09-07T16:59:03Z) - Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z) - A graph representation based on fluid diffusion model for multimodal
data analysis: theoretical aspects and enhanced community detection [14.601444144225875]
We introduce a novel model for graph definition based on fluid diffusion.
Our method is able to strongly outperform state-of-the-art schemes for community detection in multimodal data analysis.
arXiv Detail & Related papers (2021-12-07T16:30:03Z) - Multiplex Graph Networks for Multimodal Brain Network Analysis [30.195666008281915]
We propose MGNet, a simple and effective multiplex graph convolutional network (GCN) model for multimodal brain network analysis.
We conduct a classification task on two challenging real-world datasets (HIV and Bipolar disorder).
arXiv Detail & Related papers (2021-07-31T06:01:29Z) - Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifolds, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence).
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z) - A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network
Representation Learning [52.83948119677194]
We propose a multi-semantic metapath (MSM) model for large-scale heterogeneous network representation learning.
Specifically, we generate multi-semantic metapath-based random walks to construct the heterogeneous neighborhood to handle the unbalanced distributions.
We conduct systematic evaluations of the proposed framework on two challenging datasets: Amazon and Alibaba.
arXiv Detail & Related papers (2020-07-19T22:50:20Z) - MS-Net: Multi-Site Network for Improving Prostate Segmentation with
Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z) - Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters, by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.