Information Fusion: Scaling Subspace-Driven Approaches
- URL: http://arxiv.org/abs/2204.12035v1
- Date: Tue, 26 Apr 2022 02:16:01 GMT
- Title: Information Fusion: Scaling Subspace-Driven Approaches
- Authors: Sally Ghanem and Hamid Krim
- Abstract summary: We seek to exploit the deep structure of multi-modal data to robustly capture the group subspace distribution of the information using the Convolutional Neural Network (CNN) formalism.
Referred to as deep Multimodal Robust Group Subspace Clustering (DRoGSuRe), this approach is compared against the independently developed state-of-the-art approach named Deep Multimodal Subspace Clustering (DMSC).
- Score: 16.85310886805588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we seek to exploit the deep structure of multi-modal data to
robustly capture the group subspace distribution of the information using the
Convolutional Neural Network (CNN) formalism. Upon unfolding the set of
subspaces constituting each data modality, and learning their corresponding
encoders, an optimized integration of the generated inherent information is
carried out to yield a characterization of various classes. Referred to as deep
Multimodal Robust Group Subspace Clustering (DRoGSuRe), this approach is
compared against the independently developed state-of-the-art approach named
Deep Multimodal Subspace Clustering (DMSC). Experiments on different multimodal
datasets show that our approach is competitive and more robust in the presence
of noise.
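To make the pipeline above concrete, here is a minimal PyTorch sketch of the usual ingredients behind a deep multimodal subspace method: one encoder per modality, a self-expressive coefficient matrix per modality, and a group (l2,1) penalty that ties the coefficients across modalities. This is an illustration of the general recipe, not the authors' DRoGSuRe code; the layer sizes, loss weights, and all identifiers (SelfExpressiveModality, multimodal_group_loss, lambda_group) are assumptions.

```python
import torch
import torch.nn as nn


class SelfExpressiveModality(nn.Module):
    """One modality: an encoder plus a self-expressive coefficient matrix C,
    so that latent features satisfy Z ~= C @ Z (each point is written as a
    combination of the other points lying in the same subspace)."""

    def __init__(self, n_samples: int, in_dim: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim)
        )
        self.C = nn.Parameter(1e-3 * torch.randn(n_samples, n_samples))

    def forward(self, x):
        z = self.encoder(x)                          # (N, latent_dim) latent features
        c = self.C - torch.diag(torch.diag(self.C))  # zero diagonal: no self-representation
        return z, c @ z, c


def multimodal_group_loss(outputs, lambda_se=1.0, lambda_group=0.1):
    """Sum of per-modality self-expression errors plus an l2,1 penalty on the
    stacked coefficient matrices, nudging the modalities toward a shared
    grouping of the points."""
    se = sum(((z - z_hat) ** 2).mean() for z, z_hat, _ in outputs)
    stacked = torch.cat([c for _, _, c in outputs], dim=0)  # (num_modalities * N, N)
    group = stacked.norm(dim=0).sum()                       # column-wise l2 norms
    return lambda_se * se + lambda_group * group
```

Once trained, a typical final step for self-expressive models is to form an affinity such as the average of |C_m| + |C_m|^T over modalities and feed it to spectral clustering to obtain the class labels.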
Related papers
- FedFusion: Manifold Driven Federated Learning for Multi-satellite and Multi-modality Fusion [30.909597853659506]
This paper proposes a manifold-driven multi-modality fusion framework, FedFusion, which randomly samples local data on each client to jointly estimate the prominent manifold structure of shallow features of each client.
Considering the physical space limitations of the satellite constellation, we developed a multimodal federated learning module designed specifically for manifold data in a deep latent space.
The proposed framework surpasses existing methods on three multimodal datasets, achieving an average classification accuracy of 94.35% while compressing communication costs by a factor of 4.
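For orientation only, the snippet below sketches the generic server-side aggregation step that any federated pipeline of this kind sits on top of (FedAvg-style weighted parameter averaging). It is not FedFusion's manifold-driven fusion or its communication-compression scheme, and all names are illustrative.

```python
import torch

def federated_average(client_states, client_sizes):
    """Weighted average of client model parameters.

    client_states: list of state_dicts returned by each client's local training
    client_sizes:  number of local samples per client (weights the average)
    """
    total = float(sum(client_sizes))
    averaged = {k: torch.zeros_like(v, dtype=torch.float32)
                for k, v in client_states[0].items()}
    for state, n in zip(client_states, client_sizes):
        for key, value in state.items():
            averaged[key] += value.float() * (n / total)
    return averaged
```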
arXiv Detail & Related papers (2023-11-16T03:29:19Z)
- Preserving Modality Structure Improves Multi-Modal Learning [64.10085674834252]
Self-supervised learning on large-scale multi-modal datasets allows learning semantically meaningful embeddings without relying on human annotations.
These methods often struggle to generalize well on out-of-domain data as they ignore the semantic structure present in modality-specific embeddings.
We propose a novel Semantic-Structure-Preserving Consistency approach to improve generalizability by preserving the modality-specific relationships in the joint embedding space.
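One plausible way to read "preserving modality-specific relationships in the joint embedding space" is a consistency loss that keeps pairwise similarity structure stable when modality-specific embeddings are projected into the joint space. The sketch below is an assumed mechanism along those lines, not the paper's exact objective; the function names and the cosine-similarity choice are illustrative.

```python
import torch
import torch.nn.functional as F

def pairwise_cosine(z: torch.Tensor) -> torch.Tensor:
    z = F.normalize(z, dim=-1)
    return z @ z.t()                                 # (B, B) similarity matrix

def structure_preserving_loss(modality_emb, joint_emb):
    """Penalize changes in the pairwise similarity structure when
    modality-specific embeddings are mapped into the joint space."""
    return F.mse_loss(pairwise_cosine(joint_emb), pairwise_cosine(modality_emb))
```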
arXiv Detail & Related papers (2023-08-24T20:46:48Z)
- Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications [90.6849884683226]
We study the challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data.
Using a precise information-theoretic definition of interactions, our key contribution is the derivation of lower and upper bounds.
We show how these theoretical results can be used to estimate multimodal model performance, guide data collection, and select appropriate multimodal models for various tasks.
arXiv Detail & Related papers (2023-06-07T15:44:53Z)
- Align and Attend: Multimodal Summarization with Dual Contrastive Losses [57.83012574678091]
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
Existing methods fail to leverage the temporal correspondence between different modalities and ignore the intrinsic correlation between different samples.
We introduce Align and Attend Multimodal Summarization (A2Summ), a unified multimodal transformer-based model that can effectively align and attend to the multimodal input.
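As a point of reference for the contrastive ingredient, the sketch below shows a generic symmetric InfoNCE loss between paired video and text features: matched pairs in a batch are positives, all other pairings are negatives. It is not A2Summ's dual-loss formulation; the temperature and names are arbitrary.

```python
import torch
import torch.nn.functional as F

def symmetric_contrastive(video_feats, text_feats, temperature=0.07):
    """InfoNCE in both directions over a batch of aligned (video, text) pairs."""
    v = F.normalize(video_feats, dim=-1)
    t = F.normalize(text_feats, dim=-1)
    logits = v @ t.t() / temperature                 # (B, B) similarity logits
    targets = torch.arange(v.size(0), device=v.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```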
arXiv Detail & Related papers (2023-03-13T17:01:42Z)
- Subspace-Contrastive Multi-View Clustering [0.0]
We propose a novel Subspace-Contrastive Multi-View Clustering (SCMC) approach.
We employ view-specific auto-encoders to map the original multi-view data into compact features that capture its nonlinear structures.
To demonstrate the effectiveness of the proposed model, we conduct extensive comparative experiments on eight challenging datasets.
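A view-specific auto-encoder of the kind described above can be as small as the sketch below (PyTorch; the layer widths and names are assumptions). The contrastive term between the latent codes of different views, which supplies the "subspace-contrastive" part of SCMC's objective, is not shown.

```python
import torch
import torch.nn as nn

class ViewAutoEncoder(nn.Module):
    """One encoder/decoder pair per view, so each view's nonlinear structure
    is compressed into its own compact latent space."""

    def __init__(self, in_dim: int, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, in_dim))

    def forward(self, x):
        z = self.encoder(x)              # compact view-specific features
        return z, self.decoder(z)        # latent code and reconstruction
```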
arXiv Detail & Related papers (2022-10-13T07:19:37Z)
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
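The summary does not say how the domain gap is quantified; a standard yardstick for comparing source and target feature distributions is the Maximum Mean Discrepancy, sketched below as a stand-in (CDMS itself relies on a multi-mutual learning strategy rather than this exact loss). The RBF bandwidth is an arbitrary choice.

```python
import torch

def rbf_kernel(a, b, bandwidth=1.0):
    # Pairwise squared Euclidean distances passed through an RBF kernel.
    return torch.exp(-torch.cdist(a, b) ** 2 / (2 * bandwidth ** 2))

def mmd(source_feats, target_feats, bandwidth=1.0):
    """Maximum Mean Discrepancy between source and target features:
    close to zero when the two feature distributions match."""
    k_ss = rbf_kernel(source_feats, source_feats, bandwidth).mean()
    k_tt = rbf_kernel(target_feats, target_feats, bandwidth).mean()
    k_st = rbf_kernel(source_feats, target_feats, bandwidth).mean()
    return k_ss + k_tt - 2.0 * k_st
```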
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- A Multiscale Environment for Learning by Diffusion [9.619814126465206]
We introduce the Multiscale Environment for Learning by Diffusion (MELD) data model.
We show that the MELD data model precisely captures latent multiscale structure in data and facilitates its analysis.
To efficiently learn the multiscale structure observed in many real datasets, we introduce the Multiscale Learning by Unsupervised Diffusion (M-LUND) clustering algorithm.
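The multiscale idea can be illustrated with a plain diffusion-maps construction: embed the data with the leading eigenvectors of a diffusion operator, scaled by their eigenvalues raised to several diffusion times. The sketch below is that generic construction, not the MELD or M-LUND code; the bandwidth, time scales, and number of components are assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist

def diffusion_embeddings(X, sigma=1.0, times=(1, 10, 100), n_components=4):
    """Return one diffusion-map embedding per diffusion time t.

    Larger t damps the higher-frequency eigen-directions, so the same
    eigenvectors expose coarser cluster structure as t grows.
    """
    W = np.exp(-cdist(X, X) ** 2 / (2 * sigma ** 2))     # RBF affinity
    P = W / W.sum(axis=1, keepdims=True)                 # row-stochastic transition matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)[1:n_components + 1]   # drop the trivial eigenvalue 1
    vals, vecs = vals.real[order], vecs.real[:, order]
    return {t: vecs * (vals ** t) for t in times}
```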
arXiv Detail & Related papers (2021-01-31T17:46:19Z)
- Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such an exchanging process is also guaranteed by sharing convolutional filters while keeping separate BN layers across modalities, which, as an added benefit, allows our multimodal architecture to be almost as compact as a unimodal network.
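Following the description above, a minimal sketch of the exchange rule looks like this: channels whose Batch-Norm scale has shrunk toward zero carry little modality-specific signal and are replaced by the other modality's corresponding channels. The threshold, tensor shapes, and the omitted sparsity constraint on the BN scales are assumptions.

```python
import torch

def exchange_channels(feat_a, feat_b, gamma_a, gamma_b, threshold=1e-2):
    """feat_*: (B, C, H, W) feature maps from two modality sub-networks.
    gamma_*: (C,) Batch-Norm scale parameters of the corresponding layers."""
    out_a, out_b = feat_a.clone(), feat_b.clone()
    weak_a = gamma_a.abs() < threshold      # channels modality A barely uses
    weak_b = gamma_b.abs() < threshold
    out_a[:, weak_a] = feat_b[:, weak_a]    # borrow B's version of A's weak channels
    out_b[:, weak_b] = feat_a[:, weak_b]
    return out_a, out_b
```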
arXiv Detail & Related papers (2020-11-10T09:53:20Z) - Robust Group Subspace Recovery: A New Approach for Multi-Modality Data
Fusion [18.202825916298437]
We propose a novel multi-modal data fusion approach based on group sparsity.
The proposed approach exploits the structural dependencies between the data of the different modalities to cluster the associated target objects.
The resulting union-of-subspaces (UoS) structure is employed to classify newly observed data points, highlighting the abstraction capacity of the proposed method.
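Group sparsity across modalities is commonly imposed with an l2,1 penalty on the stacked representation coefficients, whose proximal operator has the simple closed form below (NumPy; treating columns as the groups is an assumption about how the modalities are stacked, and this is a generic building block rather than the paper's full solver).

```python
import numpy as np

def prox_l21(C, tau):
    """Proximal operator of tau * ||C||_{2,1} with columns as groups:
    each column is shrunk toward zero by tau and dropped entirely when its
    norm falls below tau, which is what yields group-sparse solutions."""
    norms = np.linalg.norm(C, axis=0, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return C * scale
```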
arXiv Detail & Related papers (2020-06-18T16:31:31Z)
- Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry and Fusion [6.225190099424806]
Multi-modal or multi-view data has surged as a major stream of big data, where each modality/view encodes an individual property of data objects.
Most of the existing state-of-the-art methods focus on how to fuse the energy or information from multi-modal spaces to deliver superior performance.
Deep neural networks have emerged as a powerful architecture for capturing the nonlinear distribution of high-dimensional multimedia data.
arXiv Detail & Related papers (2020-06-15T06:42:04Z)
- Agglomerative Neural Networks for Multi-view Clustering [109.55325971050154]
We propose an agglomerative analysis to approximate the optimal consensus view.
We present Agglomerative Neural Network (ANN) based on Constrained Laplacian Rank to cluster multi-view data directly.
Our evaluations against several state-of-the-art multi-view clustering approaches on four popular datasets show the promising view-consensus analysis ability of ANN.
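For contrast with plain averaging, the sketch below builds a naive consensus by averaging per-view affinity matrices and clustering it spectrally; the actual ANN constructs the consensus with Constrained Laplacian Rank instead, so this is only a baseline-style illustration with assumed names.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def naive_consensus_clustering(affinities, n_clusters):
    """affinities: list of (N, N) symmetric per-view similarity matrices."""
    consensus = np.mean(np.stack(affinities), axis=0)   # crude consensus view
    model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed")
    return model.fit_predict(consensus)
```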
arXiv Detail & Related papers (2020-05-12T05:39:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.