Information Fusion: Scaling Subspace-Driven Approaches
- URL: http://arxiv.org/abs/2204.12035v1
- Date: Tue, 26 Apr 2022 02:16:01 GMT
- Title: Information Fusion: Scaling Subspace-Driven Approaches
- Authors: Sally Ghanem, and Hamid Krim
- Abstract summary: We seek to exploit the deep structure of multi-modal data to robustly characterize the group subspace distribution of the information using the Convolutional Neural Network (CNN) formalism.
Referred to as deep Multimodal Robust Group Subspace Clustering (DRoGSuRe), this approach is compared against the independently developed state-of-the-art approach named Deep Multimodal Subspace Clustering (DMSC).
- Score: 16.85310886805588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we seek to exploit the deep structure of multi-modal data to robustly characterize the group subspace distribution of the information using the Convolutional Neural Network (CNN) formalism. Upon unfolding the set of subspaces constituting each data modality and learning their corresponding encoders, an optimized integration of the generated inherent information is carried out to yield a characterization of the various classes. Referred to as deep Multimodal Robust Group Subspace Clustering (DRoGSuRe), this approach is compared against the independently developed state-of-the-art approach named Deep Multimodal Subspace Clustering (DMSC). Experiments on different multimodal datasets show that our approach is competitive and more robust in the presence of noise.
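The abstract describes the pipeline only at a high level. As a rough sketch of the generic deep multimodal subspace-clustering pattern it builds on (per-modality features, self-expressive coding, fused affinities, spectral clustering), the following uses a closed-form ridge surrogate in place of the learned encoders and self-expressive layer; the averaging fusion rule and the sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: per-modality self-expressive coding, then affinity fusion.
# Encoders are omitted; each modality's features are treated as given.
import numpy as np
from sklearn.cluster import SpectralClustering

def self_expressive_coefficients(X, lam=0.1):
    """Solve min_C ||X - XC||_F^2 + lam ||C||_F^2 (ridge surrogate).
    X: (d, n) feature matrix with one column per sample."""
    n = X.shape[1]
    G = X.T @ X
    C = np.linalg.solve(G + lam * np.eye(n), G)
    np.fill_diagonal(C, 0.0)          # discourage trivial self-representation
    return C

def fuse_and_cluster(modalities, n_clusters, lam=0.1):
    # Average the per-modality affinities (a simple illustrative fusion rule).
    A = sum(np.abs(C) + np.abs(C.T)
            for C in (self_expressive_coefficients(X, lam) for X in modalities))
    A /= len(modalities)
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(A)

rng = np.random.default_rng(0)
mods = [rng.normal(size=(32, 60)), rng.normal(size=(16, 60))]  # two modalities
labels = fuse_and_cluster(mods, n_clusters=3)
```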
Related papers
- Deep Modularity Networks with Diversity-Preserving Regularization [4.659251704980846]
We propose Deep Modularity Networks with Diversity-Preserving Regularization (DMoN-DPR), which introduces three novel regularization terms: distance-based for inter-cluster separation, variance-based for intra-cluster diversity, and entropy-based for balanced assignments.
Our method enhances clustering performance on benchmark datasets, achieving significant improvements in Normalized Mutual Information (NMI) and F1 scores.
These results demonstrate the effectiveness of incorporating diversity-preserving regularizations in creating meaningful and interpretable clusters, especially in feature-rich datasets.
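The abstract names the three regularizer families without formulas; below is a hedged numpy sketch of plausible instantiations (centroid separation, soft within-cluster variance, assignment-balance entropy). The exact DMoN-DPR terms may differ.

```python
# Hedged sketch of the three regularizer families named in the abstract.
import numpy as np

def regularizers(X, P):
    """X: (n, d) embeddings; P: (n, k) soft cluster assignments (rows sum to 1)."""
    k = P.shape[1]
    sizes = P.sum(axis=0) + 1e-9                      # soft cluster sizes
    centroids = (P.T @ X) / sizes[:, None]            # (k, d)

    # Distance-based: mean pairwise centroid distance (larger = better separated).
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1))
    inter_sep = dists[~np.eye(k, dtype=bool)].mean()

    # Variance-based: soft within-cluster spread, rewarding intra-cluster diversity.
    sq_dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (n, k)
    intra_var = (P * sq_dist).sum() / P.sum()

    # Entropy-based: entropy of the cluster-size distribution (max = balanced).
    q = sizes / sizes.sum()
    balance = -(q * np.log(q)).sum()
    return inter_sep, intra_var, balance

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
P = rng.dirichlet(np.ones(4), size=100)               # random soft assignments
print(regularizers(X, P))
```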
arXiv Detail & Related papers (2025-01-23T08:05:59Z) - Structure-guided Deep Multi-View Clustering [13.593229506936682]
Deep multi-view clustering seeks to utilize the abundant information from multiple views to improve clustering performance.
Most existing clustering methods neglect to fully mine multi-view structural information.
We propose a structure-guided deep multi-view clustering model to explore the distribution of multi-view data.
arXiv Detail & Related papers (2025-01-17T12:42:30Z) - Preserving Modality Structure Improves Multi-Modal Learning [64.10085674834252]
Self-supervised learning on large-scale multi-modal datasets allows learning semantically meaningful embeddings without relying on human annotations.
These methods often struggle to generalize well on out-of-domain data as they ignore the semantic structure present in modality-specific embeddings.
We propose a novel Semantic-Structure-Preserving Consistency approach to improve generalizability by preserving the modality-specific relationships in the joint embedding space.
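As an illustration of the stated idea, one simple way to preserve modality-specific relationships is to match the pairwise similarity structure of each modality's own embeddings against that of the joint embeddings; this is an assumption about the general mechanism, not the paper's exact loss.

```python
# Sketch: penalize divergence between joint-space and modality-specific
# pairwise similarity matrices for the same batch of samples.
import numpy as np

def cosine_sim_matrix(Z):
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-9)
    return Zn @ Zn.T

def structure_consistency_loss(joint, specific):
    """joint, specific: (n, d_*) embeddings of the same n samples."""
    return ((cosine_sim_matrix(joint) - cosine_sim_matrix(specific)) ** 2).mean()

rng = np.random.default_rng(2)
joint_emb = rng.normal(size=(32, 64))     # batch in the joint embedding space
video_emb = rng.normal(size=(32, 128))    # same batch, video-specific space
print(structure_consistency_loss(joint_emb, video_emb))
```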
arXiv Detail & Related papers (2023-08-24T20:46:48Z) - Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications [90.6849884683226]
We study the challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data.
Using a precise information-theoretic definition of interactions, our key contribution is the derivation of lower and upper bounds on the amount of multimodal interaction.
We show how these theoretical results can be used to estimate multimodal model performance, guide data collection, and select appropriate multimodal models for various tasks.
arXiv Detail & Related papers (2023-06-07T15:44:53Z) - Align and Attend: Multimodal Summarization with Dual Contrastive Losses [57.83012574678091]
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
Existing methods fail to leverage the temporal correspondence between different modalities and ignore the intrinsic correlation between different samples.
We introduce Align and Attend Multimodal Summarization (A2Summ), a unified multimodal transformer-based model which can effectively align and attend the multimodal input.
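A symmetric InfoNCE loss over temporally aligned clip/sentence pairs is one plausible form of such contrastive alignment; A2Summ's actual inter-sample and intra-sample losses are defined in the paper, so treat this as a generic sketch.

```python
# Hedged sketch: symmetric InfoNCE over n aligned video/text feature pairs.
import numpy as np

def info_nce(video, text, tau=0.07):
    """video, text: (n, d) features for n temporally aligned pairs."""
    v = video / np.linalg.norm(video, axis=1, keepdims=True)
    t = text / np.linalg.norm(text, axis=1, keepdims=True)
    logits = (v @ t.T) / tau                       # (n, n) similarity matrix
    log_p_v = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_t = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    # Each clip should match its own sentence, and vice versa.
    return -0.5 * (np.diag(log_p_v).mean() + np.diag(log_p_t).mean())

rng = np.random.default_rng(3)
print(info_nce(rng.normal(size=(16, 32)), rng.normal(size=(16, 32))))
```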
arXiv Detail & Related papers (2023-03-13T17:01:42Z) - Subspace-Contrastive Multi-View Clustering [0.0]
We propose a novel Subspace-Contrastive Multi-View Clustering (SCMC) approach.
We employ view-specific auto-encoders to map the original multi-view data into compact features that capture its nonlinear structures.
To demonstrate the effectiveness of the proposed model, we conduct extensive comparative experiments on eight challenging datasets.
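As a toy stand-in for the view-specific auto-encoders, the sketch below compresses each view with a closed-form linear auto-encoder (via SVD); the paper's nonlinear encoders and contrastive subspace objective are omitted.

```python
# Sketch: a linear "auto-encoder" per view (tied encoder/decoder from SVD).
import numpy as np

def linear_autoencode(X, code_dim):
    """X: (n, d). Returns codes (n, code_dim) and mean reconstruction error."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:code_dim].T                 # encoder/decoder weights (tied)
    codes = Xc @ W
    err = np.linalg.norm(Xc - codes @ W.T) ** 2 / X.shape[0]
    return codes, err

rng = np.random.default_rng(4)
views = [rng.normal(size=(200, 50)), rng.normal(size=(200, 30))]
for i, (codes, err) in enumerate(linear_autoencode(V, 10) for V in views):
    print(f"view {i}: code shape {codes.shape}, reconstruction error {err:.3f}")
```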
arXiv Detail & Related papers (2022-10-13T07:19:37Z) - Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
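The abstract does not spell out the mutual-learning objective; a standard way to measure and shrink a source/target domain gap is maximum mean discrepancy (MMD), shown here purely as an illustrative stand-in for that ingredient, not as CDMS's actual strategy.

```python
# Illustrative stand-in: RBF-kernel MMD^2 between source and target features.
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased MMD^2 estimate between samples X (n, d) and Y (m, d)."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(5)
source = rng.normal(0.0, 1.0, size=(100, 16))
target = rng.normal(0.5, 1.0, size=(100, 16))   # shifted domain
print(rbf_mmd2(source, target))                  # larger gap -> larger MMD^2
```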
arXiv Detail & Related papers (2022-02-10T06:23:56Z) - A Multiscale Environment for Learning by Diffusion [9.619814126465206]
We introduce the Multiscale Environment for Learning by Diffusion (MELD) data model.
We show that the MELD data model precisely captures latent multiscale structure in data and facilitates its analysis.
To efficiently learn the multiscale structure observed in many real datasets, we introduce the Multiscale Learning by Unsupervised Diffusion (M-LUND) clustering algorithm.
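The core diffusion machinery can be sketched as a row-stochastic transition matrix raised to several powers (time scales), with diffusion distances compared across scales; M-LUND's density-based mode detection and cluster extraction are omitted here.

```python
# Sketch: multiscale diffusion states on a Gaussian-kernel graph.
import numpy as np

def diffusion_states(X, times=(1, 4, 16), sigma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))            # Gaussian kernel affinities
    P = W / W.sum(axis=1, keepdims=True)          # row-stochastic transitions
    return {t: np.linalg.matrix_power(P, t) for t in times}

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
for t, Pt in diffusion_states(X).items():
    # Diffusion distance between point 0 (cluster A) and point 59 (cluster B):
    d = np.linalg.norm(Pt[0] - Pt[59])
    print(f"t={t:2d}  cross-cluster diffusion distance {d:.4f}")
```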
arXiv Detail & Related papers (2021-01-31T17:46:19Z) - Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such an exchanging process is also guaranteed by sharing convolutional filters while keeping separate BN layers across modalities, which, as an added benefit, allows our multimodal architecture to be almost as compact as a unimodal network.
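A minimal sketch of that exchanging rule: channels whose BN scaling factor (gamma) falls below a threshold are deemed uninformative and replaced by the other modality's corresponding channels. The gammas are mocked below; in the paper they come from each sub-network's separate BN layers.

```python
# Sketch of BN-guided channel exchanging between two modality branches.
import numpy as np

def exchange_channels(feat_a, feat_b, gamma_a, gamma_b, thresh=0.02):
    """feat_*: (batch, channels, H, W) features; gamma_*: (channels,) BN scales."""
    out_a, out_b = feat_a.copy(), feat_b.copy()
    swap_a = gamma_a < thresh          # A's near-dead channels take B's signal
    swap_b = gamma_b < thresh
    out_a[:, swap_a] = feat_b[:, swap_a]
    out_b[:, swap_b] = feat_a[:, swap_b]
    return out_a, out_b

rng = np.random.default_rng(7)
rgb = rng.normal(size=(4, 8, 16, 16))     # e.g. RGB branch features
depth = rng.normal(size=(4, 8, 16, 16))   # e.g. depth branch features
g_rgb, g_depth = rng.uniform(0, 0.1, 8), rng.uniform(0, 0.1, 8)  # mocked gammas
rgb2, depth2 = exchange_channels(rgb, depth, g_rgb, g_depth)
```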
arXiv Detail & Related papers (2020-11-10T09:53:20Z) - Robust Group Subspace Recovery: A New Approach for Multi-Modality Data Fusion [18.202825916298437]
We propose a novel multi-modal data fusion approach based on group sparsity.
The proposed approach exploits the structural dependencies between the different modalities' data to cluster the associated target objects.
The resulting UoS structure is employed to classify newly observed data points, highlighting the abstraction capacity of the proposed method.
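A hedged sketch of the generic group-sparsity ingredient: a few proximal-gradient steps for a row-group-sparse self-representation, which exposes union-of-subspaces (UoS) structure. The paper's multi-modal coupling of the sparsity groups is not reproduced here.

```python
# Sketch: proximal gradient for min_C ||X - XC||_F^2 + lam * sum_i ||C_i,:||_2.
import numpy as np

def group_sparse_self_rep(X, lam=0.5, steps=200):
    n = X.shape[1]
    C = np.zeros((n, n))
    L = np.linalg.norm(X.T @ X, 2)            # step-size scale from spectral norm
    for _ in range(steps):
        G = X.T @ (X @ C - X)                 # (half) gradient of the fit term
        Z = C - G / L
        norms = np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12
        C = np.maximum(1 - lam / (L * norms), 0) * Z   # row-wise group shrinkage
        np.fill_diagonal(C, 0.0)              # forbid trivial self-representation
    return C

rng = np.random.default_rng(8)
X = rng.normal(size=(20, 40))
C = group_sparse_self_rep(X)
print("active rows:", int((np.abs(C).sum(axis=1) > 1e-6).sum()))
```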
arXiv Detail & Related papers (2020-06-18T16:31:31Z) - Agglomerative Neural Networks for Multi-view Clustering [109.55325971050154]
We propose an agglomerative analysis to approximate the optimal consensus view.
We present Agglomerative Neural Network (ANN) based on Constrained Laplacian Rank to cluster multi-view data directly.
Our evaluations against several state-of-the-art multi-view clustering approaches on four popular datasets demonstrate ANN's promising ability in view-consensus analysis.
arXiv Detail & Related papers (2020-05-12T05:39:10Z)