CustOmics: A versatile deep-learning based strategy for multi-omics
integration
- URL: http://arxiv.org/abs/2209.05485v1
- Date: Mon, 12 Sep 2022 14:20:29 GMT
- Authors: Hakim Benkirane, Yoann Pradat, Stefan Michiels, Paul-Henry Cournède
- Abstract summary: This paper presents a novel strategy to build a customizable autoencoder model that adapts to the dataset used in the case of high-dimensional multi-source integration.
We will assess the impact of integration strategies on the latent representation and combine the best strategies to propose a new method, CustOmics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in high-throughput sequencing technologies have enabled the
extraction of multiple features that depict patient samples at diverse and
complementary molecular levels. The generation of such data has led to new
challenges in computational biology regarding the integration of
high-dimensional and heterogeneous datasets that capture the interrelationships
between multiple genes and their functions. Thanks to their versatility and
ability to learn synthetic latent representations of complex data, deep
learning methods offer promising perspectives for integrating multi-omics data.
These methods have led to the conception of many original architectures that
are primarily based on autoencoder models. However, due to the difficulty of
the task, the integration strategy is fundamental to take full advantage of the
sources' particularities without losing the global trends. This paper presents
a novel strategy to build a customizable autoencoder model that adapts to the
dataset used in the case of high-dimensional multi-source integration. We will
assess the impact of integration strategies on the latent representation and
combine the best strategies to propose a new method, CustOmics
(https://github.com/HakimBenkirane/CustOmics). We focus here on the integration
of data from multiple omics sources and demonstrate the performance of the
proposed method on test cases for several tasks such as classification and
survival analysis.
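To make the idea of multi-source integration with autoencoders more concrete, here is a minimal sketch of one possible scheme: each omics source gets its own encoder, the per-source codes are concatenated, and a shared encoder compresses them into a joint latent representation. This is not the CustOmics architecture; the source names, layer sizes, and the helper functions `dense`, `forward`, and `integrate` are all hypothetical, and the weights are random and untrained, so the code only illustrates the data flow and the resulting shapes.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(in_dim, out_dim):
    """Return randomly initialised (weights, bias) for one dense layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def forward(x, layer):
    """Apply one dense layer followed by a tanh non-linearity."""
    weights, bias = layer
    return np.tanh(x @ weights + bias)


# Hypothetical toy data: 8 patients, three omics sources of different widths.
sources = {
    "rna": rng.normal(size=(8, 50)),
    "cnv": rng.normal(size=(8, 30)),
    "methylation": rng.normal(size=(8, 40)),
}

source_latent_dim = 6  # assumed size of each per-source code
joint_latent_dim = 4   # assumed size of the shared representation

# One encoder per source, so each source is compressed separately first.
source_encoders = {
    name: dense(x.shape[1], source_latent_dim) for name, x in sources.items()
}

# A shared encoder/decoder pair operating on the concatenated codes.
concat_dim = source_latent_dim * len(sources)
joint_encoder = dense(concat_dim, joint_latent_dim)
joint_decoder = dense(joint_latent_dim, concat_dim)


def integrate(sources):
    """Encode each source, concatenate the codes, then encode them jointly."""
    codes = [forward(x, source_encoders[name]) for name, x in sources.items()]
    stacked = np.concatenate(codes, axis=1)
    joint = forward(stacked, joint_encoder)
    reconstructed = forward(joint, joint_decoder)
    return stacked, joint, reconstructed


stacked, joint, reconstructed = integrate(sources)

# Mean squared error between the concatenated codes and their reconstruction;
# a training loop (omitted here) would minimise a loss of this kind.
reconstruction_error = float(np.mean((stacked - reconstructed) ** 2))
print(joint.shape, reconstructed.shape)
```

In a sketch like this, the joint representation `joint` is what a downstream classifier or survival model would consume; the choice of where to merge the sources (early, late, or at an intermediate latent level as here) is the kind of integration strategy the abstract refers to.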
Related papers
- Supervised Multi-Modal Fission Learning [19.396207029419813]
Learning from multimodal datasets can leverage complementary information and improve performance in prediction tasks.
We propose a Multi-Modal Fission Learning model that simultaneously identifies globally joint, partially joint, and individual components.
(2024-09-30T17:58:03Z)
- Supervised Multiple Kernel Learning approaches for multi-omics data integration [1.3032276477872158]
Multiple kernel learning (MKL) has been shown to be a flexible and valid approach to consider the diverse nature of multi-omics inputs.
We provide novel MKL approaches based on different kernel fusion strategies.
Results show that MKL-based models can compete with more complex, state-of-the-art, supervised multi-omics integrative approaches.
(2024-03-27T08:48:16Z)
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture.
We conduct multimodal survival analysis on Whole Slide Images and Multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
(2023-11-15T17:06:26Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
(2023-08-28T18:48:34Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
(2023-05-25T16:29:16Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
(2022-10-07T17:56:53Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
(2022-03-03T05:58:49Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
(2022-02-07T13:08:11Z)
- Smart(Sampling)Augment: Optimal and Efficient Data Augmentation for Semantic Segmentation [68.8204255655161]
We provide the first study on semantic image segmentation and introduce two new approaches: SmartAugment and SmartSamplingAugment.
SmartAugment uses Bayesian Optimization to search over a rich space of augmentation strategies and achieves a new state-of-the-art performance in all semantic segmentation tasks we consider.
SmartSamplingAugment, a simple parameter-free approach with a fixed augmentation strategy, competes in performance with the existing resource-intensive approaches and outperforms cheap state-of-the-art data augmentation methods.
(2021-10-31T13:04:45Z)
- Handling Data Heterogeneity with Generative Replay in Collaborative Learning for Medical Imaging [21.53220262343254]
We present a novel generative replay strategy to address the challenge of data heterogeneity in collaborative learning methods.
A primary model learns the desired task, and an auxiliary "generative replay model" either synthesizes images that closely resemble the input images or helps extract latent variables.
The generative replay strategy is flexible to use: it can either be incorporated into existing collaborative learning methods to improve their capability of handling data heterogeneity across institutions, or be used as a novel and individual collaborative learning framework (termed FedReplay) to reduce communication cost.
(2021-06-24T17:39:55Z)
- Siloed Federated Learning for Multi-Centric Histopathology Datasets [0.17842332554022694]
This paper proposes a novel federated learning approach for deep learning architectures in the medical domain.
Local-statistic batch normalization (BN) layers are introduced, resulting in collaboratively-trained, yet center-specific models.
We benchmark the proposed method on the classification of tumorous histopathology image patches extracted from the Camelyon16 and Camelyon17 datasets.
(2020-08-17T15:49:30Z)