Related papers: POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding

URL: http://arxiv.org/abs/2511.03464v1
Date: Wed, 05 Nov 2025 13:39:28 GMT
Title: POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding
Authors: Mihriban Kocak Balik, Pekka Marttinen, Negar Safinianaini
Abstract summary: We introduce POEMS: Product Of Experts for Interpretable Multiomics Integration using Sparse Decoding. POEMS provides interpretability without linearizing any part of the network, in part by mapping features to latent factors using sparse connections. In a cancer subtyping case study, POEMS achieves competitive clustering and classification performance.
Score: 10.520179127805187
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Integrating different molecular layers, i.e., multiomics data, is crucial for unraveling the complexity of diseases; yet, most deep generative models either prioritize predictive performance at the expense of interpretability or enforce interpretability by linearizing the decoder, thereby weakening the network's nonlinear expressiveness. To overcome this tradeoff, we introduce POEMS: Product Of Experts for Interpretable Multiomics Integration using Sparse Decoding, an unsupervised probabilistic framework that preserves predictive performance while providing interpretability. POEMS provides interpretability without linearizing any part of the network by 1) mapping features to latent factors using sparse connections, which directly translates to biomarker discovery, 2) allowing for cross-omic associations through a shared latent space using a product of experts model, and 3) reporting the contribution of each omic through a gating network that adaptively computes its influence on the representation learning. Additionally, we present an efficient sparse decoder. In a cancer subtyping case study, POEMS achieves competitive clustering and classification performance while offering our novel set of interpretations, demonstrating that biomarker-based insight and predictive accuracy can coexist in multiomics representation learning.
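
As a rough illustration of the "product of experts" idea mentioned in the abstract, the sketch below fuses per-omic diagonal Gaussian posteriors into one shared latent distribution by adding precisions. This is the generic Gaussian product-of-experts rule, not necessarily the exact formulation used in POEMS; the function name product_of_gaussian_experts, the array shapes, and the example numbers are assumptions made for illustration, and no prior expert or gating weights are included.

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Fuse diagonal Gaussian experts N(mu_m, var_m) into a single Gaussian.

    means, variances: array-likes of shape (n_experts, latent_dim).
    Hypothetical helper: the generic rule, not the POEMS implementation.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances                  # per-expert precision
    fused_var = 1.0 / precisions.sum(axis=0)      # precisions add under a product
    fused_mean = fused_var * (precisions * means).sum(axis=0)  # precision-weighted mean
    return fused_mean, fused_var

# Two hypothetical omic experts over a 2-dimensional latent space.
mu, var = product_of_gaussian_experts(
    means=[[0.0, 2.0], [2.0, 2.0]],
    variances=[[1.0, 1.0], [1.0, 4.0]],
)
print(mu, var)
```

In this rule an expert with lower variance (higher confidence) pulls the fused mean more strongly toward its own estimate, which is one common reason for choosing a product of experts to build a shared latent space across modalities.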

Related papers

PIME: Prototype-based Interpretable MCTS-Enhanced Brain Network Analysis for Disorder Diagnosis [29.231312772459237]
PIME is an interpretable framework that bridges intrinsic interpretability with minimal-sufficient subgraph optimization. Experiments on three benchmark fMRI datasets demonstrate that PIME achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-02-24T16:04:52Z)
Vision-Language Semantic Aggregation Leveraging Foundation Model for Generalizable Medical Image Segmentation [5.597576681565333]
We propose an Expectation-Maximization (EM) Aggregation mechanism and a Text-Guided Pixel Decoder. The latter is designed to bridge the semantic gap by leveraging domain-invariant textual knowledge to effectively guide deep visual representations. Our method consistently outperforms existing SOTA approaches across multiple domain generalization benchmarks.
arXiv Detail & Related papers (2025-09-10T13:16:30Z)
Multimodal Behavioral Patterns Analysis with Eye-Tracking and LLM-Based Reasoning [12.054910727620154]
Eye-tracking data reveals valuable insights into users' cognitive states but is difficult to analyze due to its structured, non-linguistic nature. This paper presents a multimodal human-AI collaborative framework designed to enhance cognitive pattern extraction from eye-tracking signals.
arXiv Detail & Related papers (2025-07-24T09:49:53Z)
METER: Multi-modal Evidence-based Thinking and Explainable Reasoning -- Algorithm and Benchmark [48.78602579128459]
We introduce METER, a unified benchmark for interpretable forgery detection spanning images, videos, audio, and audio-visual content. Our dataset comprises four tracks, each requiring not only real-vs-fake classification but also evidence-chain-based explanations.
arXiv Detail & Related papers (2025-07-22T03:42:51Z)
Spatial-Temporal-Spectral Unified Modeling for Remote Sensing Dense Prediction [20.1863553357121]
Current deep learning architectures for remote sensing are fundamentally rigid. We introduce the Spatial-Temporal-Spectral Unified Network (STSUN) for unified modeling. STSUN can adapt to input and output data with arbitrary spatial sizes, temporal lengths, and spectral bands. It unifies various dense prediction tasks and diverse semantic class predictions.
arXiv Detail & Related papers (2025-05-18T07:39:17Z)
Hallucination Detection in LLMs with Topological Divergence on Attention Graphs [60.83579255387347]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models. We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis [15.10417643788382]
In this paper, a deep-learning based model, named UMPSNet, is proposed. UMPSNet integrates four types of important meta data (demographic information, cancer type information, treatment protocols, and diagnosis results) into text templates, and then introduces a text encoder to extract textual features. By incorporating the multi-modality of patient data and joint training, UMPSNet outperforms all SOTA approaches.
arXiv Detail & Related papers (2025-01-13T02:29:42Z)
Intrinsic User-Centric Interpretability through Global Mixture of Experts [31.738009841932374]
InterpretCC is a family of intrinsically interpretable neural networks that optimize for ease of human understanding and explanation faithfulness. We show that InterpretCC explanations have higher actionability and usefulness than those of other intrinsically interpretable approaches.
arXiv Detail & Related papers (2024-02-05T11:55:50Z)
Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation [62.96628432641806]
Scene Graph Generation aims to first encode the visual contents within the given image and then parse them into a compact summary graph. We first present a novel Stacked Hybrid-Attention network, which facilitates the intra-modal refinement as well as the inter-modal interaction. We then devise an innovative Group Collaborative Learning strategy to optimize the decoder.
arXiv Detail & Related papers (2022-03-18T09:14:13Z)
Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning. It aims to extract both the common information and the complementary information in an adversarial setting. In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers. We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning. VAEs tend to ignore latent variables with a strong auto-regressive decoder. We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.