ProtoMol: Enhancing Molecular Property Prediction via Prototype-Guided Multimodal Learning
- URL: http://arxiv.org/abs/2510.16824v1
- Date: Sun, 19 Oct 2025 13:19:37 GMT
- Title: ProtoMol: Enhancing Molecular Property Prediction via Prototype-Guided Multimodal Learning
- Authors: Yingxu Wang, Kunyu Zhang, Jiaxin Huang, Nan Yin, Siwei Liu, Eran Segal
- Abstract summary: ProtoMol is a prototype-guided framework that enables fine-grained integration and consistent semantic alignment between modalities. ProtoMol consistently outperforms state-of-the-art baselines across a variety of molecular property prediction tasks.
- Score: 14.289447310645878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal molecular representation learning, which jointly models molecular graphs and their textual descriptions, enhances predictive accuracy and interpretability by enabling more robust and reliable predictions of drug toxicity, bioactivity, and physicochemical properties through the integration of structural and semantic information. However, existing multimodal methods suffer from two key limitations: (1) they typically perform cross-modal interaction only at the final encoder layer, thus overlooking hierarchical semantic dependencies; (2) they lack a unified prototype space for robust alignment between modalities. To address these limitations, we propose ProtoMol, a prototype-guided multimodal framework that enables fine-grained integration and consistent semantic alignment between molecular graphs and textual descriptions. ProtoMol incorporates dual-branch hierarchical encoders, utilizing Graph Neural Networks to process structured molecular graphs and Transformers to encode unstructured texts, resulting in comprehensive layer-wise representations. Then, ProtoMol introduces a layer-wise bidirectional cross-modal attention mechanism that progressively aligns semantic features across layers. Furthermore, a shared prototype space with learnable, class-specific anchors is constructed to guide both modalities toward coherent and discriminative representations. Extensive experiments on multiple benchmark datasets demonstrate that ProtoMol consistently outperforms state-of-the-art baselines across a variety of molecular property prediction tasks.
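The abstract describes two mechanisms: layer-wise bidirectional cross-modal attention between the graph and text branches, and a shared prototype space of learnable class-specific anchors. A minimal PyTorch sketch of both ideas follows; all class and variable names are illustrative assumptions, not the authors' code, and the toy dimensions stand in for real GNN/Transformer layer outputs.

```python
# Hypothetical sketch of ProtoMol's two key ideas (names are illustrative,
# not taken from the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BidirectionalCrossAttention(nn.Module):
    """One layer of bidirectional graph<->text cross-modal attention."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.g2t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2g = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, g, t):
        # Graph tokens attend to text tokens, and vice versa.
        g_new, _ = self.g2t(g, t, t)
        t_new, _ = self.t2g(t, g, g)
        return g + g_new, t + t_new  # residual fusion per layer

class PrototypeSpace(nn.Module):
    """Shared space of learnable, class-specific anchor vectors."""
    def __init__(self, num_classes: int, dim: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, dim))

    def logits(self, z):
        # Cosine similarity of a pooled embedding to each class anchor;
        # both modalities would be scored against the same anchors.
        z = F.normalize(z, dim=-1)
        p = F.normalize(self.prototypes, dim=-1)
        return z @ p.t()

# Toy forward pass: batch of 2, 5 graph tokens, 7 text tokens, dim 16.
torch.manual_seed(0)
g, t = torch.randn(2, 5, 16), torch.randn(2, 7, 16)
fused_g, fused_t = BidirectionalCrossAttention(16)(g, t)
proto = PrototypeSpace(num_classes=3, dim=16)
graph_logits = proto.logits(fused_g.mean(dim=1))  # shape (2, 3)
text_logits = proto.logits(fused_t.mean(dim=1))   # shape (2, 3)
```

In the full model this block would be applied at every encoder layer, so that alignment accumulates hierarchically rather than only at the final layer; a training loss would then pull both modalities' embeddings toward the anchor of their shared class.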
Related papers
- Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction [7.459632891054827]
Graph self-supervised learning (GSSL) has demonstrated strong potential for generating expressive graph embeddings without the need for human annotations. GraSPNet is a hierarchical self-supervised framework that explicitly models both atomic-level and fragment-level semantics. GraSPNet learns chemically meaningful representations and consistently outperforms state-of-the-art GSSL methods in transfer learning settings.
arXiv Detail & Related papers (2026-02-23T20:41:44Z)
- Learning Cell-Aware Hierarchical Multi-Modal Representations for Robust Molecular Modeling [74.25438319700929]
We propose CHMR (Cell-aware Hierarchical Multi-modal Representations), a robust framework that models local-global dependencies between molecules and cellular responses. Evaluated on nine public benchmarks spanning 728 tasks, CHMR outperforms state-of-the-art baselines. Results demonstrate the advantage of hierarchy-aware, multimodal learning for reliable and biologically grounded molecular representations.
arXiv Detail & Related papers (2025-11-26T07:15:00Z)
- Divide, Conquer and Unite: Hierarchical Style-Recalibrated Prototype Alignment for Federated Medical Image Segmentation [66.82598255715696]
Federated learning enables multiple medical institutions to train a global model without sharing data. Current approaches primarily focus on final-layer features, overlooking critical multi-level cues. We propose FedBCS to bridge feature representation gaps via domain-invariant contextual prototype alignment.
arXiv Detail & Related papers (2025-11-14T04:15:34Z)
- $\text{M}^{2}$LLM: Multi-view Molecular Representation Learning with Large Language Models [59.125833618091846]
We propose a multi-view framework that integrates three perspectives: the molecular structure view, the molecular task view, and the molecular rules view. Experiments demonstrate that $\text{M}^{2}$LLM achieves state-of-the-art performance on multiple benchmarks across classification and regression tasks.
arXiv Detail & Related papers (2025-08-12T05:46:47Z)
- Multi-Level Fusion Graph Neural Network for Molecule Property Prediction [8.629821238312621]
We propose a Multi-Level Fusion Graph Neural Network (MLFGNN) that integrates Graph Attention Networks and a novel Graph Transformer. Experiments on multiple benchmark datasets demonstrate that MLFGNN consistently outperforms state-of-the-art methods in both classification and regression tasks.
arXiv Detail & Related papers (2025-07-04T09:38:19Z)
- AdaptMol: Adaptive Fusion from Sequence String to Topological Structure for Few-shot Drug Discovery [7.338199946027998]
We present AdaptMol, a prototypical network integrating adaptive multimodal fusion for representation. This framework employs a dual-level attention mechanism to dynamically integrate global and local molecular features. Experiments on three commonly used benchmarks under 5-shot and 10-shot settings demonstrate that AdaptMol achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-05-17T07:12:12Z)
- MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [57.044719143401664]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention. Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z)
- Knowledge-aware contrastive heterogeneous molecular graph learning [77.94721384862699]
We propose Knowledge-aware Contrastive Heterogeneous Molecular graph Learning (KCHML), a paradigm shift in encoding molecular graphs. KCHML conceptualizes molecules through three distinct graph views - molecular, elemental, and pharmacological - enhanced by heterogeneous molecular graphs and a dual message-passing mechanism. This design offers a comprehensive representation for property prediction, as well as for downstream tasks such as drug-drug interaction (DDI) prediction.
arXiv Detail & Related papers (2025-02-17T11:53:58Z)
- Adapting Differential Molecular Representation with Hierarchical Prompts for Multi-label Property Prediction [2.344198904343022]
HiPM is a hierarchical prompted molecular representation learning framework.
The framework comprises two core components: the Molecular Representation Encoder (MRE) and the Task-Aware Prompter (TAP).
arXiv Detail & Related papers (2024-05-29T03:10:21Z)
- Contrastive Dual-Interaction Graph Neural Network for Molecular Property Prediction [0.0]
We introduce DIG-Mol, a novel self-supervised graph neural network framework for molecular property prediction.
DIG-Mol integrates a momentum distillation network with two interconnected networks to efficiently improve molecular characterization.
We have established DIG-Mol's state-of-the-art performance through extensive experimental evaluation in a variety of molecular property prediction tasks.
arXiv Detail & Related papers (2024-05-04T10:09:27Z)
- Multi-View Graph Neural Networks for Molecular Property Prediction [67.54644592806876]
We present Multi-View Graph Neural Network (MV-GNN), a multi-view message passing architecture.
In MV-GNN, we introduce a shared self-attentive readout component and disagreement loss to stabilize the training process.
We further boost the expressive power of MV-GNN by proposing a cross-dependent message passing scheme.
arXiv Detail & Related papers (2020-05-17T04:46:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.