UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery
- URL: http://arxiv.org/abs/2502.12453v1
- Date: Tue, 18 Feb 2025 02:36:03 GMT
- Title: UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery
- Authors: Ruifeng Li, Mingqian Li, Wei Liu, Yuhua Zhou, Xiangxin Zhou, Yuan Yao, Qiang Zhang, Hongyang Chen,
- Abstract summary: We introduce Universal Matching Networks (UniMatch), a dual matching framework that integrates explicit hierarchical molecular matching with implicit task-level matching.<n>Specifically, our approach captures structural features across multiple levels, such as atoms, substructures, and molecules, via hierarchical pooling and matching.<n>Our experimental results demonstrate that UniMatch outperforms state-of-the-art methods on the MoleculeNet and FS-Mol benchmarks.
- Score: 24.39705006290841
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Drug discovery is crucial for identifying candidate drugs for various diseases.However, its low success rate often results in a scarcity of annotations, posing a few-shot learning problem. Existing methods primarily focus on single-scale features, overlooking the hierarchical molecular structures that determine different molecular properties. To address these issues, we introduce Universal Matching Networks (UniMatch), a dual matching framework that integrates explicit hierarchical molecular matching with implicit task-level matching via meta-learning, bridging multi-level molecular representations and task-level generalization. Specifically, our approach explicitly captures structural features across multiple levels, such as atoms, substructures, and molecules, via hierarchical pooling and matching, facilitating precise molecular representation and comparison. Additionally, we employ a meta-learning strategy for implicit task-level matching, allowing the model to capture shared patterns across tasks and quickly adapt to new ones. This unified matching framework ensures effective molecular alignment while leveraging shared meta-knowledge for fast adaptation. Our experimental results demonstrate that UniMatch outperforms state-of-the-art methods on the MoleculeNet and FS-Mol benchmarks, achieving improvements of 2.87% in AUROC and 6.52% in delta AUPRC. UniMatch also shows excellent generalization ability on the Meta-MolNet benchmark.
Related papers
- Semantic-Aligned Learning with Collaborative Refinement for Unsupervised VI-ReID [82.12123628480371]
Unsupervised person re-identification (USL-VI-ReID) seeks to match pedestrian images of the same individual across different modalities without human annotations for model learning.
Previous methods unify pseudo-labels of cross-modality images through label association algorithms and then design contrastive learning framework for global feature learning.
We propose a Semantic-Aligned Learning with Collaborative Refinement (SALCR) framework, which builds up objective for specific fine-grained patterns emphasized by each modality.
arXiv Detail & Related papers (2025-04-27T13:58:12Z) - Graph-based Molecular In-context Learning Grounded on Morgan Fingerprints [28.262593876388397]
In-context learning (ICL) conditions large language models (LLMs) for molecular tasks, such as property prediction and molecule captioning, by embedding carefully selected demonstration examples into the input prompt.<n>However, current prompt retrieval methods for molecular tasks have relied on molecule feature similarity, such as Morgan fingerprints, which do not adequately capture the global molecular and atom-binding relationships.<n>We propose a self-supervised learning technique, GAMIC, which aligns global molecular structures, represented by graph neural networks (GNNs), with textual captions (descriptions) while leveraging local feature similarity through Morgan fingerprints.
arXiv Detail & Related papers (2025-02-08T02:46:33Z) - GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric delicately encapsulates two formats of diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z) - Adapting Differential Molecular Representation with Hierarchical Prompts for Multi-label Property Prediction [2.344198904343022]
HiPM stands for hierarchical prompted molecular representation learning framework.
Our framework comprises two core components: the Molecular Representation (MRE) and the Task-Aware Prompter (TAP)
arXiv Detail & Related papers (2024-05-29T03:10:21Z) - Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation [42.08917809689811]
Cross-modal representation learning has emerged as a promising direction for enhancing the quality of molecular representation.
We propose Atomas, a hierarchical molecular representation learning framework that jointly learns representations from SMILES strings and text.
Atomas achieves superior performance across 12 tasks on 11 datasets, outperforming 11 baseline models.
arXiv Detail & Related papers (2024-04-23T12:35:44Z) - CPCL: Cross-Modal Prototypical Contrastive Learning for Weakly
Supervised Text-based Person Re-Identification [10.64115914599574]
Weakly supervised text-based person re-identification (TPRe-ID) seeks to retrieve images of a target person using textual descriptions.
The primary challenge is the intra-class differences, encompassing intra-modal feature variations and cross-modal semantic gaps.
In practice, the CPCL introduces the CLIP model to weakly supervised TPRe-ID for the first time, mapping visual and textual instances into a shared latent space.
arXiv Detail & Related papers (2024-01-18T14:27:01Z) - UniMAP: Universal SMILES-Graph Representation Learning [21.25038529787392]
We propose a universal SMILE-graph representation learning model, namely UniMAP.
Four kinds of pre-training tasks are designed for UniMAP, including Multi-Level Cross-Modality Masking (CMM), SMILES-Graph Matching (SGM), Fragment-Level Alignment (FLA), and Domain Knowledge Learning (DKL)
Experimental results show that UniMAP outperforms current state-of-the-art pre-training methods.
arXiv Detail & Related papers (2023-10-22T07:48:33Z) - Learning Invariant Molecular Representation in Latent Discrete Space [52.13724532622099]
We propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shifts.
Our model achieves stronger generalization against state-of-the-art baselines in the presence of various distribution shifts.
arXiv Detail & Related papers (2023-10-22T04:06:44Z) - Universal Information Extraction as Unified Semantic Matching [54.19974454019611]
We decouple information extraction into two abilities, structuring and conceptualizing, which are shared by different tasks and schemas.
Based on this paradigm, we propose to universally model various IE tasks with Unified Semantic Matching framework.
In this way, USM can jointly encode schema and input text, uniformly extract substructures in parallel, and controllably decode target structures on demand.
arXiv Detail & Related papers (2023-01-09T11:51:31Z) - Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning across various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z) - Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
By meta-learning with task gradient (MLTI), our approach effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
arXiv Detail & Related papers (2021-06-04T20:15:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.