Related papers: Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

URL: http://arxiv.org/abs/2405.15544v1
Date: Fri, 24 May 2024 13:31:19 GMT
Title: Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction
Authors: Zeyu Wang, Tianyi Jiang, Yao Lu, Xiaoze Bao, Shanqing Yu, Bin Wei, Qi Xuan,
Abstract summary: This paper proposes a novel meta-learning FSMPP framework (KRGTS) KRGTS comprises the Knowledge-enhanced Relation Graph module and the Task Sampling module. Empirically, extensive experiments on five datasets demonstrate the superiority of KRGTS over a variety of state-of-the-art methods.
Score: 7.302312984575165
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, few-shot molecular property prediction (FSMPP) has garnered increasing attention. Despite impressive breakthroughs achieved by existing methods, they often overlook the inherent many-to-many relationships between molecules and properties, which limits their performance. For instance, similar substructures of molecules can inspire the exploration of new compounds. Additionally, the relationships between properties can be quantified, with high-related properties providing more information in exploring the target property than those low-related. To this end, this paper proposes a novel meta-learning FSMPP framework (KRGTS), which comprises the Knowledge-enhanced Relation Graph module and the Task Sampling module. The knowledge-enhanced relation graph module constructs the molecule-property multi-relation graph (MPMRG) to capture the many-to-many relationships between molecules and properties. The task sampling module includes a meta-training task sampler and an auxiliary task sampler, responsible for scheduling the meta-training process and sampling high-related auxiliary tasks, respectively, thereby achieving efficient meta-knowledge learning and reducing noise introduction. Empirically, extensive experiments on five datasets demonstrate the superiority of KRGTS over a variety of state-of-the-art methods. The code is available in https://github.com/Vencent-Won/KRGTS-public.

Related papers

Graph-based Molecular In-context Learning Grounded on Morgan Fingerprints [28.262593876388397]
In-context learning (ICL) conditions large language models (LLMs) for molecular tasks, such as property prediction and molecule captioning, by embedding carefully selected demonstration examples into the input prompt. However, current prompt retrieval methods for molecular tasks have relied on molecule feature similarity, such as Morgan fingerprints, which do not adequately capture the global molecular and atom-binding relationships. We propose a self-supervised learning technique, GAMIC, which aligns global molecular structures, represented by graph neural networks (GNNs), with textual captions (descriptions) while leveraging local feature similarity through Morgan fingerprints.
arXiv Detail & Related papers (2025-02-08T02:46:33Z)
FARM: Functional Group-Aware Representations for Small Molecules [55.281754551202326]
We introduce Functional Group-Aware Representations for Small Molecules (FARM) FARM is a foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs. We rigorously evaluate FARM on the MoleculeNet dataset, where it achieves state-of-the-art performance on 10 out of 12 tasks.
arXiv Detail & Related papers (2024-10-02T23:04:58Z)
Adapting Differential Molecular Representation with Hierarchical Prompts for Multi-label Property Prediction [2.344198904343022]
HiPM stands for hierarchical prompted molecular representation learning framework. Our framework comprises two core components: the Molecular Representation (MRE) and the Task-Aware Prompter (TAP)
arXiv Detail & Related papers (2024-05-29T03:10:21Z)
Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method. HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution. Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
arXiv Detail & Related papers (2024-05-05T08:35:23Z)
Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [49.64512917330373]
We introduce a multi-constraint molecular generation large language model, TSMMG, akin to a student. To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from these 'teachers' We experimentally show that TSMMG remarkably performs in generating molecules meeting complex, natural language-described property requirements.
arXiv Detail & Related papers (2024-03-20T02:15:55Z)
MultiModal-Learning for Predicting Molecular Properties: A Framework Based on Image and Graph Structures [2.5563339057415218]
MolIG is a novel MultiModaL molecular pre-training framework for predicting molecular properties based on Image and Graph structures. It amalgamates the strengths of both molecular representation forms. It exhibits enhanced performance in downstream tasks pertaining to molecular property prediction within benchmark groups.
arXiv Detail & Related papers (2023-11-28T10:28:35Z)
Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules. By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures. When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction. Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations. On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
t-SMILES: A Scalable Fragment-based Molecular Representation Framework for De Novo Molecule Generation [9.116670221263753]
This study introduces a flexible, fragment-based, multiscale molecular representation framework called t-SMILES. It describes molecules using SMILES-type strings obtained by performing a breadth-first search on a full binary tree formed from a fragmented molecular graph. It significantly outperforms classical SMILES, DeepSMILES, SELFIES and baseline models in goal-directed tasks.
arXiv Detail & Related papers (2023-01-04T21:41:01Z)
Tyger: Task-Type-Generic Active Learning for Molecular Property Prediction [121.97742787439546]
How to accurately predict the properties of molecules is an essential problem in AI-driven drug discovery. To reduce annotation cost, deep Active Learning methods are developed to select only the most representative and informative data for annotating. We propose a Task-type-generic active learning framework (termed Tyger) that is able to handle different types of learning tasks in a unified manner.
arXiv Detail & Related papers (2022-05-23T12:56:12Z)
Structured Multi-task Learning for Molecular Property Prediction [30.77287550003828]
We study multi-task learning for molecular property prediction in a novel setting, where a relation graph between tasks is available. In the emphlatent space, we model the task representations by applying a state graph neural network (SGNN) on the relation graph. In the emphoutput space, we employ structured prediction with the energy-based model (EBM), which can be efficiently trained through noise-contrastive estimation (NCE) approach.
arXiv Detail & Related papers (2022-02-22T20:31:23Z)
Few-Shot Graph Learning for Molecular Property Prediction [46.60746023179724]
We propose Meta-MGNN, a novel model for few-shot molecular property prediction. To exploit unlabeled molecular information, Meta-MGNN further incorporates molecular structure, attribute based self-supervised modules and self-attentive task weights. Extensive experiments on two public multi-property datasets demonstrate that Meta-MGNN outperforms a variety of state-of-the-art methods.
arXiv Detail & Related papers (2021-02-16T01:55:34Z)
ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules. In the teacher model, we propose a novel semi-supervised learning method to learn general representation that jointly exploits information from molecular structure and molecular distribution. At last, we proposed a novel active learning strategy in terms of molecular diversities to select informative data during the whole framework learning.
arXiv Detail & Related papers (2020-07-07T04:22:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.