Multitask finetuning and acceleration of chemical pretrained models for small molecule drug property prediction
- URL: http://arxiv.org/abs/2510.12719v1
- Date: Tue, 14 Oct 2025 16:58:39 GMT
- Title: Multitask finetuning and acceleration of chemical pretrained models for small molecule drug property prediction
- Authors: Matthew Adrian, Yunsie Chung, Kevin Boyd, Saee Paliwal, Srimukh Prasad Veccham, Alan C. Cheng
- Abstract summary: Multi-task learning has previously been successfully leveraged to improve predictive models. We show that enabling multitasking in finetuning of chemical pretrained graph neural network models significantly improves performance. We publish two multitask ADMET data splits to enable more accurate benchmarking of multitask deep learning methods for drug property prediction.
- Score: 1.1391158217994781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chemical pretrained models, sometimes referred to as foundation models, are receiving considerable interest for drug discovery applications. The general chemical knowledge extracted from self-supervised training has the potential to improve predictions for critical drug discovery endpoints, including on-target potency and ADMET properties. Multi-task learning has previously been successfully leveraged to improve predictive models. Here, we show that enabling multitasking in finetuning of chemical pretrained graph neural network models such as Kinetic GROVER Multi-Task (KERMT), an enhanced version of the GROVER model, and Knowledge-guided Pre-training of Graph Transformer (KPGT) significantly improves performance over non-pretrained graph neural network models. Surprisingly, we find that the performance improvement from finetuning KERMT in a multitask manner is most significant at larger data sizes. Additionally, we publish two multitask ADMET data splits to enable more accurate benchmarking of multitask deep learning methods for drug property prediction. Finally, we provide an accelerated implementation of the KERMT model on GitHub, unlocking large-scale pretraining, finetuning, and inference in industrial drug discovery workflows.
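To make the multitask finetuning setup concrete: in sparse ADMET panels each molecule typically has labels for only a few of the tasks, so a shared encoder feeds per-task output heads and the loss is masked to the observed labels. The sketch below illustrates this common pattern; the encoder, dimensions, and loss are generic placeholders, not the authors' KERMT implementation.

```python
import torch
import torch.nn as nn

class MultitaskFinetuner(nn.Module):
    """Shared pretrained encoder with one output per ADMET task.

    `encoder` stands in for any pretrained molecular GNN (e.g. a
    KERMT- or KPGT-style model); its interface here is an assumption.
    """
    def __init__(self, encoder: nn.Module, emb_dim: int, n_tasks: int):
        super().__init__()
        self.encoder = encoder                    # finetuned jointly
        self.heads = nn.Linear(emb_dim, n_tasks)  # one column per task

    def forward(self, batch):
        z = self.encoder(batch)       # (B, emb_dim) molecule embeddings
        return self.heads(z)          # (B, n_tasks) task predictions

def masked_mse(pred, target, mask):
    """MSE over observed labels only: sparse ADMET panels leave many
    (molecule, task) cells unlabeled, so `mask` is 1 where a
    measurement exists and 0 elsewhere."""
    se = (pred - target) ** 2 * mask
    return se.sum() / mask.sum().clamp(min=1)
```

Masking the loss is what lets a single model train on the union of several assays without requiring every molecule to be measured in every assay.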
Related papers
- Zatom-1: A Multimodal Flow Foundation Model for 3D Molecules and Materials [51.342983349686556]
General-purpose 3D chemical modeling encompasses molecules and materials, requiring both generative and predictive capabilities. We introduce Zatom-1, the first end-to-end, fully open-source foundation model that unifies generative and predictive learning of 3D molecules and materials.
arXiv Detail & Related papers (2026-02-24T20:52:39Z) - Foundation Models for Discovery and Exploration in Chemical Space [57.97784111110166]
MIST is a family of molecular foundation models trained on large unlabeled datasets.<n>We demonstrate the ability of these models to solve real-world problems across chemical space.
arXiv Detail & Related papers (2025-10-20T17:56:01Z) - Quantum-Enhanced Multi-Task Learning with Learnable Weighting for Pharmacokinetic and Toxicity Prediction [10.487649921110611]
We propose a new unified Quantum-enhanced and task-Weighted Multi-Task Learning (QW-MTL) framework, specifically designed for ADMET classification tasks. QW-MTL adopts quantum chemical descriptors to enrich molecular representations with additional information about the electronic structure and interactions. It introduces a novel exponential task weighting scheme that combines dataset-scale priors with learnable parameters to achieve dynamic loss balancing across tasks.
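The abstract names the ingredients of the weighting scheme (an exponential form, dataset-scale priors, learnable parameters) but not the exact formula, so the following is one plausible instantiation under those assumptions, not QW-MTL's actual code. Quantum descriptors would simply be concatenated to the molecular features upstream of this loss.

```python
import torch
import torch.nn as nn

class ExpTaskWeighting(nn.Module):
    """One plausible exponential task weighting: a fixed prior derived
    from per-task dataset sizes, modulated by a learnable log-weight.
    The exact QW-MTL formula is not given in the abstract; this is an
    assumption-labeled sketch."""
    def __init__(self, task_sizes):
        super().__init__()
        sizes = torch.tensor(task_sizes, dtype=torch.float)
        # dataset-scale prior: smaller tasks get proportionally more weight
        self.register_buffer("prior", sizes.max() / sizes)
        self.log_w = nn.Parameter(torch.zeros(len(task_sizes)))

    def forward(self, task_losses):
        # task_losses: (n_tasks,) per-task mean losses for the batch
        w = self.prior * torch.exp(self.log_w)
        # the -log_w term keeps the weights from collapsing to zero
        return (w * task_losses).sum() - self.log_w.sum()
```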
arXiv Detail & Related papers (2025-09-04T18:33:40Z) - MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper [75.6582687942241]
We propose Mixture of Expert Prompt Tuning (MEPT) as an effective and efficient manifold-mapping framework. MEPT integrates multiple prompt experts to adaptively learn diverse and non-stationary data distributions. Empirical evaluations demonstrate that MEPT outperforms several state-of-the-art parameter-efficient baselines on SuperGLUE.
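As a rough illustration of the mechanism MEPT's name suggests, the sketch below mixes several learnable soft prompts via a lightweight router and prepends the result to the input sequence; the router form, pooling, and expert count are assumptions, not MEPT's published design.

```python
import torch
import torch.nn as nn

class PromptExpertMixture(nn.Module):
    """Several learnable soft prompts mixed per input by a router and
    prepended to the token sequence of a frozen backbone. A generic
    mixture-of-expert prompt-tuning sketch, not MEPT's actual design."""
    def __init__(self, n_experts: int, prompt_len: int, d_model: int):
        super().__init__()
        self.prompts = nn.Parameter(
            0.02 * torch.randn(n_experts, prompt_len, d_model))
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):
        # x: (B, L, d_model) token embeddings from the frozen model
        gate = self.router(x.mean(dim=1)).softmax(dim=-1)  # (B, E)
        prompt = torch.einsum("be,epd->bpd", gate, self.prompts)
        return torch.cat([prompt, x], dim=1)  # prepend mixed prompt
```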
arXiv Detail & Related papers (2025-08-31T21:19:25Z) - All You Need Is Synthetic Task Augmentation [0.0]
In our study, we propose a novel strategy that jointly trains a single Graph Transformer neural network on both sparse multitask molecular property experimental targets and synthetic targets. Our results show consistent and significant performance improvement across all 19 molecular property prediction tasks.
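The abstract does not say which synthetic targets are used, so the sketch below uses cheap RDKit descriptors as a hypothetical stand-in. The idea it illustrates is simply that dense, computable columns appended to a sparse experimental matrix give every molecule some training signal.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors

def augment_targets(smiles_list, y_experimental):
    """Append synthetic targets as extra task columns next to the sparse
    experimental matrix (NaN where unlabeled). The paper's actual
    synthetic targets are not named in this abstract; MolLogP and TPSA
    are hypothetical stand-ins. Assumes all SMILES parse."""
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    synth = np.array([[Descriptors.MolLogP(m), Descriptors.TPSA(m)]
                      for m in mols])
    # experimental columns stay sparse; synthetic columns are dense
    return np.concatenate([y_experimental, synth], axis=1)
```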
arXiv Detail & Related papers (2025-05-15T09:46:27Z) - DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks [61.16389024252561]
We develop a robust generalist perception model capable of addressing multiple tasks under constraints of computational resources and limited training data. We leverage text-to-image diffusion models pre-trained on billions of images and introduce DICEPTION, a visual generalist model. Exhaustive evaluations demonstrate that DICEPTION effectively tackles diverse perception tasks, achieving performance comparable to SOTA single-task specialist models.
arXiv Detail & Related papers (2025-02-24T13:51:06Z) - Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
- We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity).
We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks.
- Finally, we assess techniques that provide an understanding of model predictions, which is especially important for critical decision-making in drug discovery.
arXiv Detail & Related papers (2023-08-02T22:18:41Z) - Does GNN Pretraining Help Molecular Representation? [5.5459878275267736]
Self-supervised graph pretraining does not have statistically significant advantages over non-pretraining methods in many settings.
Although improvement can be observed with additional supervised pretraining, the improvement may diminish with richer features or more balanced data splits.
We hypothesize that the complexity of pretraining on molecules is insufficient, leading to less transferable knowledge for downstream tasks.
arXiv Detail & Related papers (2022-07-13T07:34:16Z) - SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z) - Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z) - Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance the generalization capability of molecular representation models for regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
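The abstract gives no formula for MROT itself, so the snippet below is only a generic entropic optimal-transport alignment term of the kind such a method might add to a regression objective; it is a plain Sinkhorn sketch under that assumption, not the MROT algorithm.

```python
import torch

def sinkhorn_cost(x, y, eps: float = 0.1, n_iters: int = 50):
    """Entropic OT cost between two embedding sets with uniform
    marginals: a generic alignment term, NOT the MROT algorithm."""
    C = torch.cdist(x, y) ** 2                 # (n, m) pairwise costs
    K = torch.exp(-C / eps)                    # Gibbs kernel
    u = torch.full((x.size(0),), 1.0 / x.size(0), device=x.device)
    v = torch.full((y.size(0),), 1.0 / y.size(0), device=y.device)
    a, b = u.clone(), v.clone()
    for _ in range(n_iters):                   # Sinkhorn iterations
        a = u / (K @ b)
        b = v / (K.t() @ a)
    P = a[:, None] * K * b[None, :]            # transport plan
    return (P * C).sum()                       # <P, C>
```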
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.