Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language
- URL: http://arxiv.org/abs/2303.03363v2
- Date: Fri, 16 Jun 2023 09:59:34 GMT
- Title: Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language
- Authors: Philipp Seidl, Andreu Vall, Sepp Hochreiter, Günter Klambauer
- Abstract summary: We envision a novel type of activity prediction model that is able to adapt to new prediction tasks at inference time.
Our method CLAMP yields improved predictive performance on few-shot learning benchmarks and zero-shot problems in drug discovery.
- Score: 5.117101148161245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activity and property prediction models are the central workhorses in drug discovery and materials sciences, but currently they have to be trained or fine-tuned for new tasks. Without training or fine-tuning, scientific language models could be used for such low-data tasks through their announced zero- and few-shot capabilities. However, their predictive quality at activity prediction is lacking. In this work, we envision a novel type of activity prediction model that is able to adapt to new prediction tasks at inference time, via understanding textual information describing the task. To this end, we propose a new architecture with separate modules for chemical and natural language inputs, and a contrastive pre-training objective on data from large biochemical databases. In extensive experiments, we show that our method CLAMP yields improved predictive performance on few-shot learning benchmarks and zero-shot problems in drug discovery. We attribute the advances of our method to the modularized architecture and to our pre-training objective.
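As a reading aid for the abstract above: a minimal, hypothetical sketch of a dual encoder with a contrastive (InfoNCE-style) objective, i.e., separate modules for chemical and natural-language inputs aligned in a joint embedding space. The encoder architectures, feature types, and hyperparameters below are illustrative assumptions, not the paper's actual CLAMP implementation.

```python
import torch
import torch.nn.functional as F


class DualEncoder(torch.nn.Module):
    """Separate modules for chemical and natural-language inputs (illustrative only)."""

    def __init__(self, mol_dim: int = 2048, text_dim: int = 768, embed_dim: int = 256):
        super().__init__()
        # Molecule branch: e.g. a fingerprint/descriptor vector -> joint embedding space.
        self.mol_encoder = torch.nn.Sequential(
            torch.nn.Linear(mol_dim, 512), torch.nn.ReLU(), torch.nn.Linear(512, embed_dim)
        )
        # Text branch: e.g. a pre-computed assay-description embedding -> joint space.
        self.text_encoder = torch.nn.Sequential(
            torch.nn.Linear(text_dim, 512), torch.nn.ReLU(), torch.nn.Linear(512, embed_dim)
        )

    def forward(self, mol_feats, text_feats):
        m = F.normalize(self.mol_encoder(mol_feats), dim=-1)
        t = F.normalize(self.text_encoder(text_feats), dim=-1)
        return m, t


def contrastive_loss(mol_emb, text_emb, temperature: float = 0.07):
    """Symmetric InfoNCE: matched (molecule, assay-text) pairs attract, mismatched repel."""
    logits = mol_emb @ text_emb.t() / temperature            # (batch, batch) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


# Zero-shot use at inference: encode the text description of a new assay and rank molecules
# by similarity to it (no task-specific fine-tuning); higher dot product = predicted active.
model = DualEncoder()
mols, texts = torch.randn(8, 2048), torch.randn(8, 768)
m, t = model(mols, texts)
loss = contrastive_loss(m, t)
```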
Related papers
- Can training neural language models on a curriculum with developmentally plausible data improve alignment with human reading behavior? [0.2745342790938508]
This paper explores the extent to which the misalignment between empirical and model-predicted behavior can be minimized by training models on more developmentally plausible data.
We trained teacher language models on the BabyLM "strict-small" dataset and used sentence-level surprisal estimates from these teacher models to create a curriculum.
We found tentative evidence that our curriculum made it easier for models to acquire linguistic knowledge from the training data.
arXiv Detail & Related papers (2023-11-30T18:03:58Z)
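To make the curriculum construction above concrete: a hedged sketch of scoring sentences by teacher-model surprisal and ordering them easy-to-hard. The gpt2 checkpoint stands in for the teacher LMs the paper trains on the BabyLM "strict-small" data, and the easy-to-hard ordering is an assumption about the curriculum design.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Stand-in teacher; the paper trains its own teacher LMs on BabyLM "strict-small".
tokenizer = AutoTokenizer.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("gpt2").eval()


def sentence_surprisal(sentence: str) -> float:
    """Mean token-level negative log-likelihood of the sentence under the teacher."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = teacher(ids, labels=ids).loss
    return loss.item()


corpus = ["The cat sat on the mat.", "Colorless green ideas sleep furiously."]
# Curriculum: present low-surprisal (easier) sentences before high-surprisal ones.
curriculum = sorted(corpus, key=sentence_surprisal)
```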
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- On Data Imbalance in Molecular Property Prediction with Pre-training [16.211138511816642]
A technique called pre-training is used to improve the accuracy of machine learning models.
Pre-training involves training the model on a pretext task, which is different from the target task, before training the model on the target task.
In this study, we propose an effective pre-training method that addresses the imbalance in input data.
arXiv Detail & Related papers (2023-08-17T12:04:14Z)
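The pre-training entry above follows the usual pretext-then-target recipe. A generic, hypothetical sketch of that two-stage setup is given below; the reconstruction pretext task, the regression target head, and all shapes are illustrative, and the paper's specific pretext task and imbalance handling are not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Shared encoder reused across both stages (all shapes are illustrative).
encoder = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 128))
pretext_head = nn.Linear(128, 2048)   # pretext task: reconstruct the input descriptors
target_head = nn.Linear(128, 1)       # target task: predict a molecular property


def pretext_step(x, optimizer):
    """Stage 1 (pre-training): train the encoder on the pretext task."""
    loss = F.mse_loss(pretext_head(encoder(x)), x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def target_step(x, y, optimizer):
    """Stage 2 (fine-tuning): train on the target task, starting from pre-trained weights."""
    loss = F.mse_loss(target_head(encoder(x)), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


pre_opt = torch.optim.Adam(list(encoder.parameters()) + list(pretext_head.parameters()))
fine_opt = torch.optim.Adam(list(encoder.parameters()) + list(target_head.parameters()))
x, y = torch.randn(32, 2048), torch.randn(32, 1)
pretext_step(x, pre_opt)      # pre-train first...
target_step(x, y, fine_opt)   # ...then fine-tune on the target task
```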
- Is Self-Supervised Pretraining Good for Extrapolation in Molecular Property Prediction? [16.211138511816642]
In material science, the prediction of unobserved values, commonly referred to as extrapolation, is critical for property prediction.
We propose an experimental framework to demonstrate this and empirically reveal that, while models were unable to accurately extrapolate absolute property values, self-supervised pretraining enables them to learn relative tendencies of unobserved property values.
arXiv Detail & Related papers (2023-08-16T03:38:43Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- Concept-aware Training Improves In-context Learning Ability of Language Models [0.0]
Many recent language models (LMs) of the Transformer family exhibit the so-called in-context learning (ICL) ability.
We propose a method to create LMs able to better utilize the in-context information.
We find that the data sampling of Concept-aware Training consistently improves models' reasoning ability.
arXiv Detail & Related papers (2023-05-23T07:44:52Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Tyger: Task-Type-Generic Active Learning for Molecular Property Prediction [121.97742787439546]
How to accurately predict the properties of molecules is an essential problem in AI-driven drug discovery.
To reduce annotation cost, deep Active Learning methods are developed to select only the most representative and informative data for annotating.
We propose a Task-type-generic active learning framework (termed Tyger) that is able to handle different types of learning tasks in a unified manner.
arXiv Detail & Related papers (2022-05-23T12:56:12Z)
- BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in similar word embedding distributions as doing a combined training on all data at once.
arXiv Detail & Related papers (2022-02-21T10:34:41Z)
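The WEAVER entry above describes post-hoc weight averaging that carries old knowledge into a newly trained model. Below is a hedged sketch of plain parameter-wise averaging between two checkpoints of the same architecture; the equal weighting and the checkpoint file names are assumptions rather than WEAVER's exact scheme.

```python
import torch


def weight_average(old_state: dict, new_state: dict, alpha: float = 0.5) -> dict:
    """Parameter-wise average of two state dicts with identical keys and shapes."""
    return {name: alpha * old_state[name] + (1.0 - alpha) * new_state[name]
            for name in new_state}


# Hypothetical usage with two sequentially trained checkpoints of one model:
# merged = weight_average(torch.load("model_after_task1.pt"), torch.load("model_after_task2.pt"))
# model.load_state_dict(merged)
```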
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios [30.391291221959545]
Deep neural networks and huge language models are becoming omnipresent in natural language applications.
As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings.
Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing.
arXiv Detail & Related papers (2020-10-23T11:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.