Adaptive Invariance for Molecule Property Prediction
- URL: http://arxiv.org/abs/2005.03004v1
- Date: Tue, 5 May 2020 19:47:20 GMT
- Title: Adaptive Invariance for Molecule Property Prediction
- Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola
- Abstract summary: We introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data.
Our method builds on and extends recently proposed invariant risk minimization.
Our predictor outperforms state-of-the-art transfer learning methods by a significant margin.
- Score: 38.637412590671865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Effective property prediction methods can help accelerate the search for
COVID-19 antivirals either through accurate in-silico screens or by effectively
guiding on-going at-scale experimental efforts. However, existing prediction
tools have limited ability to accommodate scarce or fragmented training data
currently available. In this paper, we introduce a novel approach to learn
predictors that can generalize or extrapolate beyond the heterogeneous data.
Our method builds on and extends recently proposed invariant risk minimization,
adaptively forcing the predictor to avoid nuisance variation. We achieve this
by continually exercising and manipulating latent representations of molecules
to highlight undesirable variation to the predictor. To test the method we use
a combination of three data sources: SARS-CoV-2 antiviral screening data,
molecular fragments that bind to SARS-CoV-2 main protease and large screening
data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer
learning methods by a significant margin. We also report the top 20 predictions
of our model on the Broad drug repurposing hub.
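For readers unfamiliar with invariant risk minimization, which the abstract names as the starting point, the following is a minimal sketch of the original IRMv1 penalty for a binary classifier. It is not the adaptive extension proposed in the paper. The function names, the per-environment data layout, and the weight `lam` are all assumptions made for illustration. The penalty for an environment is the squared gradient of its risk with respect to a dummy scale `w` applied to the logits, evaluated at `w = 1`; for binary cross-entropy this gradient has the closed form `mean((sigmoid(z) - y) * z)`.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce_with_logit(z, y):
    # Binary cross-entropy on a raw logit z with label y in {0, 1}:
    # softplus(z) - y * z, written in a stable form.
    return max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))


def env_risk(logits, labels):
    # Average loss over one environment (hypothetical data layout:
    # two parallel lists of logits and labels).
    return sum(bce_with_logit(z, y) for z, y in zip(logits, labels)) / len(logits)


def irm_penalty(logits, labels):
    # IRMv1 penalty: squared derivative of the environment risk with
    # respect to a dummy scale w on the logits, evaluated at w = 1.
    # For BCE: d/dw mean BCE(w * z, y) at w = 1 equals mean((sigmoid(z) - y) * z).
    grad = sum((sigmoid(z) - y) * z for z, y in zip(logits, labels)) / len(logits)
    return grad ** 2


def irm_objective(envs, lam):
    # Sum over environments of risk plus lam times the invariance penalty.
    # `envs` is a list of (logits, labels) pairs, one per environment.
    return sum(env_risk(z, y) + lam * irm_penalty(z, y) for z, y in envs)
```

In practice the logits would come from a learned molecule encoder and the objective would be minimized with automatic differentiation; the closed-form gradient here only keeps the sketch dependency-free.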
Related papers
- Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences [56.23412698865433]
We focus on causal inferences on a target experiment with unlabeled factual outcomes, retrieved by a predictive model fine-tuned on a labeled similar experiment.
First, we show that factual outcome estimation via Empirical Risk Minimization (ERM) may fail to yield valid causal inferences on the target population.
We propose Deconfounded Empirical Risk Minimization (DERM), a new simple learning procedure minimizing the risk over a fictitious target population.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
- Deep Neural Network-Based Prediction of B-Cell Epitopes for SARS-CoV and SARS-CoV-2: Enhancing Vaccine Design through Machine Learning [4.728153103738193]
The accurate prediction of B-cells is critical for guiding vaccine development against infectious diseases, including SARS and COVID-19.
Traditional sequence-based methods often struggle with large, complex datasets, but deep learning offers promising improvements in predictive accuracy.
Results indicate an overall accuracy of 82% in predicting COVID-19 negative and positive cases, with room for improvement in detecting positive samples.
arXiv Detail & Related papers (2024-11-28T01:54:43Z)
- Permutation invariant multi-output Gaussian Processes for drug combination prediction in cancer [2.1145050293719745]
Dose-response prediction in cancer is an active application field in machine learning.
The goal is to develop accurate predictive models that can be used to guide experimental design or inform treatment decisions.
arXiv Detail & Related papers (2024-06-28T18:28:38Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Objective-Agnostic Enhancement of Molecule Properties via Multi-Stage VAE [1.3597551064547502]
Variational autoencoder (VAE) is a popular method for drug discovery and various architectures and pipelines have been proposed to improve its performance.
VAE approaches are known to suffer from poor manifold recovery when the data lie on a low-dimensional manifold embedded in a higher dimensional ambient space.
In this paper, we explore applying a multi-stage VAE approach, that can improve manifold recovery on a synthetic dataset, to the field of drug discovery.
arXiv Detail & Related papers (2023-08-24T20:22:22Z)
- Taming Overconfident Prediction on Unlabeled Data from Hindsight [50.9088560433925]
Minimizing prediction uncertainty on unlabeled data is a key factor to achieve good performance in semi-supervised learning.
This paper proposes a dual mechanism, named ADaptive Sharpening (ADS), which first applies a soft-threshold to adaptively mask out determinate and negligible predictions.
Used as a plug-in, ADS significantly improves state-of-the-art SSL methods.
arXiv Detail & Related papers (2021-12-15T15:17:02Z)
- Predicting the Binding of SARS-CoV-2 Peptides to the Major
Histocompatibility Complex with Recurrent Neural Networks [0.40040974874482094]
We adapt and extend USMPep, a proposed, conceptually simple prediction algorithm based on recurrent neural networks.
We evaluate the performance on a recently released SARS-CoV-2 dataset with binding stability measurements.
arXiv Detail & Related papers (2021-04-16T17:16:35Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization [76.57716281104938]
We develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously.
STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations.
We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic.
arXiv Detail & Related papers (2020-12-08T21:21:47Z)
- Deep Learning Models for Early Detection and Prediction of the spread of Novel Coronavirus (COVID-19) [4.213555705835109]
SARS-CoV-2 is continuing to spread globally and has become a pandemic.
There is an urgent need to develop machine learning techniques to predict the spread of COVID-19.
arXiv Detail & Related papers (2020-07-29T10:14:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.