Heterogenous Ensemble of Models for Molecular Property Prediction
- URL: http://arxiv.org/abs/2211.11035v1
- Date: Sun, 20 Nov 2022 17:25:26 GMT
- Title: Heterogenous Ensemble of Models for Molecular Property Prediction
- Authors: Sajad Darabi, Shayan Fazeli, Jiwei Liu, Alexandre Milesi, Pawel
Morkisz, Jean-François Puget, Gilberto Titericz
- Abstract summary: We propose a method that considers different modalities of molecules.
We ensemble these models with a HuberRegressor.
This yields a winning solution to the 2nd edition of the OGB Large-Scale Challenge (2022).
- Score: 55.91865861896012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous works have demonstrated the importance of considering different
modalities of molecules, each of which provides a varied granularity of
information for downstream property prediction tasks. Our method combines
variants of the recent Transformer-M architecture with Transformer, GNN, and
ResNet backbone architectures. Models are trained on the 2D data, 3D data, and
image modalities of molecular graphs. We ensemble these models with a
HuberRegressor. The models are trained on 4 different train/validation splits
of the original train + valid datasets. This yields a winning solution to the
2nd edition of the OGB Large-Scale Challenge (2022) on the PCQM4Mv2 molecular
property prediction dataset. Our proposed method achieves a test-challenge MAE
of $0.0723$ and a validation MAE of $0.07145$. Total inference time for our
solution is less than 2 hours. We open-source our code at
https://github.com/jfpuget/NVIDIA-PCQM4Mv2.
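As a rough illustration of the ensembling step, the minimal sketch below fits a scikit-learn HuberRegressor on stacked per-model validation predictions and applies it to test predictions. The array names, shapes, and stand-in data are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of HuberRegressor stacking, assuming each base model's
# validation predictions and test predictions are available as NumPy arrays.
# Names, shapes, and the stand-in data are illustrative, not from the paper.
import numpy as np
from sklearn.linear_model import HuberRegressor

def ensemble_with_huber(valid_preds, valid_targets, test_preds):
    """Fit a robust linear blender on validation predictions.

    valid_preds:   (n_valid, n_models) per-model predictions on validation molecules
    valid_targets: (n_valid,)          target values for the validation set
    test_preds:    (n_test, n_models)  per-model predictions on test molecules
    """
    blender = HuberRegressor()          # robust to outlier predictions
    blender.fit(valid_preds, valid_targets)
    return blender.predict(test_preds)

# Toy example with random stand-in data (3 base models).
rng = np.random.default_rng(0)
valid_preds = rng.normal(5.7, 1.0, size=(1000, 3))
valid_targets = rng.normal(5.7, 1.0, size=1000)
test_preds = rng.normal(5.7, 1.0, size=(200, 3))
blended = ensemble_with_huber(valid_preds, valid_targets, test_preds)
print(blended.shape)  # (200,)
```

In the paper's setup, the base models are trained on 4 different train/validation splits of the original train + valid data; how the blender is fit across those splits is not spelled out in the abstract, so the sketch above simply assumes a single validation set.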
Related papers
- FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis [0.7751705157998379]
The scarcity of well-annotated medical datasets requires leveraging transfer learning from broader datasets like ImageNet or pre-trained models like CLIP.
Model soups averages multiple fine-tuned models aiming to improve performance on In-Domain (ID) tasks and enhance robustness against Out-of-Distribution (OOD) datasets.
We propose a hierarchical merging approach that involves local and global aggregation of models at various levels.
arXiv Detail & Related papers (2024-03-20T06:48:48Z) - PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus [26.366299016589256]
We present a real-time method for robust estimation of multiple instances of geometric models from noisy data.
A neural network segments the input data into clusters representing potential model instances.
We demonstrate state-of-the-art performance on these as well as multiple established datasets, with inference times as small as five milliseconds per image.
arXiv Detail & Related papers (2024-01-26T14:54:56Z) - Upgrading VAE Training With Unlimited Data Plans Provided by Diffusion
Models [12.542073306638988]
We show that overfitting encoders in VAEs can be effectively mitigated by training on samples from a pre-trained diffusion model.
We analyze generalization performance, amortization gap, and robustness of VAEs trained with our proposed method on three different data sets.
arXiv Detail & Related papers (2023-10-30T15:38:39Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Direct Molecular Conformation Generation [217.4815525740703]
We propose a method that directly predicts the coordinates of atoms.
Our method achieves state-of-the-art results on four public benchmarks.
arXiv Detail & Related papers (2022-02-03T01:01:58Z) - Datamodels: Predicting Predictions from Training Data [86.66720175866415]
We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data.
We show that even simple linear datamodels can successfully predict model outputs.
arXiv Detail & Related papers (2022-02-01T18:15:24Z) - The Story in Your Eyes: An Individual-difference-aware Model for Cross-person Gaze Estimation [24.833385815585405]
We propose a novel method for refining the cross-person gaze prediction task with eye/face images only, by explicitly modelling person-specific differences.
Specifically, we first assume that we can obtain initial gaze prediction results with an existing method, which we refer to as InitNet.
We validate our ideas on three publicly available datasets, EVE, XGaze, and MPIIGaze, and demonstrate that our proposed method significantly outperforms the SOTA methods.
arXiv Detail & Related papers (2021-06-27T10:14:10Z) - Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z) - Knowledge Generation -- Variational Bayes on Knowledge Graphs [0.685316573653194]
This thesis is a proof of concept for the potential of Variational Auto-Encoders (VAEs) for representing real-world Knowledge Graphs.
Inspired by successful approaches to graph generation, we evaluate the capabilities of our model, the Relational Graph Variational Auto-Encoder (RGVAE).
The RGVAE is first evaluated on link prediction. The mean reciprocal rank (MRR) scores on the FB15K-237 and WN18RR datasets are compared.
We investigate the latent space in a twofold experiment: first, linear interpolation between the latent representations of two triples, then the exploration of each
arXiv Detail & Related papers (2021-01-21T21:23:17Z) - The Right Tool for the Job: Matching Model and Instance Complexities [62.95183777679024]
As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs.
We propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) "exit".
We test our proposed modification on five different datasets in two tasks: three text classification datasets and two natural language inference benchmarks.
arXiv Detail & Related papers (2020-04-16T04:28:08Z)
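To make the early-exit idea in the last entry concrete, here is a generic sketch of confidence-thresholded early exit at inference time. The per-layer classifier heads, the fixed threshold, and the toy data are assumptions for illustration; they do not reproduce that paper's calibrated exit procedure.

```python
# Generic early-exit inference sketch (not the cited paper's exact method):
# each layer has its own classifier head, and we stop as soon as the head's
# confidence clears a threshold. All numbers below are toy stand-in data.
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def early_exit_predict(layer_logits, threshold=0.9):
    """layer_logits: list of per-layer classifier logits for one example,
    ordered from the cheapest (lowest) layer to the most expensive one."""
    for depth, logits in enumerate(layer_logits, start=1):
        probs = softmax(logits)
        if probs.max() >= threshold:      # confident enough: exit early
            return int(probs.argmax()), depth
    # otherwise fall back to the final layer's prediction
    return int(softmax(layer_logits[-1]).argmax()), len(layer_logits)

# Toy example: 4 layers, 3 classes.
rng = np.random.default_rng(0)
layer_logits = [rng.normal(size=3) for _ in range(4)]
label, exit_layer = early_exit_predict(layer_logits, threshold=0.8)
print(label, exit_layer)
```

Lowering the threshold trades accuracy for speed, since more examples exit at cheaper layers.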