ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular
Property Prediction
- URL: http://arxiv.org/abs/2010.09885v2
- Date: Fri, 23 Oct 2020 04:22:37 GMT
- Title: ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular
Property Prediction
- Authors: Seyone Chithrananda, Gabriel Grand and Bharath Ramsundar
- Abstract summary: In NLP, transformers have become the de-facto standard for representation learning thanks to their strong downstream task transfer.
We make one of the first attempts to systematically evaluate transformers on molecular property prediction tasks via our ChemBERTa model.
Our results suggest that transformers offer a promising avenue of future work for molecular representation learning and property prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: GNNs and chemical fingerprints are the predominant approaches to representing
molecules for property prediction. However, in NLP, transformers have become
the de-facto standard for representation learning thanks to their strong
downstream task transfer. In parallel, the software ecosystem around
transformers is maturing rapidly, with libraries like HuggingFace and BertViz
enabling streamlined training and introspection. In this work, we make one of
the first attempts to systematically evaluate transformers on molecular
property prediction tasks via our ChemBERTa model. ChemBERTa scales well with
pretraining dataset size, offering competitive downstream performance on
MoleculeNet and useful attention-based visualization modalities. Our results
suggest that transformers offer a promising avenue of future work for molecular
representation learning and property prediction. To facilitate these efforts,
we release a curated dataset of 77M SMILES from PubChem suitable for
large-scale self-supervised pretraining.
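The abstract points to HuggingFace as the enabling software ecosystem, so a minimal sketch of querying a pretrained ChemBERTa checkpoint for masked-token prediction on a SMILES string may help make this concrete. The checkpoint name "seyonec/ChemBERTa-zinc-base-v1" and the example SMILES are assumptions for illustration, not details taken from the paper itself.

```python
# Minimal sketch: masked-token prediction on SMILES with a pretrained ChemBERTa
# checkpoint via the HuggingFace fill-mask pipeline.
# Assumption: "seyonec/ChemBERTa-zinc-base-v1" is the checkpoint of interest;
# substitute whichever ChemBERTa model you actually use.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="seyonec/ChemBERTa-zinc-base-v1",
    tokenizer="seyonec/ChemBERTa-zinc-base-v1",
)

smiles = "CC(=O)Oc1ccccc1C(=O)O"        # aspirin (example molecule, an assumption)
masked = "CC(=O)Oc1ccccc1C(=O)<mask>"   # hide the final oxygen

# Ask the model for its top candidates to fill the masked position.
for prediction in fill_mask(masked, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```

For downstream property prediction in the MoleculeNet style, the same checkpoint can in principle be loaded with a classification head (for example AutoModelForSequenceClassification) and fine-tuned on labelled SMILES; that fine-tuning setup is an assumption here rather than the authors' released recipe.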
Related papers
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model that randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities (a minimal masking sketch appears after this list).
arXiv Detail & Related papers (2024-11-03T01:56:15Z) - Transformers for molecular property prediction: Lessons learned from the past five years [0.0]
We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for molecular property prediction (MPP).
We address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.
arXiv Detail & Related papers (2024-04-05T09:05:37Z) - Transferring a molecular foundation model for polymer property
predictions [3.067983186439152]
Self-supervised pretraining of transformer models requires large-scale datasets.
We show that transformers pretrained on small molecules and fine-tuned on polymer properties achieve accuracy comparable to models trained on augmented polymer datasets.
arXiv Detail & Related papers (2023-10-25T19:55:00Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - ChemBERTa-2: Towards Chemical Foundation Models [0.0]
We build a chemical foundation model, ChemBERTa-2, using the language of SMILES.
In this work, we build upon ChemBERTa by optimizing the pretraining process.
To our knowledge, the 77M set constitutes one of the largest datasets used for molecular pretraining to date.
arXiv Detail & Related papers (2022-09-05T00:31:12Z) - Pre-training Transformers for Molecular Property Prediction Using
Reaction Prediction [0.0]
Transfer learning has had a tremendous impact in fields like Computer Vision and Natural Language Processing.
We present a pre-training procedure for molecular representation learning using reaction data.
We show a statistically significant positive effect on 5 of the 12 tasks compared to a non-pre-trained baseline model.
arXiv Detail & Related papers (2022-07-06T14:51:38Z) - Improving Molecular Representation Learning with Metric
Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm, termed MROT, to enhance the generalization capability of learned molecular representations for molecular regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z) - Transformers for prompt-level EMA non-response prediction [62.41658786277712]
Ecological Momentary Assessments (EMAs) are an important psychological data source for measuring cognitive states, affect, behavior, and environmental factors.
Non-response, in which participants fail to respond to EMA prompts, is an endemic problem.
The ability to accurately predict non-response could be utilized to improve EMA delivery and develop compliance interventions.
arXiv Detail & Related papers (2021-11-01T18:38:47Z) - Dual-view Molecule Pre-training [186.07333992384287]
Dual-view molecule pre-training can effectively combine the strengths of both types of molecule representations.
DMP is tested on nine molecular property prediction tasks and achieves state-of-the-art performance on seven of them.
arXiv Detail & Related papers (2021-06-17T03:58:38Z) - Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z) - Molecule Attention Transformer [5.441166835871135]
We propose Molecule Attention Transformer (MAT), a single neural network architecture that performs competitively across a range of molecule property prediction tasks.
Our key innovation is to augment the attention mechanism in Transformer using inter-atomic distances and the molecular graph structure.
arXiv Detail & Related papers (2020-02-19T16:14:48Z)
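To complement the masking-based pretraining entries above (the random functional group masking paper and ChemBERTa itself), a minimal, framework-free sketch of random SMILES-subsequence masking is given below. The masking rate, span length, and character-level granularity are illustrative assumptions; real pipelines mask tokenizer-level tokens or atom-aligned subsequences rather than raw characters.

```python
import random

MASK_TOKEN = "<mask>"  # assumption: RoBERTa-style mask token

def mask_smiles(smiles, mask_rate=0.15, max_span=3, seed=None):
    """Randomly replace short character spans of a SMILES string with a mask token.

    Illustrative corruption step for masked-language-model pretraining; the
    mask_rate and max_span defaults are assumptions, not published settings.
    """
    rng = random.Random(seed)
    out, i = [], 0
    while i < len(smiles):
        if rng.random() < mask_rate:
            span = rng.randint(1, max_span)  # hide between 1 and max_span characters
            out.append(MASK_TOKEN)
            i += span
        else:
            out.append(smiles[i])
            i += 1
    return "".join(out)

# Example: corrupt aspirin's SMILES; the pretraining objective is then to
# recover the hidden subsequences from the surrounding context.
print(mask_smiles("CC(=O)Oc1ccccc1C(=O)O", seed=0))
```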
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including any of its content) and is not responsible for any consequences arising from its use.