Ensemble Model With Bert,Roberta and Xlnet For Molecular property prediction
- URL: http://arxiv.org/abs/2406.06553v1
- Date: Thu, 30 May 2024 10:03:58 GMT
- Title: Ensemble Model With Bert,Roberta and Xlnet For Molecular property prediction
- Authors: Junling Hu,
- Abstract summary: This paper presents a novel approach for predicting molecular properties with high accuracy without the need for extensive pre-training.
We employ ensemble learning and supervised fine-tuning of BERT, RoBERTa, and XLNet.
This innovation provides a cost-effective and resource-efficient solution, potentially advancing further research in the molecular domain.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel approach for predicting molecular properties with high accuracy without the need for extensive pre-training. Employing ensemble learning and supervised fine-tuning of BERT, RoBERTa, and XLNet, our method demonstrates significant effectiveness compared to existing advanced models. Crucially, it addresses the issue of limited computational resources faced by experimental groups, enabling them to accurately predict molecular properties. This innovation provides a cost-effective and resource-efficient solution, potentially advancing further research in the molecular domain.
Related papers
- Cross-Modal Learning for Chemistry Property Prediction: Large Language Models Meet Graph Machine Learning [0.0]
We introduce a Multi-Modal Fusion (MMF) framework that harnesses the analytical prowess of Graph Neural Networks (GNNs) and the linguistic generative and predictive abilities of Large Language Models (LLMs)
Our framework combines the effectiveness of GNNs in modeling graph-structured data with the zero-shot and few-shot learning capabilities of LLMs, enabling improved predictions while reducing the risk of overfitting.
arXiv Detail & Related papers (2024-08-27T11:10:39Z) - Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [49.64512917330373]
We introduce a multi-constraint molecular generation large language model, TSMMG, akin to a student.
To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from these 'teachers'
We experimentally show that TSMMG remarkably performs in generating molecules meeting complex, natural language-described property requirements.
arXiv Detail & Related papers (2024-03-20T02:15:55Z) - TwinBooster: Synergising Large Language Models with Barlow Twins and
Gradient Boosting for Enhanced Molecular Property Prediction [0.0]
In this study, we use a fine-tuned large language model to integrate biological assays based on their textual information.
This architecture uses both assay information and molecular fingerprints to extract the true molecular information.
TwinBooster enables the prediction of properties of unseen bioassays and molecules by providing state-of-the-art zero-shot learning tasks.
arXiv Detail & Related papers (2024-01-09T10:36:20Z) - In-Context Learning for Few-Shot Molecular Property Prediction [56.67309268480843]
In this paper, we adapt the concepts underpinning in-context learning to develop a new algorithm for few-shot molecular property prediction.
Our approach learns to predict molecular properties from a context of (molecule, property measurement) pairs and rapidly adapts to new properties without fine-tuning.
arXiv Detail & Related papers (2023-10-13T05:12:48Z) - Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity)
We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks.
Finally, techniques to provide an understanding of model predictions, especially for critical decision-making in drug discovery are assessed.
arXiv Detail & Related papers (2023-08-02T22:18:41Z) - Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z) - KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular
Property Prediction [13.55018269009361]
We introduce Knowledge-guided Pre-training of Graph Transformer (KPGT), a novel self-supervised learning framework for molecular graph representation learning.
KPGT can offer superior performance over current state-of-the-art methods on several molecular property prediction tasks.
arXiv Detail & Related papers (2022-06-02T08:22:14Z) - Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z) - Few-Shot Graph Learning for Molecular Property Prediction [46.60746023179724]
We propose Meta-MGNN, a novel model for few-shot molecular property prediction.
To exploit unlabeled molecular information, Meta-MGNN further incorporates molecular structure, attribute based self-supervised modules and self-attentive task weights.
Extensive experiments on two public multi-property datasets demonstrate that Meta-MGNN outperforms a variety of state-of-the-art methods.
arXiv Detail & Related papers (2021-02-16T01:55:34Z) - Advanced Graph and Sequence Neural Networks for Molecular Property
Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations.
Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data.
Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.