Beyond Chemical Language: A Multimodal Approach to Enhance Molecular
Property Prediction
- URL: http://arxiv.org/abs/2306.14919v1
- Date: Thu, 22 Jun 2023 13:28:59 GMT
- Title: Beyond Chemical Language: A Multimodal Approach to Enhance Molecular
Property Prediction
- Authors: Eduardo Soares, Emilio Vital Brazil, Karen Fiorela Aquino Gutierrez,
Renato Cerqueira, Dan Sanders, Kristin Schmidt, Dmitry Zubarev
- Abstract summary: We present a novel multimodal language model approach for predicting molecular properties by combining chemical language representation with physicochemical features.
Our approach, MULTIMODAL-MOLFORMER, utilizes a causal multistage feature selection method that identifies physicochemical features based on their direct causal effect on a specific target property.
Our results demonstrate superior performance compared to existing state-of-the-art algorithms, including the chemical language-based MOLFORMER and graph neural networks.
- Score: 2.1202329976106924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel multimodal language model approach for predicting
molecular properties by combining chemical language representation with
physicochemical features. Our approach, MULTIMODAL-MOLFORMER, utilizes a causal
multistage feature selection method that identifies physicochemical features
based on their direct causal effect on a specific target property. These causal
features are then integrated with the vector space generated by molecular
embeddings from MOLFORMER. In particular, we employ Mordred descriptors as
physicochemical features and identify the Markov blanket of the target
property, which theoretically contains the most relevant features for accurate
prediction. Our results demonstrate the superior performance of our proposed
approach compared to existing state-of-the-art algorithms, including the
chemical language-based MOLFORMER and graph neural networks, in predicting
complex tasks such as biodegradability and PFAS toxicity estimation. Moreover,
we demonstrate the effectiveness of our feature selection method in reducing
the dimensionality of the Mordred feature space while maintaining or improving
the model's performance. Our approach opens up promising avenues for future
research in molecular property prediction by harnessing the synergistic
potential of both chemical language and physicochemical features, leading to
enhanced performance and advancements in the field.
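The integration step described in the abstract can be illustrated with a minimal sketch. It assumes that the causal feature selection has already produced a list of descriptor column indices, and that "integrating" the selected features with the embedding space means standardizing them and concatenating them to each embedding vector. Both choices are assumptions made for illustration; the abstract does not specify the fusion operator. The names `fuse_features`, `embeddings`, `descriptors`, and `selected` are hypothetical, and the toy numbers stand in for real MOLFORMER embeddings and Mordred descriptors.

```python
from statistics import mean, pstdev


def fuse_features(embeddings, descriptors, selected):
    """Append standardized, selected descriptor columns to each embedding.

    embeddings  : list of per-molecule embedding vectors (stand-in for MOLFORMER output)
    descriptors : list of per-molecule descriptor rows (stand-in for Mordred descriptors)
    selected    : descriptor column indices (stand-in for a Markov blanket; assumed given)
    """
    if len(embeddings) != len(descriptors):
        raise ValueError("embeddings and descriptors must describe the same molecules")

    # Per-column mean and population standard deviation for the selected columns.
    stats = []
    for j in selected:
        column = [row[j] for row in descriptors]
        stats.append((mean(column), pstdev(column)))

    fused = []
    for emb, row in zip(embeddings, descriptors):
        # Z-score each selected descriptor; constant columns map to 0.0.
        scaled = [
            (row[j] - mu) / sigma if sigma > 0 else 0.0
            for j, (mu, sigma) in zip(selected, stats)
        ]
        fused.append(list(emb) + scaled)
    return fused


# Toy data: 3 molecules, 3-dimensional embeddings, 4 descriptors each.
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
descriptors = [
    [1.0, 10.0, 5.0, 0.0],
    [2.0, 20.0, 5.0, 1.0],
    [3.0, 30.0, 5.0, 0.0],
]
selected = [0, 3]  # hypothetical output of the causal feature selection

fused = fuse_features(embeddings, descriptors, selected)
print(len(fused), len(fused[0]))
```

The fused rows could then be passed to any downstream regressor or classifier. Keeping only the selected columns is what reduces the dimensionality of the descriptor space in this sketch.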
Related papers
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
We propose a novel and general alignment framework to align pretrained target diffusion models with preferred functional properties, named AliDiff.
AliDiff shifts the target-conditioned chemical distribution towards regions with higher binding affinity and structural rationality, specified by user-defined reward functions.
We show that AliDiff can generate molecules with state-of-the-art binding energies with up to -7.07 Avg. Vina Score, while maintaining strong molecular properties.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
- A Gaussian Process Model for Ordinal Data with Applications to Chemoinformatics [0.0]
We present conditional Gaussian process models to predict ordinal outcomes from chemical experiments.
A novel aspect of our model is that the kernel contains a scaling parameter that controls the strength of the correlation between elements of the chemical space.
Using molecular fingerprints, a numerical representation of a compound's location within the chemical space, we show that accounting for correlation amongst chemical compounds improves predictive performance.
arXiv Detail & Related papers (2024-05-16T11:18:32Z)
- Contrastive Dual-Interaction Graph Neural Network for Molecular Property Prediction [0.0]
We introduce DIG-Mol, a novel self-supervised graph neural network framework for molecular property prediction.
DIG-Mol integrates a momentum distillation network with two interconnected networks to efficiently improve molecular characterization.
We have established DIG-Mol's state-of-the-art performance through extensive experimental evaluation in a variety of molecular property prediction tasks.
arXiv Detail & Related papers (2024-05-04T10:09:27Z)
- Active Causal Learning for Decoding Chemical Complexities with Targeted Interventions [0.0]
We introduce an active learning approach that discerns underlying cause-effect relationships through strategic sampling.
This method identifies the smallest subset of the dataset capable of encoding the most information representative of a much larger chemical space.
The identified causal relations are then leveraged to conduct systematic interventions, optimizing the design task within a chemical space that the models have not encountered previously.
arXiv Detail & Related papers (2024-04-05T17:15:48Z)
- Improving Molecular Properties Prediction Through Latent Space Fusion [9.912768918657354]
We present a multi-view approach that combines latent spaces derived from state-of-the-art chemical models.
Our approach relies on two pivotal elements: the embeddings derived from MHG-GNN, which represent molecular structures as graphs, and MoLFormer embeddings rooted in chemical language.
We demonstrate the superior performance of our proposed multi-view approach compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2023-10-20T20:29:32Z)
- Molecule Design by Latent Space Energy-Based Modeling and Gradual
Distribution Shifting [53.44684898432997]
Generation of molecules with desired chemical and biological properties is critical for drug discovery.
We propose a probabilistic generative model to capture the joint distribution of molecules and their properties.
Our method achieves very strong performance on various molecule design tasks.
arXiv Detail & Related papers (2023-06-09T03:04:21Z)
- Atomic and Subgraph-aware Bilateral Aggregation for Molecular
Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA).
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular
Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Flexible dual-branched message passing neural network for quantum
mechanical property prediction with molecular conformation [16.08677447593939]
We propose a dual-branched neural network for molecular property prediction based on a message-passing framework.
Our model learns heterogeneous molecular features with different scales, which are trained flexibly according to each prediction target.
arXiv Detail & Related papers (2021-06-14T10:00:39Z)
- Reinforced Molecular Optimization with Neighborhood-Controlled Grammars [63.84003497770347]
We propose MNCE-RL, a graph convolutional policy network for molecular optimization.
We extend the original neighborhood-controlled embedding grammars to make them applicable to molecular graph generation.
We show that our approach achieves state-of-the-art performance in a diverse range of molecular optimization tasks.
arXiv Detail & Related papers (2020-11-14T05:42:15Z)
- Optimizing Molecules using Efficient Queries from Property Evaluations [66.66290256377376]
We propose QMO, a generic query-based molecule optimization framework.
QMO improves the desired properties of an input molecule based on efficient queries.
We show that QMO outperforms existing methods in the benchmark tasks of optimizing small organic molecules.
arXiv Detail & Related papers (2020-11-03T18:51:18Z)