MolSets: Molecular Graph Deep Sets Learning for Mixture Property Modeling
- URL: http://arxiv.org/abs/2312.16473v1
- Date: Wed, 27 Dec 2023 08:46:14 GMT
- Title: MolSets: Molecular Graph Deep Sets Learning for Mixture Property Modeling
- Authors: Hengrui Zhang, Jie Chen, James M. Rondinelli, Wei Chen
- Abstract summary: We present MolSets, a specialized machine learning model for molecular mixtures.
Representing individual molecules as graphs and their mixture as a set, MolSets extracts information at the molecule level and aggregates it at the mixture level.
We demonstrate the efficacy of MolSets in predicting the conductivity of lithium battery electrolytes and highlight its benefits in virtual screening of the chemical space.
- Score: 14.067533753010897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in machine learning (ML) have expedited materials discovery
and design. One significant challenge faced in ML for materials is the
expansive combinatorial space of potential materials formed by diverse
constituents and their flexible configurations. This complexity is particularly
evident in molecular mixtures, a frequently explored space for materials such
as battery electrolytes. Owing to the complex structures of molecules and the
sequence-independent nature of mixtures, conventional ML methods have
difficulties in modeling such systems. Here we present MolSets, a specialized
ML model for molecular mixtures. Representing individual molecules as graphs
and their mixture as a set, MolSets leverages a graph neural network and the
deep sets architecture to extract information at the molecule level and
aggregate it at the mixture level, thus addressing local complexity while
retaining global flexibility. We demonstrate the efficacy of MolSets in
predicting the conductivity of lithium battery electrolytes and highlight its
benefits in virtual screening of the combinatorial chemical space.
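The architecture described in the abstract (a graph neural network applied to each molecule, followed by a permutation-invariant deep-sets aggregation over the mixture) can be illustrated in a few lines of PyTorch. The sketch below is a minimal illustration only, assuming a simple GCN-style update, sum pooling as the set aggregator, and arbitrary layer sizes and module names; it is not the authors' released implementation.

```python
# Minimal sketch of the graph-network-plus-deep-sets idea: encode each molecule
# with a small GNN, then aggregate per-molecule embeddings permutation-invariantly
# to predict a mixture-level property. All names and sizes are illustrative.
import torch
import torch.nn as nn


class SimpleGNN(nn.Module):
    """Two rounds of mean-aggregation message passing over a dense adjacency."""

    def __init__(self, node_dim: int, hidden_dim: int):
        super().__init__()
        self.lin1 = nn.Linear(node_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_atoms, node_dim) node features; adj: (num_atoms, num_atoms)
        a_hat = adj + torch.eye(adj.size(0))              # add self-loops
        deg = a_hat.sum(dim=1, keepdim=True).clamp(min=1)
        a_norm = a_hat / deg                              # row-normalize
        h = torch.relu(self.lin1(a_norm @ x))
        h = torch.relu(self.lin2(a_norm @ h))
        return h.mean(dim=0)                              # molecule embedding


class MolSetsSketch(nn.Module):
    """Deep-sets aggregation over per-molecule GNN embeddings."""

    def __init__(self, node_dim: int = 16, hidden_dim: int = 32):
        super().__init__()
        self.gnn = SimpleGNN(node_dim, hidden_dim)
        self.phi = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, 1))

    def forward(self, mixture):
        # mixture: list of (node_features, adjacency) pairs, one per molecule.
        embeddings = [self.phi(self.gnn(x, adj)) for x, adj in mixture]
        pooled = torch.stack(embeddings, dim=0).sum(dim=0)  # order-invariant
        return self.rho(pooled)                             # scalar property


if __name__ == "__main__":
    torch.manual_seed(0)
    # Two toy "molecules": random node features and symmetric adjacencies.
    mols = []
    for n in (5, 7):
        x = torch.randn(n, 16)
        adj = (torch.rand(n, n) > 0.7).float()
        adj = ((adj + adj.T) > 0).float()
        mols.append((x, adj))
    model = MolSetsSketch()
    print(model(mols))  # same output for any ordering of `mols`
```

Because the per-molecule embeddings are summed before the final readout, the prediction is invariant to the order in which mixture components are listed, which reflects the sequence-independent nature of mixtures the abstract highlights.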
Related papers
- DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra [60.39311767532607]
DiffMS is a formula-restricted encoder-decoder generative network.
We develop a robust decoder that bridges latent embeddings and molecular structures.
Experiments show DiffMS outperforms existing models on de novo molecule generation.
arXiv Detail & Related papers (2025-02-13T18:29:48Z)
- SE3Set: Harnessing equivariant hypergraph neural networks for molecular representation learning [27.713870291922333]
We develop an SE(3) equivariant hypergraph neural network architecture tailored for advanced molecular representation learning.
SE3Set has shown performance on par with state-of-the-art (SOTA) models for small molecule datasets.
It excels on the MD22 dataset, achieving a notable improvement of approximately 20% in accuracy across all molecules.
arXiv Detail & Related papers (2024-05-26T10:43:16Z)
- Interpolation and differentiation of alchemical degrees of freedom in machine learning interatomic potentials [0.980222898148295]
We report the use of continuous and differentiable alchemical degrees of freedom in atomistic materials simulations.
The proposed method introduces alchemical atoms with corresponding weights into the input graph, alongside modifications to the message-passing and readout mechanisms of MLIPs.
The end-to-end differentiability of MLIPs enables efficient calculation of the gradient of energy with respect to the compositional weights (a toy autograd illustration of such a gradient appears after this list).
arXiv Detail & Related papers (2024-04-16T17:24:22Z)
- Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [49.64512917330373]
We introduce a multi-constraint molecular generation large language model, TSMMG, akin to a student.
To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from its 'teachers'.
We experimentally show that TSMMG performs remarkably well in generating molecules that meet complex property requirements described in natural language.
arXiv Detail & Related papers (2024-03-20T02:15:55Z)
- Enhanced sampling of robust molecular datasets with uncertainty-based collective variables [0.0]
We propose a method that leverages uncertainty as the collective variable (CV) to guide the acquisition of chemically relevant data points.
This approach employs a Gaussian Mixture Model-based uncertainty metric from a single model as the CV for biased molecular dynamics simulations.
arXiv Detail & Related papers (2024-02-06T06:42:51Z)
- LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo Molecular Design [0.0]
"LLamol" is a single novel generative transformer model based on the LLama 2 architecture.
We demonstrate that the model adeptly handles single- and multi-conditional organic molecule generation with up to four conditions.
In detail, we showcase the model's capability to utilize token sequences for conditioning, either individually or in combination with numerical properties.
arXiv Detail & Related papers (2023-11-24T10:59:12Z)
- Multi-channel learning for integrating structural hierarchies into context-dependent molecular representation [10.025809630976065]
This paper introduces a novel pre-training framework that learns robust and generalizable chemical knowledge.
Our approach demonstrates competitive performance across various molecular property benchmarks.
arXiv Detail & Related papers (2023-11-05T23:47:52Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- Federated Learning of Molecular Properties in a Heterogeneous Setting [79.00211946597845]
We introduce federated heterogeneous molecular learning to address these challenges.
Federated learning allows end-users to build a global model collaboratively while preserving the training data distributed over isolated clients.
FedChem should enable a new type of collaboration for improving AI in chemistry that mitigates concerns about valuable chemical data.
arXiv Detail & Related papers (2021-09-15T12:49:13Z)
- BIGDML: Towards Exact Machine Learning Force Fields for Materials [55.944221055171276]
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof.
Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 atoms.
arXiv Detail & Related papers (2021-06-08T10:14:57Z)
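As referenced in the alchemical-degrees-of-freedom entry above, once a potential is end-to-end differentiable, gradients of the energy with respect to compositional weights come directly from automatic differentiation. The sketch below uses a made-up quadratic "energy" purely to show the autograd mechanics; it is not the machine-learned interatomic potential from that paper.

```python
# Toy illustration: differentiate an energy with respect to compositional
# (alchemical) weights via autograd. The quadratic "energy" is a stand-in,
# not an actual MLIP.
import torch

weights = torch.tensor([0.3, 0.7], requires_grad=True)   # compositional weights
pair_terms = torch.tensor([[-1.0, 0.5],
                           [0.5, -0.8]])                  # toy interaction matrix

energy = weights @ pair_terms @ weights                   # E(w) = w^T A w
energy.backward()                                         # populate weights.grad

print(f"E = {energy.item():.3f}")
print(f"dE/dw = {weights.grad}")                          # gradient w.r.t. weights
```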
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.