MOFormer: Self-Supervised Transformer model for Metal-Organic Framework
Property Prediction
- URL: http://arxiv.org/abs/2210.14188v1
- Date: Tue, 25 Oct 2022 17:29:42 GMT
- Title: MOFormer: Self-Supervised Transformer model for Metal-Organic Framework
Property Prediction
- Authors: Zhonglin Cao, Rishikesh Magar, Yuyang Wang, and Amir Barati Farimani
- Abstract summary: Metal-Organic Frameworks (MOFs) are materials with a high degree of porosity that can be used for applications in energy storage, water desalination, gas storage, and gas separation.
Finding the optimal MOFs for specific applications requires an efficient and accurate search over an enormous number of potential candidates.
We propose a structure-agnostic deep learning method based on the Transformer model, named as MOFormer, for property predictions of MOFs.
- Score: 7.367477168940467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Metal-Organic Frameworks (MOFs) are materials with a high degree of porosity
that can be used for applications in energy storage, water desalination, gas
storage, and gas separation. However, the chemical space of MOFs is close to an
infinite size due to the large variety of possible combinations of building
blocks and topology. Discovering the optimal MOFs for specific applications
requires an efficient and accurate search over an enormous number of potential
candidates. Previous high-throughput screening methods using computational
simulations like DFT can be time-consuming. Such methods also require
optimizing 3D atomic structure of MOFs, which adds one extra step when
evaluating hypothetical MOFs. In this work, we propose a structure-agnostic
deep learning method based on the Transformer model, named as MOFormer, for
property predictions of MOFs. The MOFormer takes a text string representation
of MOF (MOFid) as input, thus circumventing the need of obtaining the 3D
structure of hypothetical MOF and accelerating the screening process.
Furthermore, we introduce a self-supervised learning framework that pretrains
the MOFormer via maximizing the cross-correlation between its
structure-agnostic representations and structure-based representations of
crystal graph convolutional neural network (CGCNN) on >400k publicly available
MOF data. Using self-supervised learning allows the MOFormer to intrinsically
learn 3D structural information though it is not included in the input.
Experiments show that pretraining improved the prediction accuracy of both
models on various downstream prediction tasks. Furthermore, we revealed that
MOFormer can be more data-efficient on quantum-chemical property prediction
than structure-based CGCNN when training data is limited. Overall, MOFormer
provides a novel perspective on efficient MOF design using deep learning.
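The abstract says pretraining maximizes the cross-correlation between MOFormer's structure-agnostic representations and CGCNN's structure-based ones, but it does not give the loss. The sketch below shows one common way to write such an objective, assuming a Barlow Twins-style formulation that pushes the cross-correlation matrix of the two embedding batches toward the identity. The function names, the `off_diag_weight` value, and the toy embeddings are illustrative assumptions, not details taken from the paper.

```python
import math


def standardize(batch):
    """Scale each embedding dimension to zero mean and unit std over the batch."""
    n, d = len(batch), len(batch[0])
    cols = list(zip(*batch))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 when a dimension is constant, to avoid division by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in batch]


def cross_correlation(z_a, z_b):
    """Return the d x d cross-correlation matrix between two embedding batches."""
    n, d = len(z_a), len(z_a[0])
    a, b = standardize(z_a), standardize(z_b)
    return [
        [sum(a[k][i] * b[k][j] for k in range(n)) / n for j in range(d)]
        for i in range(d)
    ]


def cross_correlation_loss(z_a, z_b, off_diag_weight=0.005):
    """Assumed Barlow Twins-style loss: diagonal -> 1, off-diagonal -> 0."""
    c = cross_correlation(z_a, z_b)
    d = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    off_diag = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + off_diag_weight * off_diag


# Toy stand-ins for a batch of text-branch and graph-branch embeddings.
text_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
graph_embeddings = [[2.0, 0.1], [0.1, 2.0], [-2.0, -0.1], [-0.1, -2.0]]
print(cross_correlation_loss(text_embeddings, graph_embeddings))
```

Minimizing this quantity raises the per-dimension correlation between the two branches while discouraging redundant dimensions. In an actual training setup the same computation would be written with a differentiable tensor library so that gradients reach both encoders.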
Related papers
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z)
- Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model [77.50732023411811]
We propose a text-guided multi-property molecular optimization method utilizing a transformer-based diffusion language model (TransDLM).
TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions.
Our approach surpasses state-of-the-art methods in optimizing molecular structural similarity and enhancing chemical properties on the benchmark dataset.
arXiv Detail & Related papers (2024-10-17T14:30:27Z)
- MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks [42.61784133509237]
Metal-organic frameworks (MOFs) are a class of crystalline materials with promising applications in many areas such as carbon capture and drug delivery.
Existing approaches, including ab initio calculations and even deep generative models, struggle with the complexity of MOF structures due to the large number of atoms in the unit cells.
We introduce MOFFlow, the first deep generative model tailored for MOF structure prediction.
arXiv Detail & Related papers (2024-10-07T13:51:58Z)
- Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task.
We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
arXiv Detail & Related papers (2024-02-27T03:33:23Z)
- MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design [4.819734936375677]
Metal-organic frameworks (MOFs) are of immense interest in applications such as gas storage and carbon capture.
We propose MOFDiff: a coarse-grained (CG) diffusion model that generates CG MOF structures.
We evaluate our model's capability to generate valid and novel MOF structures and its effectiveness in designing outstanding MOF materials.
arXiv Detail & Related papers (2023-10-16T18:00:15Z)
- Molecular Geometry-aware Transformer for accurate 3D Atomic System modeling [51.83761266429285]
We propose a novel Transformer architecture that takes nodes (atoms) and edges (bonds and nonbonding atom pairs) as inputs and models the interactions among them.
Moleformer achieves state-of-the-art results on initial-state-to-relaxed-energy prediction for OC20 and is very competitive on QM9 for predicting quantum chemical properties.
arXiv Detail & Related papers (2023-02-02T03:49:57Z)
- Machine Learning model for gas-liquid interface reconstruction in CFD numerical simulations [59.84561168501493]
The volume of fluid (VoF) method is widely used in multi-phase flow simulations to track and locate the interface between two immiscible fluids.
A major bottleneck of the VoF method is the interface reconstruction step due to its high computational cost and low accuracy on unstructured grids.
We propose a machine learning enhanced VoF method based on Graph Neural Networks (GNN) to accelerate the interface reconstruction on general unstructured meshes.
arXiv Detail & Related papers (2022-07-12T17:07:46Z)
- Building Open Knowledge Graph for Metal-Organic Frameworks (MOF-KG): Challenges and Case Studies [63.61566811532431]
Metal-Organic Frameworks (MOFs) have great potential to revolutionize applications such as gas storage, molecular separations, chemical sensing, catalysis, and drug delivery.
The Cambridge Structural Database (CSD) reports 10,636 synthesized MOF crystals and additionally contains ca. 114,373 MOF-like structures.
In this demo paper, we describe our effort on leveraging knowledge graph methods to facilitate MOF prediction, discovery, and synthesis.
arXiv Detail & Related papers (2022-07-10T16:41:11Z)
- Deep-learning-based prediction of nanoparticle phase transitions during in situ transmission electron microscopy [3.613625739845355]
We train deep learning models to predict a sequence of future video frames based on the input of a sequence of previous frames.
This capability provides insight into size-dependent structural changes in Au nanoparticles under dynamic reaction conditions.
It may be possible to anticipate the next steps of a chemical reaction for emerging automated experimentation platforms.
arXiv Detail & Related papers (2022-05-23T15:50:24Z)
- A Universal Framework for Featurization of Atomistic Systems [0.0]
Reactive force fields based on physics or machine learning can be used to bridge the gap in time and length scales.
We introduce the Gaussian multi-pole (GMP) featurization scheme that utilizes physically-relevant multi-pole expansions of the electron density around atoms.
We demonstrate that GMP-based models can achieve chemical accuracy for the QM9 dataset, and their accuracy remains reasonable even when extrapolating to new elements.
arXiv Detail & Related papers (2021-02-04T03:11:00Z)