Stoichiometry Representation Learning with Polymorphic Crystal
Structures
- URL: http://arxiv.org/abs/2312.13289v1
- Date: Fri, 17 Nov 2023 20:34:28 GMT
- Title: Stoichiometry Representation Learning with Polymorphic Crystal
Structures
- Authors: Namkyeong Lee, Heewoong Noh, Gyoung S. Na, Tianfan Fu, Jimeng Sun,
Chanyoung Park
- Abstract summary: Stoichiometry descriptors can reveal the ratio between elements involved to form a certain compound without any structural information.
We propose PolySRL, which learns the probabilistic representation of stoichiometry by utilizing the readily available structural information.
- Score: 54.65985356122883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent success of machine learning (ML) in materials science, that
success relies heavily on the structural description of crystals, which is
itself computationally demanding and occasionally unattainable. Stoichiometry
descriptors are an alternative: they reveal the ratio between the
elements involved in forming a compound without any structural
information. However, learning representations of stoichiometry is not
trivial because of polymorphism, i.e.,
a single stoichiometry can exist in multiple structural forms due to the
flexibility of atomic arrangements, which induces uncertainty in the representation.
To this end, we propose PolySRL, which learns the probabilistic representation
of stoichiometry by utilizing the readily available structural information,
whose uncertainty reveals the polymorphic structures of stoichiometry.
Extensive experiments on sixteen datasets demonstrate the superiority of
PolySRL, and analysis of the uncertainties sheds light on the applicability of
PolySRL in real-world material discovery. The source code for PolySRL is
available at https://github.com/Namkyeong/PolySRL_AI4Science.
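
To make the polymorphism issue in the abstract concrete, here is a minimal sketch of a stoichiometry-only descriptor. It is not the PolySRL method: the helper name `stoichiometry_fractions` and the simple formula parser are assumptions made for illustration, and the parser handles only flat formulas such as `Fe2O3` (no parentheses, hydrates, or charges).

```python
import re
from collections import defaultdict


def stoichiometry_fractions(formula: str) -> dict[str, float]:
    """Return normalized element fractions for a flat formula like 'Fe2O3'.

    Hypothetical helper for illustration only; not part of PolySRL.
    """
    counts: defaultdict[str, float] = defaultdict(float)
    # Each match is (element symbol, optional amount); a missing amount means 1.
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] += float(amount) if amount else 1.0
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}


# Rutile and anatase are two polymorphs of TiO2. They share one formula, so a
# descriptor built from stoichiometry alone cannot tell them apart.
rutile = stoichiometry_fractions("TiO2")
anatase = stoichiometry_fractions("TiO2")
hematite = stoichiometry_fractions("Fe2O3")
```

Because both polymorphs map to the same vector, any structure-dependent property attached to that vector is ambiguous, which is the kind of uncertainty the abstract says a probabilistic representation is meant to capture.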
Related papers
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z)
- Towards End-to-End Structure Solutions from Information-Compromised Diffraction Data via Generative Deep Learning [6.617784410952713]
Machine learning (ML) and deep learning (DL) are promising approaches since they augment information in the degraded input signal with prior knowledge learned from large databases of already known structures.
Here we present a novel ML approach, a variational query-based multi-branch deep neural network that has the promise to be a robust but general tool to address this problem end-to-end.
The system achieves up to 93.4% average similarity with the ground truth on unseen materials, both with known and partially-known chemical composition information.
arXiv Detail & Related papers (2023-12-23T02:17:27Z)
- Compositional Deep Probabilistic Models of DNA Encoded Libraries [6.206196935093064]
We introduce a compositional deep probabilistic model of DEL data, DEL-Compose, which decomposes molecular representations into their mono-synthon, di-synthon, and tri-synthon building blocks.
Our model demonstrates strong performance compared to count baselines, enriches the correct pharmacophores, and offers valuable insights via its intrinsic interpretable structure.
arXiv Detail & Related papers (2023-10-20T19:04:28Z)
- Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation that can represent any crystal structure (UniMat).
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z)
- Pair-Variational Autoencoders (PairVAE) for Linking and Cross-Reconstruction of Characterization Data from Complementary Structural Characterization Techniques [0.0]
In material research, structural characterization often requires multiple complementary techniques to obtain a holistic morphological view of the synthesized material.
It is useful to have machine learning models that can be trained on paired structural characterization data from multiple techniques so that the model can generate one set of characterization data from the other.
In this paper we demonstrate one such machine learning workflow, PairVAE, that works with data from Small Angle X-Ray Scattering (SAXS) that presents information about bulk morphology and images from Scanning Electron Microscopy (SEM) that presents two-dimensional local structural information of the sample.
arXiv Detail & Related papers (2023-05-25T20:45:36Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Structured information extraction from complex scientific text with fine-tuned large language models [55.96705756327738]
We present a simple sequence-to-sequence approach to joint named entity recognition and relation extraction.
The approach leverages a pre-trained large language model (LLM), GPT-3, that is fine-tuned on approximately 500 pairs of prompts.
This approach represents a simple, accessible, and highly-flexible route to obtaining large databases of structured knowledge extracted from unstructured text.
arXiv Detail & Related papers (2022-12-10T07:51:52Z)
- A data-driven interpretation of the stability of molecular crystals [0.0]
Predicting the stability of crystal structures formed from molecular building blocks is a non-trivial scientific problem.
We introduce a structural descriptor tailored to the prediction of the binding energy for a curated dataset of organic crystals.
We then interpret this library using a low-dimensional representation of the structure-energy landscape.
arXiv Detail & Related papers (2022-09-21T23:32:53Z)
- PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity that scales polynomially.
arXiv Detail & Related papers (2022-07-12T17:57:17Z)
- Structure-Property Maps with Kernel Principal Covariates Regression [0.0]
We introduce a kernelized version of PCovR and a sparsified extension, and demonstrate the performance of this approach in revealing and predicting structure-property relations.
We show examples of elemental carbon, porous silicate frameworks, organic molecules, amino acid conformers, and molecular materials.
arXiv Detail & Related papers (2020-02-12T16:29:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.