deCIFer: Crystal Structure Prediction from Powder Diffraction Data using Autoregressive Language Models
- URL: http://arxiv.org/abs/2502.02189v2
- Date: Mon, 10 Feb 2025 08:39:50 GMT
- Title: deCIFer: Crystal Structure Prediction from Powder Diffraction Data using Autoregressive Language Models
- Authors: Frederik Lizak Johansen, Ulrik Friis-Jensen, Erik Bjørnager Dam, Kirsten Marie Ørnsbjerg Jensen, Rocío Mercado, Raghavendra Selvan
- Abstract summary: We introduce an autoregressive language model that performs crystal structure prediction (CSP) from powder diffraction data.
The presented model, deCIFer, generates crystal structures in the widely used Crystallographic Information File (CIF) format.
We train deCIFer on nearly 2.3M unique crystal structures and validate on diverse sets of PXRD patterns for characterizing challenging inorganic crystal systems.
- Abstract: Novel materials drive progress across applications from energy storage to electronics. Automated characterization of material structures with machine learning methods offers a promising strategy for accelerating this key step in material design. In this work, we introduce an autoregressive language model that performs crystal structure prediction (CSP) from powder diffraction data. The presented model, deCIFer, generates crystal structures in the widely used Crystallographic Information File (CIF) format and can be conditioned on powder X-ray diffraction (PXRD) data. Unlike earlier works that primarily rely on high-level descriptors like composition, deCIFer performs CSP from diffraction data. We train deCIFer on nearly 2.3M unique crystal structures and validate on diverse sets of PXRD patterns for characterizing challenging inorganic crystal systems. Qualitative and quantitative assessments using the residual weighted profile and Wasserstein distance show that deCIFer produces structures that more accurately match the target diffraction data when conditioned, compared to the unconditioned case. Notably, deCIFer can achieve a 94% match rate on unseen data. deCIFer bridges experimental diffraction data with computational CSP, lending itself as a powerful tool for crystal structure characterization and accelerating materials discovery.
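The abstract names two metrics for comparing a generated structure's diffraction pattern with the target: the residual weighted profile and the Wasserstein distance. The sketch below shows one common way to compute both for two patterns sampled on the same grid. It is not taken from the paper: the function names `rwp` and `wasserstein_1d` are hypothetical, unit weights are assumed when none are given, and each pattern is treated as a discrete distribution with its mass at the grid points.

```python
import math


def rwp(y_obs, y_calc, weights=None):
    """Weighted profile residual between two patterns on a shared grid.

    Assumption: unit weights when none are supplied.
    """
    if len(y_obs) != len(y_calc):
        raise ValueError("patterns must be sampled on the same grid")
    if weights is None:
        weights = [1.0] * len(y_obs)
    # Weighted squared difference, normalized by the weighted target intensity.
    num = sum(w * (o - c) ** 2 for w, o, c in zip(weights, y_obs, y_calc))
    den = sum(w * o ** 2 for w, o in zip(weights, y_obs))
    return math.sqrt(num / den)


def wasserstein_1d(x, y_a, y_b):
    """1-D Wasserstein-1 distance between two intensity profiles on grid x.

    Each profile is normalized to unit total intensity, and the distance is
    the integral of the absolute difference of the cumulative distributions.
    """
    total_a, total_b = sum(y_a), sum(y_b)
    cdf_a = cdf_b = 0.0
    dist = 0.0
    for i in range(len(x) - 1):
        cdf_a += y_a[i] / total_a
        cdf_b += y_b[i] / total_b
        dist += abs(cdf_a - cdf_b) * (x[i + 1] - x[i])
    return dist


if __name__ == "__main__":
    grid = [10.0, 10.5, 11.0, 11.5]   # hypothetical 2-theta values in degrees
    target = [0.0, 1.0, 0.0, 0.0]     # single peak at 10.5
    shifted = [0.0, 0.0, 1.0, 0.0]    # same peak, shifted to 11.0
    print(rwp(target, shifted))
    print(wasserstein_1d(grid, target, shifted))
```

The two metrics respond differently to a peak shift: the profile residual only measures point-wise overlap, while the Wasserstein distance grows with how far the intensity has to move along the grid.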
Related papers
- Ab Initio Structure Solutions from Nanocrystalline Powder Diffraction Data [4.463003012243322]
A major challenge in materials science is the determination of the structure of nanometer sized objects.
We present a novel approach that uses a generative machine learning model based on diffusion processes that is trained on 45,229 known structures.
We find that our model, PXRDnet, can successfully solve simulated nanocrystals as small as 10 angstroms across 200 materials of varying symmetry and complexity.
arXiv Detail & Related papers (2024-06-16T03:45:03Z)
- Compositional Representation of Polymorphic Crystalline Materials [56.80318252233511]
We introduce PCRL, a novel approach that employs probabilistic modeling of composition to capture the diverse polymorphs from available structural information.
Extensive evaluations on sixteen datasets demonstrate the effectiveness of PCRL in learning compositional representation.
arXiv Detail & Related papers (2023-11-17T20:34:28Z)
- Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation (UniMat) that can represent any crystal structure.
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z)
- Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction [62.36797874900395]
In computational chemistry, crystal structure prediction is an optimization problem.
One approach to tackle this problem involves building simulators based on density functional theory (DFT) followed by running search in simulation.
We show that our approach, dubbed LCOMs (latent conservative objective models), performs comparably to the best current approaches in terms of success rate of structure prediction.
arXiv Detail & Related papers (2023-10-16T04:35:44Z)
- Data-Driven Score-Based Models for Generating Stable Structures with Adaptive Crystal Cells [1.515687944002438]
This work aims at the generation of new crystal structures with desired properties, such as chemical stability and specified chemical composition.
The novelty of the presented approach resides in the fact that the lattice of the crystal cell is not fixed.
A multigraph crystal representation is introduced that respects symmetry constraints, yielding computational advantages.
arXiv Detail & Related papers (2023-10-16T02:53:24Z)
- Crystal-GFN: sampling crystals with desirable properties and constraints [103.79058968784163]
We introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials.
In this paper, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench.
The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.
arXiv Detail & Related papers (2023-10-07T21:36:55Z)
- Crystal Structure Prediction by Joint Equivariant Diffusion [27.52168842448489]
Crystal Structure Prediction (CSP) is crucial in various scientific disciplines.
This paper proposes DiffCSP, a novel diffusion model to learn the structure distribution from stable crystals.
arXiv Detail & Related papers (2023-07-30T15:46:33Z)
- Neural networks trained on synthetically generated crystals can extract structural information from ICSD powder X-ray diffractograms [0.6906005491572401]
Machine learning techniques have successfully been used to extract structural information from powder X-ray diffractograms.
We propose an alternative approach of generating synthetic crystals with random coordinates by using the symmetry operations of each space group.
We demonstrate online training of deep ResNet-like models on up to a few million unique on-the-fly generated synthetic diffractograms per hour.
arXiv Detail & Related papers (2023-03-21T09:37:29Z)
- Tracking perovskite crystallization via deep learning-based feature detection on 2D X-ray scattering data [137.47124933818066]
We propose an automated pipeline for the analysis of X-ray diffraction images based on the Faster R-CNN deep learning architecture.
We demonstrate our method on real-time tracking of organic-inorganic perovskite structure crystallization and test it on two applications.
arXiv Detail & Related papers (2022-02-22T15:39:00Z)
- Disentangling multiple scattering with deep learning: application to strain mapping from electron diffraction patterns [48.53244254413104]
We implement a deep neural network called FCU-Net to invert highly nonlinear electron diffraction patterns into quantitative structure factor images.
We trained the FCU-Net using over 200,000 unique dynamical diffraction patterns which include many different combinations of crystal structures.
Our simulated diffraction pattern library, implementation of FCU-Net, and trained model weights are freely available in open source repositories.
arXiv Detail & Related papers (2022-02-01T03:53:39Z)