Machine Learning-Driven Crystal System Prediction for Perovskites Using Augmented X-ray Diffraction Data
- URL: http://arxiv.org/abs/2602.04435v1
- Date: Wed, 04 Feb 2026 11:09:51 GMT
- Title: Machine Learning-Driven Crystal System Prediction for Perovskites Using Augmented X-ray Diffraction Data
- Authors: Ansu Mathew, Ahmer A. B. Baloch, Alamin Yakasai, Hemant Mittal, Vivian Alberts, Jayakumar V. Karunamurthy,
- Abstract summary: Prediction of crystal system from X-ray diffraction (XRD) spectra is a critical task in materials science. In this study, we present a machine learning (ML)-driven framework that leverages advanced models. The model demonstrated high performance for symmetry classes, including cubic crystal systems, point groups 3m and m-3m, and space groups Pnma and Pnnn.
- Score: 0.10262304700896197
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Prediction of crystal system from X-ray diffraction (XRD) spectra is a critical task in materials science, particularly for perovskite materials, which are known for their diverse applications in photovoltaics, optoelectronics, and catalysis. In this study, we present a machine learning (ML)-driven framework that leverages advanced models, including Time Series Forest (TSF), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and a simple feedforward neural network (NN), to classify crystal systems, point groups, and space groups from XRD data of perovskite materials. To address class imbalance and enhance model robustness, we integrated feature augmentation strategies such as Synthetic Minority Over-sampling Technique (SMOTE), class weighting, jittering, and spectrum shifting, along with efficient data preprocessing pipelines. The TSF model with SMOTE augmentation achieved strong performance for crystal system prediction, with a Matthews correlation coefficient (MCC) of 0.9, an F1 score of 0.92, and an accuracy of 97.76%. For point and space group prediction, balanced accuracies above 95% were obtained. The model demonstrated high performance for symmetry-distinct classes, including cubic crystal systems, point groups 3m and m-3m, and space groups Pnma and Pnnn. This work highlights the potential of ML for XRD-based structural characterization and accelerated discovery of perovskite materials.
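The abstract names jittering and spectrum shifting as augmentation strategies but gives no implementation details. The following is a minimal sketch of what such augmentations could look like for a 1D XRD pattern, not the authors' code. The function names `jitter` and `shift`, the parameters `sigma` and `max_shift`, the zero padding, and the synthetic single-peak pattern are all assumptions made for illustration.

```python
import numpy as np

def jitter(spectrum, sigma=0.01, rng=None):
    """Add zero-mean Gaussian noise scaled to the pattern's peak intensity.

    `sigma` is a hypothetical noise level (fraction of the maximum intensity).
    """
    rng = np.random.default_rng() if rng is None else rng
    scale = sigma * np.max(np.abs(spectrum))
    noise = rng.normal(0.0, scale, size=spectrum.shape)
    # Diffraction intensities cannot be negative, so clip at zero.
    return np.clip(spectrum + noise, 0.0, None)

def shift(spectrum, max_shift=5, rng=None):
    """Shift the pattern by a random whole number of 2-theta bins.

    Bins that move past either end are dropped; vacated bins are zero-filled.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(spectrum)
    if k > 0:
        out[k:] = spectrum[:-k]      # move peaks toward higher angles
    elif k < 0:
        out[:k] = spectrum[-k:]      # move peaks toward lower angles
    else:
        out[:] = spectrum
    return out

# Synthetic pattern: one Gaussian peak on a 0.1-degree 2-theta grid (illustrative only).
rng = np.random.default_rng(0)
two_theta = np.linspace(10.0, 80.0, 701)
pattern = np.exp(-0.5 * ((two_theta - 31.5) / 0.2) ** 2)
augmented = shift(jitter(pattern, rng=rng), rng=rng)
```

Applying such transforms only to the training split, and more often to minority classes, is one plausible way to combine them with the class-imbalance handling the abstract mentions; the abstract itself does not say how the authors combined them.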
Related papers
- From Static Spectra to Operando Infrared Dynamics: Physics Informed Flow Modeling and a Benchmark [67.29937933325849]
Operando IR Prediction aims to forecast the time-resolved evolution of spectral "fingerprints" from a single static spectrum. OpIRSpec-7K comprises 7,118 high-quality samples across 10 distinct battery systems. ABCC significantly outperforms state-of-the-art static, sequential, and generative baselines.
arXiv Detail & Related papers (2026-02-20T18:58:43Z)
- OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction [63.318434943975255]
We introduce OXtal, a large-scale 100M parameter all-atom diffusion model that learns the conditional joint distribution over intramolecular conformations and periodic packing. By leveraging a large dataset of 600K experimentally validated crystal structures, OXtal achieves orders-of-improvement over prior ab initio machine learning CSP methods. OXtal attains over 80% packing similarity rate, demonstrating its ability to model both thermodynamic and kinetic regularities of molecular crystallization.
arXiv Detail & Related papers (2025-12-07T20:46:30Z)
- XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science [0.27185251060695437]
We propose a scalable framework that learns directly from elemental composition and X-ray diffraction (XRD). Our architecture integrates modality-specific encoders with a cross-attention fusion module and is trained on the 5-million-sample Alexandria dataset. Our results establish a path toward structure-free, experimentally grounded foundation models for materials science.
arXiv Detail & Related papers (2025-06-27T21:45:56Z)
- Latent-space Field Tension for Astrophysical Component Detection: An application to X-ray imaging [0.0]
We introduce a novel multi-frequency Bayesian model of the sky emission field that leverages latent-space tension as an indicator of model misspecification. We demonstrate the effectiveness of this method on synthetic multi-frequency imaging data and apply it to observational X-ray data from the eROSITA Early Data Release (EDR) of the SN1987A region in the Large Magellanic Cloud (LMC). Our results highlight the method's capability to reconstruct astrophysical components with high accuracy, achieving sub-pixel localization of point sources, robust separation of extended emission, and detailed uncertainty quantification.
arXiv Detail & Related papers (2025-06-25T18:45:18Z)
- Learning to Dissipate Energy in Oscillatory State-Space Models [51.98491034847041]
State-space models (SSMs) are a class of networks for sequence learning. We show that D-LinOSS consistently outperforms previous LinOSS methods on long-range learning tasks.
arXiv Detail & Related papers (2025-05-17T23:15:17Z)
- Towards Space Group Determination from EBSD Patterns: The Role of Deep Learning and High-throughput Dynamical Simulations [0.7154115167845776]
Deep learning methods may be able to classify the space group symmetries using the patterns as input. Neural networks were trained to predict the space group type of background corrected EBSD patterns. We introduce a relabeling scheme, which enables our models to achieve accuracy scores higher than 90% on simulated and experimental data.
arXiv Detail & Related papers (2025-04-30T05:36:31Z)
- OmniXAS: A Universal Deep-Learning Framework for Materials X-ray Absorption Spectra [0.6291443816903801]
X-ray absorption spectroscopy (XAS) is a powerful characterization technique for probing the local chemical environment of absorbing atoms. We present a framework that contains a suite of transfer learning approaches for XAS prediction, each contributing to improved accuracy and efficiency. Our approach boosts the throughput of XAS modeling by orders of magnitude versus first-principles simulations and is extendable to XAS prediction for a broader range of elements.
arXiv Detail & Related papers (2024-09-29T04:41:10Z)
- Machine learning enabled experimental design and parameter estimation for ultrafast spin dynamics [54.172707311728885]
We introduce a methodology that combines machine learning with Bayesian optimal experimental design (BOED).
Our method employs a neural network model for large-scale spin dynamics simulations for precise distribution and utility calculations in BOED.
Our numerical benchmarks demonstrate the superior performance of our method in guiding XPFS experiments, predicting model parameters, and yielding more informative measurements within limited experimental time.
arXiv Detail & Related papers (2023-06-03T06:19:20Z)
- Neural networks trained on synthetically generated crystals can extract structural information from ICSD powder X-ray diffractograms [0.6906005491572401]
Machine learning techniques have successfully been used to extract structural information from powder X-ray diffractograms.
We propose an alternative approach of generating synthetic crystals with random coordinates by using the symmetry operations of each space group.
We demonstrate online training of deep ResNet-like models on up to a few million unique on-the-fly generated synthetic diffractograms per hour.
arXiv Detail & Related papers (2023-03-21T09:37:29Z)
- Tracking perovskite crystallization via deep learning-based feature detection on 2D X-ray scattering data [137.47124933818066]
We propose an automated pipeline for the analysis of X-ray diffraction images based on the Faster R-CNN deep learning architecture.
We demonstrate our method on real-time tracking of organic-inorganic perovskite structure crystallization and test it on two applications.
arXiv Detail & Related papers (2022-02-22T15:39:00Z)
- Quaternion Factorization Machines: A Lightweight Solution to Intricate Feature Interaction Modelling [76.89779231460193]
The factorization machine (FM) is capable of automatically learning high-order interactions among features to make predictions without the need for manual feature engineering.
We propose the quaternion factorization machine (QFM) and quaternion neural factorization machine (QNFM) for sparse predictive analytics.
arXiv Detail & Related papers (2021-04-05T00:02:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.