Challenges in Non-Polymeric Crystal Structure Prediction: Why a Geometric, Permutation-Invariant Loss is Needed
- URL: http://arxiv.org/abs/2509.00832v3
- Date: Fri, 26 Sep 2025 11:33:29 GMT
- Title: Challenges in Non-Polymeric Crystal Structure Prediction: Why a Geometric, Permutation-Invariant Loss is Needed
- Authors: Emmanuel Jehanno, Romain Menegaux, Julien Mairal, Sergei Grudinin
- Abstract summary: We focus on the molecular assembly problem, where a set $\mathcal{S}$ of identical rigid molecules is packed to form a crystalline structure. We propose a better formulation that introduces a loss function capturing key geometric properties. Remarkably, we demonstrate that within this framework, a simple regression model already outperforms prior approaches.
- Score: 16.986704284824363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crystalline structure prediction is an essential prerequisite for designing materials with targeted properties. Yet, it is still an open challenge in materials design and drug discovery. Despite recent advances in computational materials science, accurately predicting three-dimensional non-polymeric crystal structures remains elusive. In this work, we focus on the molecular assembly problem, where a set $\mathcal{S}$ of identical rigid molecules is packed to form a crystalline structure. Such a simplified formulation provides a useful approximation to the actual problem. However, while recent state-of-the-art methods have increasingly adopted sophisticated techniques, the underlying learning objective remains ill-posed. We propose a better formulation that introduces a loss function capturing key geometric molecular properties while ensuring permutation invariance over $\mathcal{S}$. Remarkably, we demonstrate that within this framework, a simple regression model already outperforms prior approaches, including flow matching techniques, on the COD-Cluster17 benchmark, a curated non-polymeric subset of the Crystallography Open Database (COD).
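The abstract does not spell out the proposed loss, so the following is only a generic illustration of what permutation invariance over a set of identical molecules means, not the authors' formulation. It assumes each molecule is reduced to a single 3D centroid (ignoring orientation and the other geometric properties the paper refers to), and the function names are hypothetical. The loss takes the minimum over all matchings between predicted and target molecules, so relabeling the molecules does not change its value.

```python
from itertools import permutations


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def permutation_invariant_loss(predicted, target):
    """Mean squared centroid distance, minimized over all matchings.

    Brute force over n! permutations: only suitable for small sets.
    Assumption: each molecule is summarized by one centroid.
    """
    if len(predicted) != len(target):
        raise ValueError("both sets must contain the same number of molecules")
    n = len(target)
    if n == 0:
        return 0.0
    best = float("inf")
    for perm in permutations(range(n)):
        cost = sum(squared_distance(predicted[i], target[j])
                   for i, j in enumerate(perm))
        best = min(best, cost / n)
    return best


# Same three centroids, listed in a different order.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shuffled = [target[2], target[0], target[1]]

# An index-wise loss penalizes the reordering; the invariant loss does not.
naive = sum(squared_distance(p, t) for p, t in zip(shuffled, target)) / len(target)
invariant = permutation_invariant_loss(shuffled, target)
```

For larger sets, the exhaustive search would normally be replaced by an assignment solver such as the Hungarian algorithm, which finds the same minimum in polynomial time.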
Related papers
- Universal Fine-Grained Symmetry Inference and Enforcement for Rigorous Crystal Structure Prediction [3.5802790319269717]
Crystal structure prediction (CSP) aims to predict the three-dimensional atomic arrangement of a crystal from its composition. Existing deep learning models often treat crystallographic symmetry only as a soft constraint or rely on space group and Wyckoff templates retrieved from known structures. In contrast, our approach leverages large language models to encode chemical semantics and directly generate fine-grained Wyckoff patterns from composition.
arXiv Detail & Related papers (2026-02-19T08:43:25Z) - DMFlow: Disordered Materials Generation by Flow Matching [24.578652236376385]
DMFlow is a generative framework specifically designed for disordered crystals. It employs a flow matching model to jointly generate all structural components. We release a benchmark containing SD, PD, and mixed structures curated from the Crystallography Open Database.
arXiv Detail & Related papers (2026-02-04T16:36:58Z) - OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction [63.318434943975255]
We introduce OXtal, a large-scale 100M-parameter all-atom diffusion model that learns the conditional joint distribution over intramolecular conformations and periodic packing. By leveraging a large dataset of 600K experimentally validated crystal structures, OXtal achieves orders-of-magnitude improvement over prior ab initio machine learning CSP methods. OXtal attains over 80% packing similarity rate, demonstrating its ability to model both thermodynamic and kinetic regularities of molecular crystallization.
arXiv Detail & Related papers (2025-12-07T20:46:30Z) - Machine Learning Workflow for Analysis of High-Dimensional Order Parameter Space: A Case Study of Polymer Crystallization from Molecular Dynamics Simulations [0.0]
Identification of crystallization pathways in polymers is currently carried out using molecular simulation-based data. In this study, an integrated machine learning workflow is presented to accurately quantify crystallinity.
arXiv Detail & Related papers (2025-07-23T23:02:10Z) - Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation [82.91073155506277]
A key step is to convert 3D crystal structures into 1D sequences to be processed by language models (LMs). Mat2Seq converts 3D crystal structures into 1D sequences and ensures that different mathematical descriptions of the same crystal are represented in a single unique sequence. Experimental results show that, with language models, Mat2Seq achieves promising performance in crystal structure generation as compared with prior methods.
arXiv Detail & Related papers (2025-02-28T20:02:53Z) - Unleashing the power of novel conditional generative approaches for new materials discovery [3.972733741872872]
We propose two generative approaches to the problem of crystal structure design.
One is conditional structure modification, using the energy difference between the most energetically favorable structure and all its less stable polymorphs.
The other is conditional structure generation, using the energy difference between the most energetically favorable structure and all its less stable polymorphs.
arXiv Detail & Related papers (2024-11-05T14:58:31Z) - GraphXForm: Graph transformer for computer-aided molecular design [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds. We evaluate it on various drug design tasks, demonstrating superior objective scores compared to state-of-the-art molecular design approaches.
arXiv Detail & Related papers (2024-11-03T19:45:15Z) - Complete and Efficient Graph Transformers for Crystal Material Property Prediction [53.32754046881189]
Crystal structures are characterized by atomic bases within a primitive unit cell that repeats along a regular lattice throughout 3D space.
We introduce a novel approach that utilizes the periodic patterns of unit cells to establish the lattice-based representation for each atom.
We propose ComFormer, a SE(3) transformer designed specifically for crystalline materials.
arXiv Detail & Related papers (2024-03-18T15:06:37Z) - Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding [10.170537065646323]
Predicting physical properties of materials from their crystal structures is a fundamental problem in materials science.
We show that crystal structures are infinitely repeating, periodic arrangements of atoms, whose fully connected attention results in infinitely connected attention.
We propose a simple yet effective Transformer-based encoder architecture for crystal structures called Crystalformer.
arXiv Detail & Related papers (2024-03-18T11:37:42Z) - Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation (UniMat) that can represent any crystal structure.
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z) - Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction [62.36797874900395]
In computational chemistry, crystal structure prediction is an optimization problem.
One approach to tackle this problem involves building simulators based on density functional theory (DFT) followed by running search in simulation.
We show that our approach, dubbed LCOMs (latent conservative objective models), performs comparably to the best current approaches in terms of success rate of structure prediction.
arXiv Detail & Related papers (2023-10-16T04:35:44Z) - Data-Driven Score-Based Models for Generating Stable Structures with Adaptive Crystal Cells [1.515687944002438]
This work aims at the generation of new crystal structures with desired properties, such as chemical stability and specified chemical composition.
The novelty of the presented approach resides in the fact that the lattice of the crystal cell is not fixed.
A multigraph crystal representation is introduced that respects symmetry constraints, yielding computational advantages.
arXiv Detail & Related papers (2023-10-16T02:53:24Z) - Geometric Transformer for End-to-End Molecule Properties Prediction [92.28929858529679]
We introduce a Transformer-based architecture for molecule property prediction, which is able to capture the geometry of the molecule.
We modify the classical positional encoder by an initial encoding of the molecule geometry, as well as a learned gated self-attention mechanism.
arXiv Detail & Related papers (2021-10-26T14:14:40Z) - A deep learning driven pseudospectral PCE based FFT homogenization algorithm for complex microstructures [68.8204255655161]
It is shown that the proposed method is able to predict central moments of interest while being magnitudes faster to evaluate than traditional approaches.
arXiv Detail & Related papers (2021-10-26T07:02:14Z) - SPANet: Generalized Permutationless Set Assignment for Particle Physics using Symmetry Preserving Attention [62.43586180025247]
Collisions at the Large Hadron Collider produce variable-size sets of observed particles.
Physical symmetries of decay products complicate assignment of observed particles to decay products.
We introduce a novel method for constructing symmetry-preserving attention networks.
arXiv Detail & Related papers (2021-06-07T18:18:20Z) - Crystal structure prediction of materials with high symmetry using differential evolution [0.5249805590164902]
We propose a contact map-based crystal structure prediction method, which uses genetic algorithms to maximize the match between the contact map of the predicted structure and the contact map of the real crystal structure.
When predicting crystal structures with high symmetry, we find that the global optimization algorithm has difficulty finding an effective combination of Wyckoff positions (WPs) that satisfies the chemical formula.
Our proposed algorithm CMCrystalHS can effectively solve the problem of inconsistent contact map dimensions and predict the crystal structures with high symmetry.
arXiv Detail & Related papers (2021-04-20T05:10:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.