CrystalICL: Enabling In-Context Learning for Crystal Generation
- URL: http://arxiv.org/abs/2508.20143v1
- Date: Wed, 27 Aug 2025 07:49:27 GMT
- Title: CrystalICL: Enabling In-Context Learning for Crystal Generation
- Authors: Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, Xin Wang
- Abstract summary: Large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities. Existing LLM-based crystal generation approaches are limited to zero-shot scenarios and are unable to benefit from few-shot scenarios. We propose CrystalICL, a novel model designed for few-shot crystal generation.
- Score: 12.641999605656409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. While large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and are unable to benefit from few-shot scenarios. In contrast, human experts typically design new materials by modifying relevant known structures, which aligns closely with the few-shot ICL paradigm. Motivated by this, we propose CrystalICL, a novel model designed for few-shot crystal generation. Specifically, we introduce a space-group-based crystal tokenization method, which effectively reduces the complexity of modeling crystal symmetry in LLMs. We further introduce a condition-structure-aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, enabling the model to better exploit ICL by capturing structure-property relationships from limited data. Extensive experiments on four crystal generation benchmarks demonstrate the superiority of CrystalICL over the leading baseline methods on conditional and unconditional generation tasks.
Related papers
- OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction [63.318434943975255]
We introduce OXtal, a large-scale 100M-parameter all-atom diffusion model that learns the conditional joint distribution over intramolecular conformations and periodic packing. By leveraging a large dataset of 600K experimentally validated crystal structures, OXtal achieves orders-of-magnitude improvements over prior ab initio machine learning CSP methods. OXtal attains a packing similarity rate of over 80%, demonstrating its ability to model both thermodynamic and kinetic regularities of molecular crystallization.
arXiv Detail & Related papers (2025-12-07T20:46:30Z)
- CrystalDiT: A Diffusion Transformer for Crystal Generation [52.45780803467369]
We present CrystalDiT, a diffusion transformer for crystal structure generation that achieves state-of-the-art performance. CrystalDiT employs a unified transformer that imposes a powerful inductive bias: treating lattice and atomic properties as a single, interdependent system.
arXiv Detail & Related papers (2025-08-13T12:53:32Z)
- Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation [82.91073155506277]
A key step is to convert 3D crystal structures into 1D sequences to be processed by language models (LMs). Mat2Seq converts 3D crystal structures into 1D sequences and ensures that different mathematical descriptions of the same crystal are represented in a single unique sequence. Experimental results show that, with language models, Mat2Seq achieves promising performance in crystal structure generation as compared with prior methods.
arXiv Detail & Related papers (2025-02-28T20:02:53Z)
- Large Language Models Are Innate Crystal Structure Generators [30.44669215588058]
We show that pre-trained Large Language Models can inherently generate stable crystal structures without additional training. Our framework MatLLMSearch integrates pre-trained LLMs with evolutionary search algorithms, achieving a 78.38% metastable rate.
arXiv Detail & Related papers (2025-02-28T10:41:16Z)
- A Generation Framework with Strict Constraints for Crystal Materials Design [8.736399863675524]
We present a new constrained generation framework that takes multiple constraints as input and enables the generation of crystal structures with specific chemical and properties. Our method generates crystal structures with a probability of meeting the target properties that is more than twice that of existing approaches.
arXiv Detail & Related papers (2024-11-13T09:36:50Z)
- Crystal-LSBO: Automated Design of De Novo Crystals with Latent Space Bayesian Optimization [11.988832749427077]
We introduce Crystal-LSBO, a de novo design framework for crystals specifically tailored to enhance explorability.
Our study pioneers the use of LSBO for de novo crystal design, demonstrating its efficacy through optimization tasks.
arXiv Detail & Related papers (2024-05-28T07:03:49Z)
- Space Group Informed Transformer for Crystalline Materials Generation [2.405914457225118]
We introduce CrystalFormer, a transformer-based autoregressive model specifically designed for space group-controlled generation of crystalline materials.
The incorporation of space group symmetry significantly simplifies the crystal space, which is crucial for data- and compute-efficient generative modeling of crystalline materials.
arXiv Detail & Related papers (2024-03-23T06:01:45Z)
- Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation that can represent any crystal structure (UniMat).
UniMat can generate high-fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z)
- Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction [62.36797874900395]
In computational chemistry, crystal structure prediction is an optimization problem.
One approach to tackle this problem involves building simulators based on density functional theory (DFT) followed by running search in simulation.
We show that our approach, dubbed LCOMs (latent conservative objective models), performs comparably to the best current approaches in terms of success rate of structure prediction.
arXiv Detail & Related papers (2023-10-16T04:35:44Z)
- Data-Driven Score-Based Models for Generating Stable Structures with Adaptive Crystal Cells [1.515687944002438]
This work aims at the generation of new crystal structures with desired properties, such as chemical stability and specified chemical composition.
The novelty of the presented approach resides in the fact that the lattice of the crystal cell is not fixed.
A multigraph crystal representation is introduced that respects symmetry constraints, yielding computational advantages.
arXiv Detail & Related papers (2023-10-16T02:53:24Z)
- Crystal-GFN: sampling crystals with desirable properties and constraints [103.79058968784163]
We introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials.
In this paper, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench.
The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.
arXiv Detail & Related papers (2023-10-07T21:36:55Z)