Fine-Tuned Language Models Generate Stable Inorganic Materials as Text
- URL: http://arxiv.org/abs/2402.04379v1
- Date: Tue, 6 Feb 2024 20:35:28 GMT
- Title: Fine-Tuned Language Models Generate Stable Inorganic Materials as Text
- Authors: Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C.
Lawrence Zitnick, Zachary Ulissi
- Abstract summary: Fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable.
We show that our strongest model can generate materials predicted to be metastable at about twice the rate of CDVAE.
Because of text prompting's inherent flexibility, our models can simultaneously be used for unconditional generation of stable materials, infilling of partial structures, and text-conditional generation.
- Score: 57.01994216693825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose fine-tuning large language models for generation of stable
materials. While unorthodox, fine-tuning large language models on text-encoded
atomistic data is simple to implement yet reliable, with around 90% of sampled
structures obeying physical constraints on atom positions and charges. Using
energy above hull calculations from both learned ML potentials and
gold-standard DFT calculations, we show that our strongest model (fine-tuned
LLaMA-2 70B) can generate materials predicted to be metastable at about twice
the rate (49% vs 28%) of CDVAE, a competing diffusion model. Because of text
prompting's inherent flexibility, our models can simultaneously be used for
unconditional generation of stable materials, infilling of partial structures,
and text-conditional generation. Finally, we show that language models' ability
to capture key symmetries of crystal structures improves with model scale,
suggesting that the biases of pretrained LLMs are surprisingly well-suited for
atomistic data.
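As a concrete illustration of the text encoding the abstract refers to, below is a minimal sketch of serializing a crystal (lattice parameters, element symbols, fractional coordinates) into a string that could serve as a fine-tuning example. The paper's exact format, precision, and validity checks may differ; `encode_structure` and `is_physically_plausible` are hypothetical helper names.

```python
# Minimal sketch of encoding a crystal structure as text for LLM fine-tuning.
# The exact string format, precision, and tokenization in the paper may differ;
# this only illustrates the general "structure -> training text" idea.

def encode_structure(lengths, angles, species, frac_coords, ndigits=2):
    """Serialize lattice parameters, element symbols, and fractional
    coordinates into a single newline-separated string."""
    lines = []
    lines.append(" ".join(f"{x:.1f}" for x in lengths))   # a b c in Angstrom
    lines.append(" ".join(f"{x:.0f}" for x in angles))    # alpha beta gamma in degrees
    for sym, (x, y, z) in zip(species, frac_coords):
        lines.append(sym)
        lines.append(f"{x:.{ndigits}f} {y:.{ndigits}f} {z:.{ndigits}f}")
    return "\n".join(lines)

def is_physically_plausible(frac_coords):
    """Cheap sanity check used here as a stand-in for the paper's validity
    checks on atom positions: coordinates must be fractional."""
    return all(0.0 <= c < 1.0 for site in frac_coords for c in site)

if __name__ == "__main__":
    # Rough NaCl rock-salt conventional cell as a toy example.
    text = encode_structure(
        lengths=(5.6, 5.6, 5.6),
        angles=(90, 90, 90),
        species=["Na", "Na", "Na", "Na", "Cl", "Cl", "Cl", "Cl"],
        frac_coords=[(0, 0, 0), (0, 0.5, 0.5), (0.5, 0, 0.5), (0.5, 0.5, 0),
                     (0.5, 0.5, 0.5), (0.5, 0, 0), (0, 0.5, 0), (0, 0, 0.5)],
    )
    print(text)  # this string would become the body of a fine-tuning example
```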
Related papers
- Scalable Language Models with Posterior Inference of Latent Thought Vectors [52.63299874322121]
Latent-Thought Language Models (LTMs) incorporate explicit latent thought vectors that follow an explicit prior model in latent space.
LTMs possess additional scaling dimensions beyond traditional LLMs, yielding a structured design space.
LTMs significantly outperform conventional autoregressive models and discrete diffusion models in validation perplexity and zero-shot language modeling.
arXiv Detail & Related papers (2025-02-03T17:50:34Z)
- Materials Learning Algorithms (MALA): Scalable Machine Learning for Electronic Structure Calculations in Large-Scale Atomistic Simulations [2.04071520659173]
We present the Materials Learning Algorithms (MALA) package, a scalable machine learning framework suitable for large-scale atomistic simulations.
MALA models efficiently predict key electronic observables, including local density of states, electronic density, density of states, and total energy.
We demonstrate MALA's capabilities with examples including boron clusters, aluminum across its solid-liquid phase boundary, and predicting the electronic structure of a stacking fault in a large beryllium slab.
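As a rough illustration of the surrogate-modeling idea (not MALA's actual API), here is a sketch of predicting a local density of states (LDOS) from per-grid-point descriptors, summing it into a total DOS, and integrating for a band-energy-like quantity; the model, descriptors, and units are placeholders.

```python
# Generic sketch of an LDOS surrogate workflow: descriptors -> LDOS -> DOS -> energy.
# The "model" is a random placeholder map, not MALA's trained networks.
import numpy as np

rng = np.random.default_rng(0)
n_grid, n_desc, n_energy = 64, 10, 200
energies = np.linspace(-10.0, 10.0, n_energy)              # placeholder energy grid (eV)
de = energies[1] - energies[0]

descriptors = rng.normal(size=(n_grid, n_desc))            # stand-in local descriptors
weights = rng.normal(scale=0.1, size=(n_desc, n_energy))   # stand-in "trained" linear map
ldos = np.maximum(descriptors @ weights, 0.0)              # LDOS must be non-negative

dos = ldos.sum(axis=0)                                     # total DOS from local pieces
occupation = 1.0 / (1.0 + np.exp(energies / 0.5))          # broad placeholder Fermi smearing
band_energy = np.sum(dos * occupation * energies) * de     # simple Riemann integration
n_electrons = np.sum(dos * occupation) * de
print(f"band energy ~ {band_energy:.3f}, electrons ~ {n_electrons:.3f}")
```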
arXiv Detail & Related papers (2024-11-29T11:10:29Z)
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
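For context on what "combining models without additional training" can mean in practice, below is a minimal sketch of one common baseline, weighted parameter averaging of two checkpoints with the same architecture; the paper's exact merging recipe on Llama-2-7B may differ.

```python
# Minimal sketch of merging two fine-tuned checkpoints by weighted parameter
# averaging, a common model-merging baseline (not necessarily the paper's method).
import torch

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Return a new state dict: alpha * sd_a + (1 - alpha) * sd_b.
    Both models must share the same architecture and parameter names."""
    merged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b[name]
        if tensor_a.dtype.is_floating_point:
            merged[name] = alpha * tensor_a + (1.0 - alpha) * tensor_b
        else:
            merged[name] = tensor_a.clone()  # e.g. integer buffers: keep as-is
    return merged

if __name__ == "__main__":
    # Tiny stand-in models; in practice these would be LLM checkpoints,
    # e.g. a language-adapted model and a task-tuned model.
    model_a, model_b = torch.nn.Linear(4, 4), torch.nn.Linear(4, 4)
    merged_sd = merge_state_dicts(model_a.state_dict(), model_b.state_dict())
    merged_model = torch.nn.Linear(4, 4)
    merged_model.load_state_dict(merged_sd)
```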
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- Large language models, physics-based modeling, experimental measurements: the trinity of data-scarce learning of polymer properties [10.955525128731654]
Large language models (LLMs) bear promise as a fast and accurate material modeling paradigm for evaluation, analysis, and design.
We present a physics-based training pipeline that tackles the pathology of data scarcity.
arXiv Detail & Related papers (2024-07-03T02:57:40Z)
- Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation, UniMat, that can represent any crystal structure.
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z)
- FiLM: Fill-in Language Models for Any-Order Generation [71.42044325886194]
Fill-in Language Model (FiLM) is a new language modeling approach that allows for flexible generation at any position without adhering to a specific generation order.
During inference, FiLM can seamlessly insert missing phrases, sentences, or paragraphs.
FiLM outperforms existing infilling methods that rely on left-to-right language models trained on rearranged text segments.
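To illustrate any-order infilling in general (not FiLM's specific training or decoding), here is a sketch that repeatedly fills whichever mask an off-the-shelf masked LM is most confident about; `roberta-base` is used only as a convenient public checkpoint.

```python
# Generic illustration of any-order infilling with an off-the-shelf masked LM:
# repeatedly fill the mask the model is most confident about. This is only a
# sketch of the infilling idea; FiLM's own training and decoding differ.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "roberta-base"  # small public masked LM, used here just for illustration
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name).eval()

text = f"The crystal structure was {tok.mask_token} using a {tok.mask_token} model."
ids = tok(text, return_tensors="pt")["input_ids"]

with torch.no_grad():
    while (ids == tok.mask_token_id).any():
        logits = model(ids).logits[0]                        # (seq_len, vocab)
        mask_pos = (ids[0] == tok.mask_token_id).nonzero().squeeze(-1)
        probs = logits[mask_pos].softmax(-1)                 # (n_masks, vocab)
        best = probs.max(-1)                                 # values, indices per mask
        pick = best.values.argmax()                          # most confident mask
        ids[0, mask_pos[pick]] = best.indices[pick]

print(tok.decode(ids[0], skip_special_tokens=True))
```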
arXiv Detail & Related papers (2023-10-15T19:37:39Z)
- Materials Transformers Language Models for Generative Materials Design: a benchmark study [4.047301375093173]
We train seven modern transformer language models (GPT, GPT-2, GPT-Neo, GPT-J, BLMM, BART, and RoBERTa) using expanded formulas of materials deposited in the ICSD, OQMD, and Materials Project databases.
Six different datasets, with or without non-charge-neutral or electronegativity-unbalanced samples, are used to benchmark performance.
Experiments showed that the causal-language-model-based materials transformers can generate chemically valid material compositions, with up to 97.54% being charge neutral and 91.40% being electronegativity balanced.
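The charge-neutrality metric can be read as asking whether some assignment of common oxidation states sums to zero. Below is a minimal sketch with a tiny hand-picked oxidation-state table; real evaluations typically use a curated table, and `is_charge_neutral` is a hypothetical helper.

```python
# Minimal sketch of a charge-neutrality check for a generated composition:
# does any assignment of common oxidation states sum to zero? The oxidation
# states below are a tiny hand-picked table for illustration only.
from itertools import product

COMMON_OXIDATION_STATES = {
    "Li": [1], "Na": [1], "K": [1],
    "Mg": [2], "Ca": [2],
    "Fe": [2, 3], "Cu": [1, 2], "Ti": [2, 3, 4],
    "O": [-2], "S": [-2], "F": [-1], "Cl": [-1],
}

def is_charge_neutral(composition):
    """composition: dict like {"Fe": 2, "O": 3}. Returns True if some
    combination of listed oxidation states gives a net charge of zero."""
    elements = list(composition)
    choices = [COMMON_OXIDATION_STATES[el] for el in elements]
    for states in product(*choices):
        net = sum(q * composition[el] for q, el in zip(states, elements))
        if net == 0:
            return True
    return False

print(is_charge_neutral({"Fe": 2, "O": 3}))   # True: Fe(3+)2 O(2-)3
print(is_charge_neutral({"Na": 1, "Cl": 2}))  # False
```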
arXiv Detail & Related papers (2022-06-27T18:50:05Z)
- Crystal Transformer: Self-learning neural language model for Generative and Tinkering Design of Materials [4.813020904720316]
BLMM Crystal Transformer is a neural-network-based probabilistic generative model for generative and tinkering design of inorganic materials.
It can generate chemically valid material compositions with up to 89.7% charge neutrality and 84.8% balanced electronegativity.
A user-friendly web app has been developed for computational materials doping and can be accessed freely at www.materialsatlas.org/blmtinker.
arXiv Detail & Related papers (2022-04-25T20:20:26Z)
- Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models [61.768082640087]
We explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders for natural language understanding tasks.
Experiments show that EBM training can help the model reach better calibration that is competitive with strong baselines.
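The underlying idea is to reuse the classifier's logits as an energy function, E(x) = -logsumexp_y f_y(x), and add a term that lowers the energy of real inputs relative to noise. The sketch below uses a simplified logistic term rather than the paper's exact noise-contrastive objective, and all names are placeholders.

```python
# Sketch of the "energy from classifier logits" idea behind joint EBM training:
# treat -logsumexp(logits) as an energy and push real inputs to lower energy
# than noise inputs, alongside the usual cross-entropy. Simplified stand-in,
# not the paper's exact noise-contrastive objective.
import torch
import torch.nn.functional as F

def joint_loss(logits_real, labels, logits_noise, lam=0.1):
    ce = F.cross_entropy(logits_real, labels)            # standard discriminative term
    e_real = -torch.logsumexp(logits_real, dim=-1)       # E(x) = -logsumexp_y f_y(x)
    e_noise = -torch.logsumexp(logits_noise, dim=-1)
    ebm = F.softplus(e_real - e_noise).mean()            # prefer low energy on real data
    return ce + lam * ebm

# Toy usage with a random "encoder": 8 real and 8 noise examples, 3 classes.
encoder = torch.nn.Linear(16, 3)
x_real, x_noise = torch.randn(8, 16), torch.randn(8, 16)
labels = torch.randint(0, 3, (8,))
loss = joint_loss(encoder(x_real), labels, encoder(x_noise))
loss.backward()
```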
arXiv Detail & Related papers (2021-01-18T01:41:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.