AlloyBERT: Alloy Property Prediction with Large Language Models
- URL: http://arxiv.org/abs/2403.19783v1
- Date: Thu, 28 Mar 2024 19:09:46 GMT
- Title: AlloyBERT: Alloy Property Prediction with Large Language Models
- Authors: Akshat Chaudhari, Chakradhar Guntuboina, Hongshuo Huang, Amir Barati Farimani
- Abstract summary: This study introduces AlloyBERT, a transformer encoder-based model designed to predict alloy properties using textual inputs.
By combining a tokenizer trained on our textual data and a RoBERTa encoder pre-trained and fine-tuned for this specific task, we achieved a mean squared error (MSE) of 0.00015 on the Multi Principal Elemental Alloys (MPEA) dataset and 0.00611 on the Refractory Alloy Yield Strength (RAYS) dataset.
Our results highlight the potential of language models in materials science and establish a foundational framework for text-based prediction of alloy properties.
- Score: 5.812284760539713
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The pursuit of novel alloys tailored to specific requirements poses significant challenges for researchers in the field. This underscores the importance of developing predictive techniques for essential physical properties of alloys based on their chemical composition and processing parameters. This study introduces AlloyBERT, a transformer encoder-based model designed to predict properties such as elastic modulus and yield strength of alloys using textual inputs. Leveraging the pre-trained RoBERTa encoder model as its foundation, AlloyBERT employs self-attention mechanisms to establish meaningful relationships between words, enabling it to interpret human-readable input and predict target alloy properties. By combining a tokenizer trained on our textual data and a RoBERTa encoder pre-trained and fine-tuned for this specific task, we achieved a mean squared error (MSE) of 0.00015 on the Multi Principal Elemental Alloys (MPEA) dataset and 0.00611 on the Refractory Alloy Yield Strength (RAYS) dataset. This surpasses the performance of shallow models, which achieved best-case MSEs of 0.00025 and 0.0076 on the MPEA and RAYS datasets, respectively. Our results highlight the potential of language models in materials science and establish a foundational framework for text-based prediction of alloy properties that does not rely on complex underlying representations, calculations, or simulations.
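To put the reported errors in perspective, the short Python sketch below recomputes the relative MSE reduction of AlloyBERT over the best shallow model on each dataset. The four MSE values are copied from the abstract; the helper names `mean_squared_error` and `relative_reduction` are illustrative and are not taken from the paper or its code. The abstract does not state the scale of the prediction targets, so only the ratio between models is interpreted here, not the absolute error.

```python
# Reported MSE values, copied from the abstract. Everything else in this
# block (names, structure) is an illustrative assumption, not the authors' code.
REPORTED_MSE = {
    "MPEA": {"alloybert": 0.00015, "best_shallow": 0.00025},
    "RAYS": {"alloybert": 0.00611, "best_shallow": 0.0076},
}


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_mse, model_mse):
    """Fraction by which model_mse is lower than baseline_mse."""
    return (baseline_mse - model_mse) / baseline_mse


for dataset, scores in REPORTED_MSE.items():
    drop = relative_reduction(scores["best_shallow"], scores["alloybert"])
    print(f"{dataset}: {drop:.1%} lower MSE than the best shallow model")
```

On the reported numbers this works out to a reduction of about 40% on MPEA and about 20% on RAYS, so the gain over shallow models is noticeably larger on the first dataset.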
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z) - Accelerating the discovery of low-energy structure configurations: a computational approach that integrates first-principles calculations, Monte Carlo sampling, and Machine Learning [8.695927973994577]
We develop a physics-based data-driven approach that combines Monte Carlo sampling, first-principles DFT calculations, and Machine Learning.
We demonstrate the capabilities of the proposed approach for the particular case of a tungsten-based quaternary high-entropy alloy.
arXiv Detail & Related papers (2024-10-08T01:34:42Z) - A Large Encoder-Decoder Family of Foundation Models For Chemical Language [1.1073864511426255]
This paper introduces a family of large encoder-decoder chemical foundation models pre-trained on a curated dataset of 91 million SMILES samples sourced from PubChem.
Our experiments across multiple benchmark datasets validate the capacity of the proposed model in providing state-of-the-art results for different tasks.
arXiv Detail & Related papers (2024-07-24T20:30:39Z) - Decomposing and Editing Predictions by Modeling Model Computation [75.37535202884463]
We introduce a task called component modeling.
The goal of component modeling is to decompose an ML model's prediction in terms of its components.
We present COAR, a scalable algorithm for estimating component attributions.
arXiv Detail & Related papers (2024-04-17T16:28:08Z) - Fine-Tuned Language Models Generate Stable Inorganic Materials as Text [57.01994216693825]
Fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable.
We show that our strongest model can generate materials predicted to be metastable at about twice the rate of CDVAE.
Because of text prompting's inherent flexibility, our models can simultaneously be used for unconditional generation of stable material.
arXiv Detail & Related papers (2024-02-06T20:35:28Z) - Materials Informatics Transformer: A Language Model for Interpretable Materials Properties Prediction [6.349503549199403]
We introduce our model Materials Informatics Transformer (MatInFormer) for material property prediction.
Specifically, we introduce a novel approach that involves learning the grammar of crystallography through the tokenization of pertinent space group information.
arXiv Detail & Related papers (2023-08-30T18:34:55Z) - Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z) - Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z) - Prediction of liquid fuel properties using machine learning models with Gaussian processes and probabilistic conditional generative learning [56.67751936864119]
The present work aims to construct cheap-to-compute machine learning (ML) models to act as closure equations for predicting the physical properties of alternative fuels.
Those models can be trained using the database from MD simulations and/or experimental measurements in a data-fusion-fidelity approach.
The results show that ML models can accurately predict fuel properties over a wide range of pressure and temperature conditions.
arXiv Detail & Related papers (2021-10-18T14:43:50Z) - Machine Learning and Data Analytics for Design and Manufacturing of High-Entropy Materials Exhibiting Mechanical or Fatigue Properties of Interest [0.24466725954625884]
The main focus is on alloys and composites with large composition spaces for structural materials.
For each output property of interest, the corresponding driving (input) factors are identified.
The framework assumes the selection of an optimization technique suitable for the application at hand and the data available.
arXiv Detail & Related papers (2020-12-05T19:32:39Z) - Machine learning with persistent homology and chemical word embeddings improves prediction accuracy and interpretability in metal-organic frameworks [0.07874708385247352]
We introduce an end-to-end machine learning model that automatically generates descriptors that capture a complex representation of a material's structure and chemistry.
It automatically encapsulates geometric and chemical information directly from the material system.
Our results show considerable improvement in both accuracy and transferability across targets compared to models constructed from the commonly-used, manually-curated features.
arXiv Detail & Related papers (2020-10-01T16:31:46Z)