Related papers: Materials Informatics Transformer: A Language Model for Interpretable Materials Properties Prediction

Materials Informatics Transformer: A Language Model for Interpretable Materials Properties Prediction

URL: http://arxiv.org/abs/2308.16259v2
Date: Fri, 1 Sep 2023 12:40:29 GMT
Title: Materials Informatics Transformer: A Language Model for Interpretable Materials Properties Prediction
Authors: Hongshuo Huang, Rishikesh Magar, Changwen Xu and Amir Barati Farimani
Abstract summary: We introduce our model Materials Informatics Transformer (MatInFormer) for material property prediction. Specifically, we introduce a novel approach that involves learning the grammar of crystallography through the tokenization of pertinent space group information.
Score: 6.349503549199403
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, the remarkable capabilities of large language models (LLMs) have been illustrated across a variety of research domains such as natural language processing, computer vision, and molecular modeling. We extend this paradigm by utilizing LLMs for material property prediction by introducing our model Materials Informatics Transformer (MatInFormer). Specifically, we introduce a novel approach that involves learning the grammar of crystallography through the tokenization of pertinent space group information. We further illustrate the adaptability of MatInFormer by incorporating task-specific data pertaining to Metal-Organic Frameworks (MOFs). Through attention visualization, we uncover the key features that the model prioritizes during property prediction. The effectiveness of our proposed model is empirically validated across 14 distinct datasets, hereby underscoring its potential for high throughput screening through accurate material property prediction.

Related papers

Supervised Pretraining for Material Property Prediction [0.36868085124383626]
Self-supervised learning (SSL) offers a promising alternative by pretraining on large, unlabeled datasets to develop foundation models. In this work, we propose supervised pretraining, where available class information serves as surrogate labels to guide learning. To further enhance representation learning, we propose a graph-based augmentation technique that injects noise to improve robustness without structurally deforming material graphs.
arXiv Detail & Related papers (2025-04-27T19:00:41Z)
Foundational Large Language Models for Materials Research [22.77591279242839]
Large Language Models (LLMs) offer opportunities to accelerate materials research through automated analysis and prediction. Here, we present LLaMat, a family of foundational models for materials science developed through continued pretraining of LLaMA models. We demonstrate that LLaMat excels in materials-specific NLP and structured information extraction while maintaining general linguistic capabilities.
arXiv Detail & Related papers (2024-12-12T18:46:38Z)
Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters. Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
From Tokens to Materials: Leveraging Language Models for Scientific Discovery [12.211984932142537]
This study investigates the application of language model embeddings to enhance material property prediction in materials science. We demonstrate that domain-specific models, particularly MatBERT, significantly outperform general-purpose models in extracting implicit knowledge from compound names and material properties.
arXiv Detail & Related papers (2024-10-21T16:31:23Z)
Boosting the Capabilities of Compact Models in Low-Data Contexts with Large Language Models and Retrieval-Augmented Generation [2.9921619703037274]
We propose a retrieval augmented generation (RAG) framework backed by a large language model (LLM) to correct the output of a smaller model for the linguistic task of morphological glossing. We leverage linguistic information to make up for the lack of data and trainable parameters, while allowing for inputs from written descriptive grammars interpreted and distilled through an LLM. We show that a compact, RAG-supported model is highly effective in data-scarce settings, achieving a new state-of-the-art for this task and our target languages.
arXiv Detail & Related papers (2024-10-01T04:20:14Z)
Cross-Modal Learning for Chemistry Property Prediction: Large Language Models Meet Graph Machine Learning [0.0]
We introduce a Multi-Modal Fusion (MMF) framework that harnesses the analytical prowess of Graph Neural Networks (GNNs) and the linguistic generative and predictive abilities of Large Language Models (LLMs) Our framework combines the effectiveness of GNNs in modeling graph-structured data with the zero-shot and few-shot learning capabilities of LLMs, enabling improved predictions while reducing the risk of overfitting.
arXiv Detail & Related papers (2024-08-27T11:10:39Z)
Multi-modal Auto-regressive Modeling via Visual Words [96.25078866446053]
We propose the concept of visual tokens, which maps the visual features to probability distributions over Large Multi-modal Models' vocabulary. We further explore the distribution of visual features in the semantic space within LMM and the possibility of using text embeddings to represent visual information.
arXiv Detail & Related papers (2024-03-12T14:58:52Z)
Fine-Tuned Language Models Generate Stable Inorganic Materials as Text [57.01994216693825]
Fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable. We show that our strongest model can generate materials predicted to be metastable at about twice the rate of CDVAE. Because of text prompting's inherent flexibility, our models can simultaneously be used for unconditional generation of stable material.
arXiv Detail & Related papers (2024-02-06T20:35:28Z)
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models. A meta-model can learn on self-supervised prompts consisting of tailored demonstrations. Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
Investigating Masking-based Data Generation in Language Models [0.0]
A feature of BERT and models with similar architecture is the objective of masked language modeling. Data augmentation is a data-driven technique widely used in machine learning. Recent studies have utilized masked language model to generate artificially augmented data for NLP downstream tasks.
arXiv Detail & Related papers (2023-06-16T16:48:27Z)
Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost. Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
Entity Aware Modelling: A Survey [22.32009539611539]
Recent machine learning advances have led to new state-of-the-art response prediction models. Models built at a population level often lead to sub-optimal performance in many personalized prediction settings. In personalized prediction, the goal is to incorporate inherent characteristics of different entities to improve prediction performance.
arXiv Detail & Related papers (2023-02-16T16:33:33Z)
Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is more interesting to understand the properties of the model, which parts could be replaced to obtain better results. We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples. We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models. We also observe span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.