Leveraging Language Representation for Material Recommendation, Ranking,
and Exploration
- URL: http://arxiv.org/abs/2305.01101v2
- Date: Sat, 20 May 2023 03:35:30 GMT
- Title: Leveraging Language Representation for Material Recommendation, Ranking,
and Exploration
- Authors: Jiaxing Qu, Yuxuan Richard Xie, Kamil M. Ciesielski, Claire E. Porter,
Eric S. Toberer, Elif Ertekin
- Abstract summary: We introduce a material discovery framework that uses natural language embeddings derived from language models as representations of compositional and structural features.
By applying the framework to thermoelectrics, we demonstrate diversified recommendations of prototype structures and identify under-studied high-performance material spaces.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-driven approaches for material discovery and design have been
accelerated by emerging efforts in machine learning. However, general
representations of crystals to explore the vast material search space remain
limited. We introduce a material discovery framework that uses natural language
embeddings derived from language models as representations of compositional and
structural features. The discovery framework consists of a joint scheme that
first recalls relevant candidates, and next ranks the candidates based on
multiple target properties. The contextual knowledge encoded in language
representations conveys information about material properties and structures,
enabling both representational similarity analysis for recall, and multi-task
learning to share information across related properties. By applying the
framework to thermoelectrics, we demonstrate diversified recommendations of
prototype structures and identify under-studied high-performance material
spaces. The recommended materials are corroborated by first-principles
calculations and experiments, revealing novel materials with potential high
performance. Our framework provides a task-agnostic means for effective
material recommendation and can be applied to various material systems.
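The recall-then-rank scheme described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the material names, the three-dimensional embeddings, the property scores, and the weighted-sum ranking rule are all hypothetical stand-ins chosen for illustration. In the paper the embeddings come from a language model and the property values from multi-task learning. The sketch only shows the control flow: recall candidates by cosine similarity to a query embedding, then order the shortlist by several target properties.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall(query_vec, embeddings, k):
    """Recall step: return the k material names closest to the query embedding."""
    by_similarity = sorted(
        embeddings,
        key=lambda name: cosine(query_vec, embeddings[name]),
        reverse=True,
    )
    return by_similarity[:k]


def rank(candidates, predicted, weights):
    """Rank step: order candidates by a weighted sum of predicted properties.

    The weighted sum is an assumption made for this sketch, not the paper's
    ranking rule.
    """
    def score(name):
        return sum(weights[prop] * predicted[name][prop] for prop in weights)

    return sorted(candidates, key=score, reverse=True)


# Hypothetical toy embeddings (real ones would come from a language model).
embeddings = {
    "material_A": [1.0, 0.0, 0.0],
    "material_B": [0.9, 0.1, 0.0],
    "material_C": [0.0, 1.0, 0.0],
    "material_D": [0.8, 0.0, 0.6],
}

# Hypothetical normalized property predictions (higher is better).
predicted = {
    "material_A": {"power_factor": 0.2, "thermal_resistance": 0.9},
    "material_B": {"power_factor": 0.8, "thermal_resistance": 0.6},
    "material_C": {"power_factor": 0.9, "thermal_resistance": 0.9},
    "material_D": {"power_factor": 0.5, "thermal_resistance": 0.5},
}

query = [1.0, 0.0, 0.0]  # embedding of a hypothetical query material
shortlist = recall(query, embeddings, k=3)
ranked = rank(shortlist, predicted, {"power_factor": 0.5, "thermal_resistance": 0.5})
print(shortlist)
print(ranked)
```

Note that material_C has the best property scores in this toy data but never reaches the ranking step, because it is dissimilar to the query. That is the intended effect of a two-stage design: recall restricts the search space, and ranking only orders what was recalled.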
Related papers
- MatExpert: Decomposing Materials Discovery by Mimicking Human Experts [26.364419690908992]
MatExpert is a novel framework that leverages Large Language Models and contrastive learning to accelerate the discovery and design of new solid-state materials.
Inspired by the workflow of human materials design experts, our approach integrates three key stages: retrieval, transition, and generation.
MatExpert represents a meaningful advancement in computational material discovery using language-based generative models.
arXiv Detail & Related papers (2024-10-26T00:44:54Z)
- From Tokens to Materials: Leveraging Language Models for Scientific Discovery [12.211984932142537]
This study investigates the application of language model embeddings to enhance material property prediction in materials science.
We demonstrate that domain-specific models, particularly MatBERT, significantly outperform general-purpose models in extracting implicit knowledge from compound names and material properties.
arXiv Detail & Related papers (2024-10-21T16:31:23Z)
- MaterioMiner -- An ontology-based text mining dataset for extraction of process-structure-property entities [0.0]
We present the MaterioMiner dataset and the materials ontology where ontological concepts are associated with textual entities.
We explore the consistency between the three raters and fine-tune pre-trained models to showcase the feasibility of named-entity recognition model training.
arXiv Detail & Related papers (2024-08-05T21:42:59Z)
- From Text to Insight: Large Language Models for Materials Science Data Extraction [4.08853418443192]
The vast majority of materials science knowledge exists in unstructured natural language.
Structured data is crucial for innovative and systematic materials design.
The advent of large language models (LLMs) represents a significant shift.
arXiv Detail & Related papers (2024-07-23T22:23:47Z)
- Language Representations Can be What Recommenders Need: Findings and Potentials [57.90679739598295]
We show that item representations, when linearly mapped from advanced LM representations, yield superior recommendation performance.
This outcome suggests the possible homomorphism between the advanced language representation space and an effective item representation space for recommendation.
Our findings highlight the connection between language modeling and behavior modeling, which can inspire both natural language processing and recommender system communities.
arXiv Detail & Related papers (2024-07-07T17:05:24Z)
- MatText: Do Language Models Need More than Text & Scale for Materials Modeling? [5.561723952524538]
MatText is a suite of benchmarking tools and datasets designed to systematically evaluate the performance of language models in modeling materials.
MatText provides essential tools for training and benchmarking the performance of language models in the context of materials science.
arXiv Detail & Related papers (2024-06-25T05:45:07Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems [58.561904356651276]
We introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework to improve the semantic understanding of entities for Conversational recommender systems.
KERL uses a knowledge graph and a pre-trained language model to improve the semantic understanding of entities.
KERL achieves state-of-the-art results in both recommendation and response generation tasks.
arXiv Detail & Related papers (2023-12-18T06:41:23Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained end-to-end, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)