MatKB: Semantic Search for Polycrystalline Materials Synthesis Procedures
- URL: http://arxiv.org/abs/2302.05597v1
- Date: Sat, 11 Feb 2023 04:18:07 GMT
- Title: MatKB: Semantic Search for Polycrystalline Materials Synthesis Procedures
- Authors: Xianjun Yang, Stephen Wilson, Linda Petzold
- Abstract summary: Our goal is to automatically mine structured knowledge from millions of research articles in the field of polycrystalline materials.
The proposed method leverages NLP techniques such as entity recognition and document classification to extract relevant information.
The resulting knowledge base is integrated into a search engine, which enables users to search for information about specific materials, properties, and experiments with greater precision than traditional search engines like Google.
- Score: 2.578242050187029
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a novel approach to knowledge extraction and
retrieval using Natural Language Processing (NLP) techniques for materials
science. Our goal is to automatically mine structured knowledge from millions
of research articles in the field of polycrystalline materials and make it
easily accessible to the broader community. The proposed method leverages NLP
techniques such as entity recognition and document classification to extract
relevant information and build an extensive knowledge base from a collection
of 9.5 million publications. The resulting knowledge base is integrated into a
search engine, which enables users to search for information about specific
materials, properties, and experiments with greater precision than traditional
search engines like Google. We hope our results can enable materials scientists
to quickly locate desired experimental procedures, compare their differences,
and even be inspired to design new experiments. Our website will be available
on GitHub (https://github.com/Xianjun-Yang/PcMSP.git) soon.
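The pipeline outlined in the abstract (classify documents, extract entities, then index them for search) can be illustrated with a minimal sketch. This is not the authors' implementation: keyword rules and regular expressions stand in for the learned document classifier and entity recognizer, and the three-sentence corpus, the cue words, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; these sentences are not taken from the MatKB paper.
DOCS = {
    "doc1": "BaTiO3 powder was sintered at 1200 C for 4 h in air.",
    "doc2": "We review the history of perovskite research.",
    "doc3": "SrTiO3 pellets were calcined at 900 C and then sintered at 1350 C.",
}

# Assumed stand-ins for trained models: cue words for classification,
# regexes for chemical formulas and temperatures.
SYNTHESIS_CUES = ("sintered", "calcined", "annealed")
FORMULA = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")
TEMPERATURE = re.compile(r"\b(\d+)\s*C\b")


def is_synthesis(text):
    """Document classification step: keep texts that mention a synthesis cue."""
    lowered = text.lower()
    return any(cue in lowered for cue in SYNTHESIS_CUES)


def extract_entities(text):
    """Entity recognition step: pull out formulas and temperatures."""
    return {
        "materials": FORMULA.findall(text),
        "temperatures_c": [int(t) for t in TEMPERATURE.findall(text)],
    }


def build_index(docs):
    """Build a tiny knowledge base plus an inverted index keyed by material."""
    kb = {}
    index = defaultdict(set)
    for doc_id, text in docs.items():
        if not is_synthesis(text):
            continue  # skip documents that do not describe a synthesis
        entities = extract_entities(text)
        kb[doc_id] = entities
        for material in entities["materials"]:
            index[material].add(doc_id)
    return kb, index


def search(index, material):
    """Return the sorted ids of documents that mention the given material."""
    return sorted(index.get(material, set()))


KB, INDEX = build_index(DOCS)
```

Calling `search(INDEX, "BaTiO3")` looks the material up in the inverted index instead of scanning raw text, which is the kind of entity-level lookup that a keyword search engine does not offer directly.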
Related papers
- Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature [48.572336666741194]
We present Knowledge Navigator, a system designed to enhance exploratory search abilities.
It organizes retrieved documents into a navigable, two-level hierarchy of named and descriptive scientific topics and subtopics.
arXiv Detail & Related papers (2024-08-28T14:48:37Z)
- From Text to Insight: Large Language Models for Materials Science Data Extraction [4.08853418443192]
The vast majority of materials science knowledge exists in unstructured natural language.
Structured data is crucial for innovative and systematic materials design.
The advent of large language models (LLMs) represents a significant shift.
arXiv Detail & Related papers (2024-07-23T22:23:47Z)
- Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model [16.030268397865264]
This article introduces the Materials Knowledge Graph (MKG), which utilizes advanced natural language processing techniques.
MKG categorizes information into comprehensive labels such as Name, Formula, and Application, structured around a meticulously designed ontology.
By implementing network-based algorithms, MKG not only facilitates efficient link prediction but also significantly reduces reliance on traditional experimental methods.
arXiv Detail & Related papers (2024-04-03T21:46:14Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Agent-based Learning of Materials Datasets from Scientific Literature [0.0]
We develop a chemist AI agent, powered by large language models (LLMs), to create structured datasets from natural language text.
Our chemist AI agent, Eunomia, can plan and execute actions by leveraging the existing knowledge from decades of scientific research articles.
arXiv Detail & Related papers (2023-12-18T20:29:58Z)
- Reconstructing Materials Tetrahedron: Challenges in Materials Information Extraction [23.489721319567025]
We discuss, quantify, and document challenges in automated information extraction from materials science literature.
This information is spread in multiple formats, such as tables, text, and images, and with little or no uniformity in reporting style.
We hope the present work inspires researchers to address the challenges in a coherent fashion, providing a fillip to IE towards developing a materials knowledge base.
arXiv Detail & Related papers (2023-10-12T14:57:24Z)
- GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration [97.68234051078997]
We discuss how Pyserini can be integrated with the Hugging Face ecosystem of open-source AI libraries and artifacts.
We include a Jupyter Notebook-based walk through the core interoperability features, available on GitHub.
We present GAIA Search - a search engine built following previously laid out principles, giving access to four popular large-scale text collections.
arXiv Detail & Related papers (2023-06-02T12:09:59Z)
- Artificial Intelligence in Concrete Materials: A Scientometric View [77.34726150561087]
This chapter aims to uncover the main research interests and knowledge structure of the existing literature on AI for concrete materials.
To begin with, a total of 389 journal articles published from 1990 to 2020 were retrieved from the Web of Science.
Scientometric tools such as keyword co-occurrence analysis and documentation co-citation analysis were adopted to quantify features and characteristics of the research field.
arXiv Detail & Related papers (2022-09-17T18:24:56Z)
- Text to Insight: Accelerating Organic Materials Knowledge Extraction via Deep Learning [1.2774526936067927]
This study aims to explore knowledge extraction for organic materials.
We built a research dataset composed of 855 annotated and 708,376 unannotated sentences drawn from 92,667 abstracts.
We used named-entity recognition (NER) with a BiLSTM-CNN-CRF deep learning model to automatically extract key knowledge from the literature.
arXiv Detail & Related papers (2021-09-27T01:58:35Z)
- A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z)
- Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned [88.42878484408469]
We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures.
This paper describes our initial efforts and offers a few thoughts about lessons we have learned along the way.
arXiv Detail & Related papers (2020-04-10T17:12:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.