Related papers: Searching for chromate replacements using natural language processing and machine learning algorithms

Searching for chromate replacements using natural language processing and machine learning algorithms

URL: http://arxiv.org/abs/2208.05672v1
Date: Thu, 11 Aug 2022 07:21:18 GMT
Title: Searching for chromate replacements using natural language processing and machine learning algorithms
Authors: Shujing Zhao and Nick Birbilis
Abstract summary: This study demonstrates it is possible to extract knowledge from the automated interpretation of the scientific literature and achieve expert human level insights. We have employed the Word2Vec model, previously explored by others, and the BERT model - applying them towards a unique challenge in materials engineering.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The past few years has seen the application of machine learning utilised in the exploration of new materials. As in many fields of research - the vast majority of knowledge is published as text, which poses challenges in either a consolidated or statistical analysis across studies and reports. Such challenges include the inability to extract quantitative information, and in accessing the breadth of non-numerical information. To address this issue, the application of natural language processing (NLP) has been explored in several studies to date. In NLP, assignment of high-dimensional vectors, known as embeddings, to passages of text preserves the syntactic and semantic relationship between words. Embeddings rely on machine learning algorithms and in the present work, we have employed the Word2Vec model, previously explored by others, and the BERT model - applying them towards a unique challenge in materials engineering. That challenge is the search for chromate replacements in the field of corrosion protection. From a database of over 80 million records, a down-selection of 5990 papers focused on the topic of corrosion protection were examined using NLP. This study demonstrates it is possible to extract knowledge from the automated interpretation of the scientific literature and achieve expert human level insights.

Related papers

Feature engineering vs. deep learning for paper section identification: Toward applications in Chinese medical literature [5.773921786449337]
Section identification is an important task for library science, especially knowledge management. We study the paper section identification problem in the context of Chinese medical literature analysis.
arXiv Detail & Related papers (2024-12-15T09:11:14Z)
NLP-Powered Repository and Search Engine for Academic Papers: A Case Study on Cyber Risk Literature with CyLit [9.621564860645513]
We propose a novel framework that leverages Natural Language Processing (NLP) techniques. This framework automates the retrieval, summarization, and clustering of academic literature within a specific research domain. We introduce CyLit, an NLP-powered repository specifically designed for the cyber risk literature.
arXiv Detail & Related papers (2024-09-10T05:41:40Z)
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML) This work introduces a formal framework of this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature in various domains in ML with consistent notations which is missing from the current literature. The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors [57.7003399760813]
We explore advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways. We uncover a significant correlation between topics and detection performance. These investigations shed light on the adaptability and robustness of these detection methods across diverse topics.
arXiv Detail & Related papers (2023-12-20T10:53:53Z)
Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review [2.4185510826808487]
Deep learning has revolutionized natural language processing (NLP) by enabling the development of models that can learn complex representations of language data. Deep learning models for NLP typically use large amounts of data to train deep neural networks, allowing them to learn the patterns and relationships in language data. Applying deep learning to text summarization refers to the use of deep neural networks to perform text summarization tasks.
arXiv Detail & Related papers (2023-10-13T21:24:37Z)
MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection. We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs. Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
Application of Transformers based methods in Electronic Medical Records: A Systematic Literature Review [77.34726150561087]
This work presents a systematic literature review of state-of-the-art advances using transformer-based methods on electronic medical records (EMRs) in different NLP tasks.
arXiv Detail & Related papers (2023-04-05T22:19:42Z)
Text to Insight: Accelerating Organic Materials Knowledge Extraction via Deep Learning [1.2774526936067927]
This study aims to explore knowledge extraction for organic materials. We built a research dataset composed of 855 annotated and 708,376 unannotated sentences drawn from 92,667 abstracts. We used named-entity-recognition (NER) with BiLSTM-CNN-CRF deep learning model to automatically extract key knowledge from literature.
arXiv Detail & Related papers (2021-09-27T01:58:35Z)
Text Mining for Processing Interview Data in Computational Social Science [0.6820436130599382]
We use commercially available text analysis technology to process interview text data from a computational social science study. We find that topical clustering and terminological enrichment provide for convenient exploration and quantification of the responses. We encourage studies in social science to use text analysis, especially for exploratory open-ended studies.
arXiv Detail & Related papers (2020-11-28T00:44:35Z)
Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications. Within this research work, we i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools. We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
arXiv Detail & Related papers (2020-10-28T08:31:40Z)
A Survey of Embedding Space Alignment Methods for Language and Knowledge Graphs [77.34726150561087]
We survey the current research landscape on word, sentence and knowledge graph embedding algorithms. We provide a classification of the relevant alignment techniques and discuss benchmark datasets used in this field of research.
arXiv Detail & Related papers (2020-10-26T16:08:13Z)
Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, translation etc. We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.