NLP for Knowledge Discovery and Information Extraction from Energetics Corpora
- URL: http://arxiv.org/abs/2402.06964v1
- Date: Sat, 10 Feb 2024 14:43:08 GMT
- Title: NLP for Knowledge Discovery and Information Extraction from Energetics
Corpora
- Authors: Francis G. VanGessel, Efrem Perry, Salil Mohan, Oliver M. Barham, Mark
Cavolowsky
- Abstract summary: We present a demonstration of the utility of NLP for aiding research into energetic materials and associated systems.
The NLP method enables machine understanding of textual data, offering an automated route to knowledge discovery and information extraction.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a demonstration of the utility of NLP for aiding research into
energetic materials and associated systems. The NLP method enables machine
understanding of textual data, offering an automated route to knowledge
discovery and information extraction from energetics text. We apply three
established unsupervised NLP models: Latent Dirichlet Allocation, Word2Vec, and
the Transformer to a large curated dataset of energetics-related scientific
articles. We demonstrate that each NLP algorithm is capable of identifying
energetic topics and concepts, generating a language model which aligns with
Subject Matter Expert knowledge. Furthermore, we present a document
classification pipeline for energetics text. Our classification pipeline
achieves 59-76% accuracy depending on the NLP model used, with the highest
performing Transformer model rivaling inter-annotator agreement metrics. The
NLP approaches studied in this work can identify concepts germane to energetics
and therefore hold promise as a tool for accelerating energetics research
efforts and energetics material development.
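
The embed-then-classify shape of a document classification pipeline like the one described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: a normalized bag-of-words vector stands in for the LDA, Word2Vec, or Transformer representation, a nearest-centroid rule stands in for the classifier, and the documents, labels, and function names (embed, fit_centroids, predict, accuracy) are all invented for the example.

```python
from collections import Counter
import math


def embed(text, vocab):
    """Map a document to a unit-length bag-of-words vector over vocab.

    Assumption: this stands in for a learned document embedding.
    """
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def fit_centroids(docs, labels, vocab):
    """Average the embeddings of each class to get one centroid per label."""
    sums = {}
    sizes = Counter(labels)
    for doc, label in zip(docs, labels):
        acc = sums.setdefault(label, [0.0] * len(vocab))
        for i, v in enumerate(embed(doc, vocab)):
            acc[i] += v
    return {label: [v / sizes[label] for v in acc] for label, acc in sums.items()}


def predict(text, centroids, vocab):
    """Return the label whose centroid has the largest dot product with the document."""
    vec = embed(text, vocab)
    return max(
        centroids,
        key=lambda label: sum(a * b for a, b in zip(vec, centroids[label])),
    )


def accuracy(predicted, gold):
    """Fraction of documents whose predicted label matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Toy data, invented for illustration; not taken from the energetics corpus.
train_docs = [
    "detonation velocity of the explosive charge",
    "shock initiation and detonation pressure",
    "synthesis of a nitramine compound",
    "crystal synthesis and recrystallization of the compound",
]
train_labels = ["detonation", "detonation", "synthesis", "synthesis"]

vocab = sorted({word for doc in train_docs for word in doc.lower().split()})
centroids = fit_centroids(train_docs, train_labels, vocab)

test_docs = ["detonation pressure measurements", "synthesis route for a new compound"]
test_labels = ["detonation", "synthesis"]
preds = [predict(doc, centroids, vocab) for doc in test_docs]
print(preds, accuracy(preds, test_labels))
```

Swapping in a different representation only requires replacing embed; the centroid fitting, prediction, and accuracy steps stay the same, which is what makes it possible to compare several NLP models within one pipeline. The toy accuracy here says nothing about the 59-76% range reported in the abstract.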
Related papers
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
Since 2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z)
- Systematic Task Exploration with LLMs: A Study in Citation Text Generation [63.50597360948099]
Large language models (LLMs) bring unprecedented flexibility in defining and executing complex, creative natural language generation (NLG) tasks.
We propose a three-component research framework that consists of systematic input manipulation, reference data, and output measurement.
We use this framework to explore citation text generation -- a popular scholarly NLP task that lacks consensus on the task definition and evaluation metric.
arXiv Detail & Related papers (2024-07-04T16:41:08Z)
- Present and Future of AI in Renewable Energy Domain : A Comprehensive Survey [0.0]
Artificial intelligence (AI) has become a crucial instrument for streamlining processes in various industries.
Nine AI-based strategies are identified here to assist Renewable Energy (RE) in contemporary power systems.
This study also addressed three main topics: using AI technology for renewable power generation, utilizing AI for renewable energy forecasting, and optimizing energy systems.
arXiv Detail & Related papers (2024-06-22T04:36:09Z)
- Accelerated materials language processing enabled by GPT [5.518792725397679]
We develop generative transformer (GPT)-enabled pipelines for materials language processing.
First, we develop a GPT-enabled document classification method for screening relevant documents.
Second, for the NER task, we design entity-centric prompts, and few-shot learning with them improves performance.
Finally, we develop a GPT-enabled extractive QA model, which provides improved performance and shows the possibility of automatically correcting annotations.
arXiv Detail & Related papers (2023-08-18T07:31:13Z)
- An Ensemble Approach to Question Classification: Integrating Electra Transformer, GloVe, and LSTM [0.0]
This study presents an innovative ensemble approach for question classification, combining the strengths of Electra, GloVe, and LSTM models.
Rigorously tested on the well-regarded TREC dataset, the model demonstrates how the integration of these disparate technologies can lead to superior results.
arXiv Detail & Related papers (2023-08-13T18:14:10Z)
- Application of Transformers based methods in Electronic Medical Records: A Systematic Literature Review [77.34726150561087]
This work presents a systematic literature review of state-of-the-art advances using transformer-based methods on electronic medical records (EMRs) in different NLP tasks.
arXiv Detail & Related papers (2023-04-05T22:19:42Z)
- A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG) and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
- Efficient Methods for Natural Language Processing: A Survey [76.34572727185896]
This survey synthesizes and relates current methods and findings in efficient NLP.
We aim to provide both guidance for conducting NLP under limited resources, and point towards promising research directions for developing more efficient methods.
arXiv Detail & Related papers (2022-08-31T20:32:35Z)
- Assessing the trade-off between prediction accuracy and interpretability for topic modeling on energetic materials corpora [2.1694433437280765]
We study the trade-off between prediction accuracy and interpretability by implementing three document embedding methods.
This study was carried out on a novel labeled energetics dataset created and validated by our team of energetics experts.
arXiv Detail & Related papers (2022-06-01T21:28:21Z)
- A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models [185.08295787309544]
We aim to summarize the current progress of pre-trained language model-based knowledge-enhanced models (PLMKEs).
We present the challenges of PLMKEs based on the discussion regarding the three elements and attempt to provide NLP practitioners with potential directions for further research.
arXiv Detail & Related papers (2022-02-17T17:17:43Z)
- The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures [0.0]
Natural Language Processing models have achieved phenomenal success in linguistic and semantic tasks.
Recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes.
Knowledge Retrievers have been built to extricate explicit data documents from a large corpus of databases with greater efficiency and accuracy.
arXiv Detail & Related papers (2021-03-23T22:38:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.