Related papers: Extracting Semantics from Maintenance Records

Extracting Semantics from Maintenance Records

URL: http://arxiv.org/abs/2108.05454v1
Date: Wed, 11 Aug 2021 21:23:10 GMT
Title: Extracting Semantics from Maintenance Records
Authors: Sharad Dixit, Varish Mulwad, Abhinav Saxena
Abstract summary: We develop three approaches to extracting named entity recognition from maintenance records. We develop a syntactic rules and semantic-based approach and an approach leveraging a pre-trained language model. Our evaluations on a real-world aviation maintenance records dataset show promising results.
Score: 0.2578242050187029
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Rapid progress in natural language processing has led to its utilization in a variety of industrial and enterprise settings, including in its use for information extraction, specifically named entity recognition and relation extraction, from documents such as engineering manuals and field maintenance reports. While named entity recognition is a well-studied problem, existing state-of-the-art approaches require large labelled datasets which are hard to acquire for sensitive data such as maintenance records. Further, industrial domain experts tend to distrust results from black box machine learning models, especially when the extracted information is used in downstream predictive maintenance analytics. We overcome these challenges by developing three approaches built on the foundation of domain expert knowledge captured in dictionaries and ontologies. We develop a syntactic and semantic rules-based approach and an approach leveraging a pre-trained language model, fine-tuned for a question-answering task on top of our base dictionary lookup to extract entities of interest from maintenance records. We also develop a preliminary ontology to represent and capture the semantics of maintenance records. Our evaluations on a real-world aviation maintenance records dataset show promising results and help identify challenges specific to named entity recognition in the context of noisy industrial data.

Related papers

Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models [17.006423792670414]
We present the Data Therapist, a web-based tool that helps domain experts externalize implicit knowledge through a mixed-initiative process. The resulting structured knowledge base can inform both human and automated visualization design.
arXiv Detail & Related papers (2025-05-01T11:10:17Z)
Benchmarking pre-trained text embedding models in aligning built asset information [0.0]
This study presents a comparative benchmark of state-of-the-art text embedding models to evaluate their effectiveness in aligning built asset information with domain-specific technical concepts. The results of our benchmarking across six proposed datasets, covering three tasks of clustering, retrieval, and reranking, highlight the need for future research on domain adaptation techniques.
arXiv Detail & Related papers (2024-11-18T20:54:17Z)
Empowering Domain-Specific Language Models with Graph-Oriented Databases: A Paradigm Shift in Performance and Model Maintenance [0.0]
Our work is driven by the need to manage and process large volumes of short text documents inherent in specific application domains. By leveraging domain-specific knowledge and expertise, our approach aims to shape factual data within these domains. Our work underscores the transformative potential of the partnership of domain-specific language models and graph-oriented databases.
arXiv Detail & Related papers (2024-10-04T19:02:09Z)
Computational Job Market Analysis with Natural Language Processing [5.117211717291377]
This thesis investigates Natural Language Processing (NLP) technology for extracting relevant information from job descriptions. We frame the problem, obtaining annotated data, and introducing extraction methodologies. Our contributions include job description datasets, a de-identification dataset, and a novel active learning algorithm for efficient model training.
arXiv Detail & Related papers (2024-04-29T14:52:38Z)
Unearthing Large Scale Domain-Specific Knowledge from Public Corpora [103.0865116794534]
We introduce large models into the data collection pipeline to guide the generation of domain-specific information.<n>We refer to this approach as Retrieve-from-CC.<n>It not only collects data related to domain-specific knowledge but also mines the data containing potential reasoning procedures from the public corpus.
arXiv Detail & Related papers (2024-01-26T03:38:23Z)
Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data. We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
Informed Named Entity Recognition Decoding for Generative Language Models [3.5323691899538128]
We propose Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process. We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results.
arXiv Detail & Related papers (2023-08-15T14:16:29Z)
Gradient Imitation Reinforcement Learning for General Low-Resource Information Extraction [80.64518530825801]
We develop a Gradient Reinforcement Learning (GIRL) method to encourage pseudo-labeled data to imitate the gradient descent direction on labeled data. We also leverage GIRL to solve all IE sub-tasks (named entity recognition, relation extraction, and event extraction) in low-resource settings.
arXiv Detail & Related papers (2022-11-11T05:37:19Z)
Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction. RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space. This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z)
Knowledge Graph Anchored Information-Extraction for Domain-Specific Insights [1.6308268213252761]
We use a task-based approach for fulfilling specific information needs within a new domain. A pipeline constructed of state of the art NLP technologies is used to automatically extract an instance level semantic structure.
arXiv Detail & Related papers (2021-04-18T19:28:10Z)
Streaming Self-Training via Domain-Agnostic Unlabeled Images [62.57647373581592]
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models. Key to SST are two crucial observations: (1) domain-agnostic unlabeled images enable us to learn better models with a few labeled examples without any additional knowledge or supervision; and (2) learning is a continuous process and can be done by constructing a schedule of learning updates.
arXiv Detail & Related papers (2021-04-07T17:58:39Z)
Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports [66.39150945184683]
We focus on the problem of automatically identifying the main themes in a safeguarding report using supervised classification approaches. Our results show the potential of deep learning models to simulate subject-expert behaviour even for complex tasks with limited labelled data.
arXiv Detail & Related papers (2020-10-27T19:48:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.