Knowledge Graph Anchored Information-Extraction for Domain-Specific
Insights
- URL: http://arxiv.org/abs/2104.08936v2
- Date: Tue, 20 Apr 2021 02:44:06 GMT
- Title: Knowledge Graph Anchored Information-Extraction for Domain-Specific
Insights
- Authors: Vivek Khetan, Annervaz K M, Erin Wetherley, Elena Eneva, Shubhashis
Sengupta, and Andrew E. Fano
- Abstract summary: We use a task-based approach for fulfilling specific information needs within a new domain.
A pipeline constructed of state-of-the-art NLP technologies is used to automatically extract an instance-level semantic structure.
- Score: 1.6308268213252761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing quantity and complexity of data pose challenges for humans to
consume information and respond in a timely manner. For businesses in domains
with rapidly changing rules and regulations, failure to identify changes can be
costly. In contrast to expert analysis or the development of domain-specific
ontologies and taxonomies, we use a task-based approach for fulfilling specific
information needs within a new domain. Specifically, we propose to extract
task-based information from incoming instance data. A pipeline constructed of
state-of-the-art NLP technologies, including a bi-LSTM-CRF model for entity
extraction, attention-based deep Semantic Role Labeling, and an automated
verb-based relationship extractor, is used to automatically extract an
instance-level semantic structure. Each instance is then combined with a larger,
domain-specific knowledge graph to produce new and timely insights. Preliminary
results, validated manually, show the methodology to be effective for
extracting specific information to complete end use-cases.
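The final step described in the abstract, combining an instance-level structure with a larger domain knowledge graph, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify a graph representation or merge rule, so the adjacency-set graph, the `build_graph` and `merge_instance` helpers, and the example triples are all hypothetical. The upstream models (bi-LSTM-CRF, Semantic Role Labeling, verb-based relation extraction) are not implemented here; their output is assumed to be a list of (subject, relation, object) triples.

```python
from collections import defaultdict


def build_graph(triples):
    """Build a toy knowledge graph: subject -> set of (relation, object)."""
    graph = defaultdict(set)
    for subj, rel, obj in triples:
        graph[subj].add((rel, obj))
    return graph


def merge_instance(domain_graph, instance_triples):
    """Add instance triples to the graph and return the ones that were new."""
    new_facts = []
    for subj, rel, obj in instance_triples:
        if (rel, obj) not in domain_graph[subj]:
            domain_graph[subj].add((rel, obj))
            new_facts.append((subj, rel, obj))
    return new_facts


# Hypothetical domain graph (assumption: not taken from the paper).
domain_graph = build_graph([
    ("Agency A", "regulates", "Product X"),
    ("Company B", "manufactures", "Product X"),
])

# Hypothetical triples, standing in for the output of the extraction pipeline.
instance_triples = [
    ("Agency A", "regulates", "Product X"),   # already known
    ("Agency A", "restricts", "Product X"),   # new to the graph
]

new_facts = merge_instance(domain_graph, instance_triples)
print(new_facts)
```

In this sketch, only triples absent from the domain graph are reported, which is one simple way to surface what an incoming document adds to existing knowledge.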
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation [56.78444462585225]
TESSA is a multi-agent system designed to automatically generate both general and domain-specific annotations for time series data.
The general agent captures common patterns and knowledge across multiple source domains, leveraging both time-series-wise and text-wise features.
The domain-specific agent utilizes limited annotations from the target domain to learn domain-specific terminology and generate targeted annotations.
arXiv Detail & Related papers (2024-10-22T22:43:14Z)
- Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization [7.522493227357079]
Large Language Models (LLMs) are pre-trained on large-scale corpora.
LLMs suffer from hallucinations, knowledge cut-offs, and lack of knowledge attributions.
We introduce SMART-SLIC, a highly domain-specific LLM framework.
arXiv Detail & Related papers (2024-10-03T17:40:55Z)
- Learning to Discover Knowledge: A Weakly-Supervised Partial Domain Adaptation Approach [20.899013563493202]
Domain adaptation has shown appealing performance by leveraging knowledge from a source domain with rich annotations.
For a specific target task, it is cumbersome to collect related and high-quality source domains.
In this paper, we propose a simple yet effective domain adaptation approach, termed self-paced transfer classifier learning (SP-TCL).
arXiv Detail & Related papers (2024-06-20T12:54:07Z)
- A Continual Relation Extraction Approach for Knowledge Graph Completeness [0.0]
This thesis aims to develop a novel continual relation extraction method to identify relations between entities in a data stream coming from the real world.
The domain-specific data for this thesis consists of coronavirus news from German and Austrian newspapers.
arXiv Detail & Related papers (2024-04-20T18:15:52Z)
- Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users.
We present a benchmark named InstructIE, which includes both automatically generated training data and a human-annotated test set.
Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
- Coarse-to-fine Knowledge Graph Domain Adaptation based on Distantly-supervised Iterative Training [12.62127290494378]
We propose an integrated framework for adapting and re-learning knowledge graphs.
No manual data annotation is required to train the model.
We introduce a novel iterative training strategy to facilitate the discovery of domain-specific named entities and triples.
arXiv Detail & Related papers (2022-11-05T08:16:38Z)
- A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck [68.61583160269664]
Event argument extraction (EAE) aims to extract arguments with given roles from texts.
We propose a multi-format transfer learning model with variational information bottleneck.
We conduct extensive experiments on three benchmark datasets, and obtain new state-of-the-art performance on EAE.
arXiv Detail & Related papers (2022-08-27T13:52:01Z)
- Streaming Self-Training via Domain-Agnostic Unlabeled Images [62.57647373581592]
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models.
Key to SST are two crucial observations: (1) domain-agnostic unlabeled images enable us to learn better models with a few labeled examples without any additional knowledge or supervision; and (2) learning is a continuous process and can be done by constructing a schedule of learning updates.
arXiv Detail & Related papers (2021-04-07T17:58:39Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Coupling semantic and statistical techniques for dynamically enriching web ontologies [0.0]
We propose an automatic coupled statistical/semantic framework for dynamically enriching large-scale generic ontologies from the World Wide Web.
The benefits of our approach include (i) the dynamic enrichment of large-scale semantic patterns with missing background knowledge, which enables the reuse of such knowledge.
arXiv Detail & Related papers (2020-04-23T11:21:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.