Semi-automatic Data Enhancement for Document-Level Relation Extraction
with Distant Supervision from Large Language Models
- URL: http://arxiv.org/abs/2311.07314v1
- Date: Mon, 13 Nov 2023 13:10:44 GMT
- Title: Semi-automatic Data Enhancement for Document-Level Relation Extraction
with Distant Supervision from Large Language Models
- Authors: Junpeng Li, Zixia Jia, Zilong Zheng
- Abstract summary: Document-level Relation Extraction (DocRE) aims to extract relations from a long context.
We propose a method integrating a large language model (LLM) and a natural language inference (NLI) module to generate relation triples.
We demonstrate the effectiveness of our approach by introducing an enhanced dataset known as DocGNRE.
- Score: 26.523153535336725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document-level Relation Extraction (DocRE), which aims to extract relations
from a long context, is a critical challenge in achieving fine-grained
structural comprehension and generating interpretable document representations.
Inspired by recent advances in in-context learning capabilities emergent from
large language models (LLMs), such as ChatGPT, we aim to design an automated
annotation method for DocRE with minimum human effort. Unfortunately, vanilla
in-context learning is infeasible for document-level relation extraction due to
the large number of predefined fine-grained relation types and the uncontrolled
generations of LLMs. To tackle this issue, we propose a method integrating a
large language model (LLM) and a natural language inference (NLI) module to
generate relation triples, thereby augmenting document-level relation datasets.
We demonstrate the effectiveness of our approach by introducing an enhanced
dataset known as DocGNRE, which excels in re-annotating numerous long-tail
relation types. We are confident that our method holds the potential for
broader applications in domain-specific relation type definitions and offers
tangible benefits in advancing generalized language semantic comprehension.
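For intuition, here is a minimal sketch of the kind of pipeline the abstract describes: an LLM proposes candidate relation triples for a document, each triple is verbalized into a natural-language hypothesis, and an off-the-shelf NLI model keeps only the triples that the document entails. This is not the authors' released code; the prompt wording, the relation templates, and the `llm_complete` stand-in for a ChatGPT-style API call are illustrative assumptions.

```python
# Illustrative LLM-proposer + NLI-verifier sketch (not the DocGNRE release).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical templates turning (head, relation, tail) into hypothesis sentences.
TEMPLATES = {
    "founded_by": "{tail} founded {head}.",
    "country_of_origin": "{head} originates from {tail}.",
}

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli").eval()

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the document (premise) entails the verbalized triple."""
    inputs = tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = nli(**inputs).logits.softmax(dim=-1)[0]
    return probs[2].item()  # roberta-large-mnli: index 2 = ENTAILMENT

def llm_complete(prompt: str) -> list[tuple[str, str, str]]:
    """Stand-in for a ChatGPT-style call returning (head, relation, tail) triples."""
    raise NotImplementedError("plug in an LLM client here")

def propose_and_filter(document: str, threshold: float = 0.9):
    prompt = (
        "List the relation triples (head, relation, tail) stated in the document, "
        f"using only these relations: {', '.join(TEMPLATES)}.\n\n{document}"
    )
    kept = []
    for head, relation, tail in llm_complete(prompt):
        template = TEMPLATES.get(relation)
        if template is None:      # drop relations outside the predefined schema
            continue
        hypothesis = template.format(head=head, tail=tail)
        if entailment_prob(document, hypothesis) >= threshold:
            kept.append((head, relation, tail))
    return kept                   # distantly supervised triples for augmenting DocRE data
```

The schema check and the entailment threshold are what rein in the otherwise uncontrolled LLM generations: only triples that use a predefined relation type and are supported by the document survive into the augmented dataset.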
Related papers
- Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval [49.42043077545341]
We propose a knowledge-aware query expansion framework, augmenting LLMs with structured document relations from a knowledge graph (KG).
We leverage document texts as rich KG node representations and use document-based relation filtering for our Knowledge-Aware Retrieval (KAR).
arXiv Detail & Related papers (2024-10-17T17:03:23Z)
- A Semantic Mention Graph Augmented Model for Document-Level Event Argument Extraction [12.286432133599355]
Document-level Event Argument Extraction (DEAE) aims to identify arguments and their specific roles from an unstructured document.
Advanced approaches to DEAE utilize prompt-based methods to guide pre-trained language models (PLMs) in extracting arguments from input documents.
In this paper, we propose a semantic mention Graph Augmented Model (GAM) to address these problems.
arXiv Detail & Related papers (2024-03-12T08:58:07Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models [29.94694305204144]
We present a novel framework for document-level in-context few-shot relation extraction.
We evaluate our framework using DocRED, the largest publicly available dataset for document-level relation extraction.
arXiv Detail & Related papers (2023-10-17T09:10:27Z)
- PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming [30.597623178206874]
We propose PromptRE, a novel weakly-supervised document-level relation extraction method.
PromptRE incorporates the label distribution and entity types as prior knowledge to improve performance.
Experimental results on ReDocRED, a benchmark dataset for document-level relation extraction, demonstrate the superiority of PromptRE over baseline approaches.
arXiv Detail & Related papers (2023-10-13T17:23:17Z)
- How to Unleash the Power of Large Language Models for Few-shot Relation Extraction? [28.413620806193165]
In this paper, we investigate principal methodologies, in-context learning and data generation, for few-shot relation extraction via GPT-3.5.
We observe that in-context learning can achieve performance on par with previous prompt learning approaches, and data generation with the large language model can boost previous solutions to obtain new state-of-the-art few-shot results.
arXiv Detail & Related papers (2023-05-02T15:55:41Z)
- Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weakly supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
- Improving Long Tailed Document-Level Relation Extraction via Easy Relation Augmentation and Contrastive Learning [66.83982926437547]
We argue that mitigating the long-tailed distribution problem is crucial for DocRE in real-world scenarios.
Motivated by this, we propose an Easy Relation Augmentation (ERA) method for improving DocRE.
arXiv Detail & Related papers (2022-05-21T06:15:11Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction [84.64435075778988]
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
arXiv Detail & Related papers (2020-11-27T06:21:12Z)
- Reasoning with Latent Structure Refinement for Document-Level Relation Extraction [20.308845516900426]
We propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph.
Specifically, our model achieves an F1 score of 59.05 on a large-scale document-level dataset (DocRED).
arXiv Detail & Related papers (2020-05-13T13:36:09Z)