Consistency Guided Knowledge Retrieval and Denoising in LLMs for
Zero-shot Document-level Relation Triplet Extraction
- URL: http://arxiv.org/abs/2401.13598v1
- Date: Wed, 24 Jan 2024 17:04:28 GMT
- Title: Consistency Guided Knowledge Retrieval and Denoising in LLMs for
Zero-shot Document-level Relation Triplet Extraction
- Authors: Qi Sun and Kun Huang and Xiaocui Yang and Rong Tong and Kun Zhang and
Soujanya Poria
- Abstract summary: Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with semantic relations from a document.
Existing methods heavily rely on a substantial amount of fully labeled data.
Recent advanced Large Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities.
- Score: 43.50683283748675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in
information systems that aims to simultaneously extract entities with semantic
relations from a document. Existing methods heavily rely on a substantial
amount of fully labeled data. However, collecting and annotating data for newly
emerging relations is time-consuming and labor-intensive. Recent advanced Large
Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text
generation capabilities, inspiring us to explore an alternative approach for
obtaining auto-labeled documents with new relations. In this paper, we propose
a Zero-shot Document-level Relation Triplet Extraction (ZeroDocRTE) framework,
which generates labeled data by retrieving and denoising knowledge from LLMs,
called GenRDK. Specifically, we propose a chain-of-retrieval prompt to guide
ChatGPT to generate labeled long-text data step by step. To improve the quality
of synthetic data, we propose a denoising strategy based on the consistency of
cross-document knowledge. Leveraging our denoised synthetic data, we proceed to
fine-tune the LLaMA2-13B-Chat for extracting document-level relation triplets.
We perform experiments for both zero-shot document-level relation and triplet
extraction on two public datasets. The experimental results illustrate that our
GenRDK framework outperforms strong baselines.
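
The abstract says only that the denoising strategy relies on the consistency of cross-document knowledge; it does not spell out the rule. As a rough illustration of that idea, and not the authors' actual procedure, the sketch below keeps a synthetic triplet only if it is asserted in at least a minimum number of generated documents. The `Triplet` type, the `denoise_by_consistency` function, the `min_support` threshold, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple


class Triplet(NamedTuple):
    """A (head entity, relation label, tail entity) triplet. Hypothetical type."""
    head: str
    relation: str
    tail: str


def denoise_by_consistency(
    docs: List[List[Triplet]], min_support: int = 2
) -> List[List[Triplet]]:
    """Keep only triplets that appear in at least `min_support` documents.

    Assumption: simple cross-document voting stands in for the paper's
    consistency-based denoising, whose details are not given in the abstract.
    """
    support: Counter = Counter()
    for triplets in docs:
        # Count each triplet once per document, even if it is repeated there.
        support.update(set(triplets))
    return [[t for t in triplets if support[t] >= min_support] for triplets in docs]


# Illustrative synthetic labels for three generated documents (made-up data).
spouse = Triplet("Marie Curie", "spouse", "Pierre Curie")
school = Triplet("Marie Curie", "educated at", "University of Paris")
noisy = Triplet("Marie Curie", "country of citizenship", "Germany")

docs = [[school, spouse], [spouse, noisy], [spouse, school]]
cleaned = denoise_by_consistency(docs, min_support=2)
```

Here `noisy` is asserted in a single document, so it is dropped, while `spouse` and `school` each appear in at least two documents and are kept. A real pipeline would also need entity normalization before comparing triplets; that step is omitted here.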
Related papers
- BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression [91.23933111083389]
BRIEF (Bridging Retrieval and Inference through Evidence Fusion) is a lightweight approach that performs query-aware multi-hop reasoning.
Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries.
arXiv Detail & Related papers (2024-10-20T04:24:16Z)
- Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments on two datasets from different domains demonstrate that LLMs fine-tuned with the auxiliary task generate higher-quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z)
- DiVA-DocRE: A Discriminative and Voice-Aware Paradigm for Document-Level Relation Extraction [0.3208888890455612]
We introduce a Discriminative and Voice-Aware Paradigm (DiVA).
Our innovation lies in transforming DocRE into a discriminative task, where the model pays attention to each relation.
Our experiments on the Re-DocRED and DocRED datasets demonstrate state-of-the-art results for the DocRTE task.
arXiv Detail & Related papers (2024-09-07T18:47:38Z)
- Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models [29.94694305204144]
We present a novel framework for document-level in-context few-shot relation extraction.
We evaluate our framework using DocRED, the largest publicly available dataset for document-level relation extraction.
arXiv Detail & Related papers (2023-10-17T09:10:27Z)
- PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming [30.597623178206874]
We propose PromptRE, a novel weakly-supervised document-level relation extraction method.
PromptRE incorporates the label distribution and entity types as prior knowledge to improve the performance.
Experimental results on Re-DocRED, a benchmark dataset for document-level relation extraction, demonstrate the superiority of PromptRE over baseline approaches.
arXiv Detail & Related papers (2023-10-13T17:23:17Z)
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z)
- RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction [65.4337085607711]
We introduce the task setting of Zero-Shot Relation Triplet Extraction (ZeroRTE).
Given an input sentence, each extracted triplet consists of the head entity, relation label, and tail entity where the relation label is not seen at the training stage.
We propose to synthesize relation examples by prompting language models to generate structured texts.
arXiv Detail & Related papers (2022-03-17T05:55:14Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Reasoning with Latent Structure Refinement for Document-Level Relation Extraction [20.308845516900426]
We propose a novel model that enables relational reasoning across sentences by automatically inducing a latent document-level graph.
Specifically, our model achieves an F1 score of 59.05 on a large-scale document-level dataset (DocRED).
arXiv Detail & Related papers (2020-05-13T13:36:09Z)