Not Just Plain Text! Fuel Document-Level Relation Extraction with
Explicit Syntax Refinement and Subsentence Modeling
- URL: http://arxiv.org/abs/2211.05343v1
- Date: Thu, 10 Nov 2022 05:06:37 GMT
- Authors: Zhichao Duan, Xiuxing Li, Zhenyu Li, Zhuo Wang, Jianyong Wang
- Abstract summary: We propose the expLicit syntAx Refinement and Subsentence mOdeliNg based framework (LARSON).
By introducing extra syntactic information, LARSON can model subsentences of arbitrary granularity and efficiently screen instructive ones.
Experimental results on three benchmark datasets (DocRED, CDR, and GDA) demonstrate that LARSON significantly outperforms existing methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Document-level relation extraction (DocRE) aims to identify semantic labels
among entities within a single document. One major challenge of DocRE is digging
out decisive details regarding a specific entity pair from long text. However, in
many cases, only a fraction of the text carries the required information, even in the
manually labeled supporting evidence. To better capture and exploit instructive
information, we propose a novel expLicit syntAx Refinement and Subsentence
mOdeliNg based framework (LARSON). By introducing extra syntactic information,
LARSON can model subsentences of arbitrary granularity and efficiently screen
instructive ones. Moreover, we incorporate refined syntax into text
representations, which further improves the performance of LARSON. Experimental
results on three benchmark datasets (DocRED, CDR, and GDA) demonstrate that
LARSON significantly outperforms existing methods.
Related papers
- Graph-DPEP: Decomposed Plug and Ensemble Play for Few-Shot Document Relation Extraction with Graph-of-Thoughts Reasoning [34.85741925091139]
The Graph-DPEP framework is grounded in the reasoning behind triplet explanation thoughts presented in natural language.
We develop "ensemble-play", reapplying generation on the entire type list by leveraging the reasoning thoughts embedded in a sub-graph.
arXiv Detail & Related papers (2024-11-05T07:12:36Z)
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- DiVA-DocRE: A Discriminative and Voice-Aware Paradigm for Document-Level Relation Extraction [0.3208888890455612]
We introduce DiVA, a Discriminative and Voice-Aware Paradigm.
Our innovation lies in transforming DocRE into a discriminative task, where the model pays attention to each relation.
Our experiments on the Re-DocRED and DocRED datasets demonstrate state-of-the-art results for the DocRTE task.
arXiv Detail & Related papers (2024-09-07T18:47:38Z)
- GEGA: Graph Convolutional Networks and Evidence Retrieval Guided Attention for Enhanced Document-level Relation Extraction [15.246183329778656]
Document-level relation extraction (DocRE) aims to extract relations between entities from unstructured document text.
To overcome these challenges, we propose GEGA, a novel model for DocRE.
We evaluate the GEGA model on three widely used benchmark datasets: DocRED, Re-DocRED, and Revisit-DocRED.
arXiv Detail & Related papers (2024-07-31T07:15:33Z)
- Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z)
- AutoRE: Document-Level Relation Extraction with Large Language Models [27.426703757501507]
We introduce AutoRE, an end-to-end DocRE model that adopts a novel RE extraction paradigm named RHF (Relation-Head-Facts).
Unlike existing approaches, AutoRE does not rely on the assumption of known relation options, making it more reflective of real-world scenarios.
Our experiments on the RE-DocRED dataset showcase AutoRE's best performance, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-03-21T23:48:21Z)
- DocTr: Document Transformer for Structured Information Extraction in Documents [36.1145541816468]
We present a new formulation for structured information extraction from visually rich documents.
It aims to address the limitations of existing IOB tagging or graph-based formulations.
We represent an entity as an anchor word and a bounding box, and represent entity linking as the association between anchor words.
arXiv Detail & Related papers (2023-07-16T02:59:30Z)
- Document-Level Relation Extraction with Sentences Importance Estimation and Focusing [52.069206266557266]
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
We propose a Sentence Importance Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss.
Experimental results on two domains show that our SIEF not only improves overall performance, but also makes DocRE models more robust.
arXiv Detail & Related papers (2022-04-27T03:20:07Z)
- Three Sentences Are All You Need: Local Path Enhanced Document Relation Extraction [54.95848026576076]
We present an embarrassingly simple but effective method to select evidence sentences for document-level RE.
We have released our code at https://github.com/AndrewZhe/Three-Sentences-Are-All-You-Need.
arXiv Detail & Related papers (2021-06-03T12:29:40Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift in the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset in the same writing style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.