Exploiting Global Contextual Information for Document-level Named Entity
Recognition
- URL: http://arxiv.org/abs/2106.00887v1
- Date: Wed, 2 Jun 2021 01:52:07 GMT
- Title: Exploiting Global Contextual Information for Document-level Named Entity
Recognition
- Authors: Zanbo Wang, Wei Wei, Xianling Mao, Shanshan Feng, Pan Zhou, Zhiyong He
and Sheng Jiang
- Abstract summary: We propose a model called Global Context enhanced Document-level NER (GCDoc).
At the word level, a document graph is constructed to model a wider range of dependencies between words.
At the sentence level, to appropriately model context beyond the single sentence, we employ a cross-sentence module.
Our model reaches an F1 score of 92.22 (93.40 with BERT) on the CoNLL 2003 dataset and 88.32 (90.49 with BERT) on the OntoNotes 5.0 dataset.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing named entity recognition (NER) approaches are based on
sequence labeling models, which focus on capturing local context dependencies.
However, taking a single sentence as input prevents the modeling of
non-sequential global context, which is especially useful when local context
information is limited or ambiguous. To this end, we propose a model called
Global Context enhanced Document-level NER (GCDoc) that leverages global
contextual information at two levels, i.e., word and sentence. At the word
level, a document graph is constructed to model a wider range of dependencies
between words, and an enriched contextual representation for each word is then
obtained via graph neural networks (GNNs). To avoid interference from noisy
information, we further propose two strategies. First, we apply epistemic
uncertainty theory to identify tokens whose representations are less reliable,
thereby helping to prune the document graph. Second, a selective auxiliary
classifier is proposed to effectively learn the weights of edges in the
document graph and reduce the importance of noisy neighbour nodes. At the
sentence level, to appropriately model context beyond the single sentence, we
employ a cross-sentence module that encodes adjacent sentences and fuses them
with the current sentence representation via attention and gating mechanisms.
Extensive experiments on two benchmark NER datasets (CoNLL 2003 and the
OntoNotes 5.0 English dataset) demonstrate the effectiveness of the proposed
model. Our model reaches an F1 score of 92.22 (93.40 with BERT) on the CoNLL
2003 dataset and 88.32 (90.49 with BERT) on the OntoNotes 5.0 dataset,
achieving new state-of-the-art performance.
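To make the two mechanisms above concrete, the following is a minimal, hypothetical PyTorch sketch: a document-level word graph processed by one GCN-style layer, plus a gated fusion of attended adjacent-sentence context. The edge rule (linking repeated surface forms), the names `DocWordGraphLayer` and `GatedCrossSentence`, and the dot-product attention are illustrative assumptions rather than the authors' implementation; the paper's uncertainty-based pruning and selective auxiliary classifier are omitted here.

```python
# Hypothetical sketch of GCDoc-style global context, not the authors' code.
import torch
import torch.nn as nn

def build_word_graph(tokens):
    """Connect repeated surface forms across the document (one simple edge rule;
    the paper additionally prunes unreliable nodes via epistemic uncertainty)."""
    n = len(tokens)
    A = torch.zeros(n, n)
    for i in range(n):
        for j in range(i + 1, n):
            if tokens[i].isalpha() and tokens[i].lower() == tokens[j].lower():
                A[i, j] = A[j, i] = 1.0
    return A

class DocWordGraphLayer(nn.Module):
    """One GCN-style layer over the document word graph."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, H, A):
        # Add self-loops, symmetrically normalize, then aggregate neighbours.
        A_hat = A + torch.eye(A.size(0))
        D_inv_sqrt = torch.diag(A_hat.sum(-1).pow(-0.5))
        return torch.relu(self.linear(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H))

class GatedCrossSentence(nn.Module):
    """Fuse the current sentence vector with attended adjacent-sentence context."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, cur, neighbours):            # cur: (dim,), neighbours: (k, dim)
        attn = torch.softmax(neighbours @ cur, 0)  # dot-product attention weights
        ctx = attn @ neighbours                    # attended context vector
        g = torch.sigmoid(self.gate(torch.cat([cur, ctx])))
        return g * cur + (1 - g) * ctx             # gated interpolation

# Toy usage: enrich word states, then fuse a sentence with its two neighbours.
tokens = "EU rejects German call . EU officials met today .".split()
H = torch.randn(len(tokens), 16)                   # stand-in word encodings
H_enriched = DocWordGraphLayer(16)(H, build_word_graph(tokens))
fused = GatedCrossSentence(16)(torch.randn(16), torch.randn(2, 16))
print(H_enriched.shape, fused.shape)               # (10, 16) and (16,)
```

The sigmoid gate interpolates between the current sentence vector and the attended neighbour context, which is one plausible reading of the "attention and gating mechanisms" the abstract describes.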
Related papers
- Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification [20.434941308959786]
Long document classification presents challenges due to documents' extensive content and complex structure.
Existing methods often struggle with token limits and fail to adequately model hierarchical relationships within documents.
Our approach integrates syntax trees for sentence encodings and document graphs for document encodings, which capture fine-grained syntactic relationships and broader document contexts.
arXiv Detail & Related papers (2024-10-03T19:25:01Z)
- Hierarchical Attention Graph for Scientific Document Summarization in Global and Local Level [3.7651378994837104]
Long inputs hinder the simultaneous modeling of global high-order relations between sentences and local intra-sentence relations.
We propose HAESum, a novel approach utilizing graph neural networks to model documents based on their hierarchical discourse structure.
We validate our approach on two benchmark datasets, and the experimental results demonstrate the effectiveness of HAESum.
arXiv Detail & Related papers (2024-05-16T15:46:30Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models in two respects.
Our framework assumes a hierarchical latent structure of a document, where the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing a large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment the global context.
We extend the two-stream attention model with a selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification [2.064612766965483]
We propose a novel GNN-based sparse structure learning model for inductive document classification.
Our model collects a set of trainable edges connecting disjoint words between sentences and employs structure learning to sparsely select edges with dynamic contextual dependencies.
Experiments on several real-world datasets demonstrate that the proposed model outperforms most state-of-the-art results.
arXiv Detail & Related papers (2021-12-13T02:36:04Z)
- Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context [25.3472693740778]
Embedding-based methods are widely used for unsupervised keyphrase extraction (UKE) tasks.
In this paper, we propose a novel method for UKE, where local and global contexts are jointly modeled.
arXiv Detail & Related papers (2021-09-15T13:41:10Z)
- InsertGNN: Can Graph Neural Networks Outperform Humans in TOEFL Sentence Insertion Problem? [66.70154236519186]
Sentence insertion is a delicate but fundamental NLP problem.
Current approaches to sentence ordering, text coherence, and question answering (QA) are neither suitable for nor good at solving it.
We propose InsertGNN, a model that represents the problem as a graph and adopts a graph neural network (GNN) to learn the connections between sentences.
arXiv Detail & Related papers (2021-03-28T06:50:31Z)
- Fine-Grained Named Entity Typing over Distantly Supervised Data Based on Refined Representations [16.30478830298353]
Fine-Grained Named Entity Typing (FG-NET) is a key component in Natural Language Processing (NLP).
We propose an edge-weighted attentive graph convolution network that refines the noisy mention representations by attending over corpus-level contextual clues prior to the end classification.
Experimental evaluation shows that the proposed model outperforms existing research by relative scores of up to 10.2% and 8.3% in macro-F1 and micro-F1, respectively.
arXiv Detail & Related papers (2020-04-07T17:26:36Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)