FLERT: Document-Level Features for Named Entity Recognition
- URL: http://arxiv.org/abs/2011.06993v2
- Date: Fri, 14 May 2021 07:15:07 GMT
- Title: FLERT: Document-Level Features for Named Entity Recognition
- Authors: Stefan Schweter, Alan Akbik
- Abstract summary: Current state-of-the-art approaches for named entity recognition (NER) typically consider text at the sentence-level.
The use of transformer-based models for NER offers natural options for capturing document-level features.
- Score: 5.27294900215066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current state-of-the-art approaches for named entity recognition (NER) typically consider text at the sentence level and thus do not model information that crosses sentence boundaries. However, the use of transformer-based models for NER offers natural options for capturing document-level features. In this paper, we perform a comparative evaluation of document-level features in the two standard NER architectures commonly considered in the literature, namely "fine-tuning" and "feature-based LSTM-CRF". We evaluate different hyperparameters for document-level features, such as context window size and enforcing document-locality. We present experiments from which we derive recommendations for how to model document context, and present new state-of-the-art scores on several CoNLL-03 benchmark datasets. Our approach is integrated into the Flair framework to facilitate reproduction of our experiments.
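To make the "context window" and "document-locality" hyperparameters concrete, the sketch below shows the underlying idea: before embedding, each sentence is padded with up to a fixed number of tokens from neighboring sentences of the same document, and the added context is discarded again afterwards. This is a minimal illustration, not the authors' code; the function name, the toy document, and the window size are assumptions (the paper's default context is 64 tokens).

```python
# Minimal sketch (not the authors' code) of document-level context:
# pad each sentence with up to `window` tokens from neighboring sentences
# of the *same* document, so context never crosses document boundaries.

def add_document_context(doc_sentences, index, window=64):
    """Return (left, sentence, right) token lists for sentence `index`."""
    left, right = [], []
    i = index - 1
    while i >= 0 and len(left) < window:                    # gather left context
        left = doc_sentences[i] + left
        i -= 1
    i = index + 1
    while i < len(doc_sentences) and len(right) < window:   # gather right context
        right = right + doc_sentences[i]
        i += 1
    # Truncate to the window; document-locality holds because we only ever
    # read from `doc_sentences`, i.e. the current document.
    return left[-window:], doc_sentences[index], right[:window]

# Toy example with a three-sentence "document":
doc = [["EU", "rejects", "German", "call", "."],
       ["Peter", "Blackburn", "."],
       ["BRUSSELS", "1996-08-22", "."]]
left, sent, right = add_document_context(doc, 1, window=4)
print(left, sent, right)
# -> ['rejects', 'German', 'call', '.'] ['Peter', 'Blackburn', '.'] ['BRUSSELS', '1996-08-22', '.']
```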
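Since the abstract notes that the approach ships with the Flair framework, a fine-tuning run with document context switched on might look like the following sketch. It assumes a recent Flair release in which TransformerWordEmbeddings exposes the use_context flag added for FLERT; exact signatures and defaults vary across Flair versions.

```python
# Sketch of FLERT-style fine-tuning in Flair (assumes a Flair version
# whose TransformerWordEmbeddings accepts the use_context flag).
from flair.datasets import CONLL_03
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = CONLL_03()  # expects the CoNLL-03 data to be available locally
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    "xlm-roberta-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,  # embed each sentence with surrounding document context
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,               # fine-tuning setup: plain linear tag head
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune("resources/taggers/flert",
                  learning_rate=5.0e-6, mini_batch_size=4)
```

Alternatively, pretrained FLERT-based taggers such as "flair/ner-english-large" can be loaded directly via SequenceTagger.load without any training.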
Related papers
- Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z)
- DREQ: Document Re-Ranking Using Entity-based Query Understanding [6.675805308519988]
DREQ is an entity-oriented dense document re-ranking model.
We emphasize the query-relevant entities within a document's representation while simultaneously attenuating the less relevant ones.
We show that DREQ outperforms state-of-the-art neural and non-neural re-ranking methods.
arXiv Detail & Related papers (2024-01-11T14:27:12Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to use a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Few-Shot Document-Level Relation Extraction [0.0]
We present a few-shot document-level relation extraction benchmark (FSDLRE).
We argue that document-level corpora provide more realism, particularly regarding none-of-the-above (NOTA) distributions.
We adapt the state-of-the-art sentence-level method MNAV to the document-level and develop it further for improved domain adaptation.
arXiv Detail & Related papers (2022-05-04T13:16:19Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models in two respects.
Our framework assumes a hierarchical latent structure of a document in which the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Aspect-based Document Similarity for Research Papers [4.661692753666685]
We extend similarity with aspect information by performing a pairwise document classification task.
We evaluate our aspect-based document similarity for research papers.
Our results show that SciBERT is the best-performing system.
arXiv Detail & Related papers (2020-10-13T13:51:21Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
- SPECTER: Document-level Representation Learning using Citation-informed Transformers [51.048515757909215]
SPECTER generates document-level embeddings of scientific documents by pretraining a Transformer language model.
We introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction to document classification and recommendation.
arXiv Detail & Related papers (2020-04-15T16:05:51Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.