Exploring Cross-sentence Contexts for Named Entity Recognition with BERT
- URL: http://arxiv.org/abs/2006.01563v2
- Date: Thu, 17 Dec 2020 16:32:55 GMT
- Title: Exploring Cross-sentence Contexts for Named Entity Recognition with BERT
- Authors: Jouni Luoma, Sampo Pyysalo
- Abstract summary: We present a study exploring the use of cross-sentence information for NER using BERT models in five languages.
We find that adding context in the form of additional sentences to BERT input increases NER performance on all of the tested languages and models.
We propose a straightforward method, Contextual Majority Voting (CMV), to combine different predictions for the same sentences and demonstrate that it further increases NER performance with BERT.
- Score: 1.4998865865537996
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Named entity recognition (NER) is frequently addressed as a sequence
classification task where each input consists of one sentence of text. It is
nevertheless clear that useful information for the task can often be found
outside of the scope of a single-sentence context. Recently proposed
self-attention models such as BERT can both efficiently capture long-distance
relationships in their input and represent inputs consisting of several
sentences, creating new opportunities for approaches that incorporate
cross-sentence information in natural language processing tasks. In this paper,
we present a systematic study exploring the use of cross-sentence information
for NER using BERT models in five languages. We find that adding context in the
form of additional sentences to BERT input systematically increases NER
performance on all of the tested languages and models. Including multiple
sentences in each input also allows us to study the predictions of the same
sentences in different contexts. We propose a straightforward method,
Contextual Majority Voting (CMV), to combine different predictions for the
same sentences and demonstrate that it further increases NER performance with BERT.
Our approach does not require any changes to the underlying BERT architecture,
rather relying on restructuring examples for training and prediction.
Evaluation on established datasets, including the CoNLL'02 and CoNLL'03 NER
benchmarks, demonstrates that our proposed approach can improve on the
state-of-the-art NER results on English, Dutch, and Finnish, achieves the best
reported BERT-based results on German, and is on par with performance reported
with other BERT-based approaches in Spanish. We release all methods implemented
in this work under open licenses.
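The two ideas in the abstract can be illustrated with a small sketch: sentences are packed into overlapping windows so each sentence is predicted in several cross-sentence contexts, and the resulting label sequences for a sentence are combined by per-token majority vote. This is a hypothetical simplification, not the authors' released code: the paper packs sentences up to BERT's token limit, whereas this sketch uses a fixed sentence count, and the function names `build_windows` and `contextual_majority_vote` are illustrative.

```python
from collections import Counter

def build_windows(sentences, max_sents=3):
    """Group consecutive sentences into overlapping windows so that each
    sentence appears in several different cross-sentence contexts.
    Sketch only: a fixed sentence count stands in for packing up to
    BERT's maximum input length."""
    return [(start, sentences[start:start + max_sents])
            for start in range(len(sentences))]

def contextual_majority_vote(predictions_per_context):
    """Combine label sequences predicted for the same sentence in
    different contexts by taking a per-token majority vote."""
    voted = []
    for token_labels in zip(*predictions_per_context):
        # Counter.most_common(1) returns the most frequent label;
        # ties resolve to the label seen first in the input order.
        voted.append(Counter(token_labels).most_common(1)[0][0])
    return voted

if __name__ == "__main__":
    # Three hypothetical context-dependent predictions for one two-token sentence.
    preds = [["B-PER", "O"], ["B-PER", "B-ORG"], ["O", "O"]]
    print(contextual_majority_vote(preds))  # majority label per token
```

In the paper's actual setup the predictions would come from a fine-tuned BERT tagger applied to each window; the voting step itself requires no change to the underlying architecture, only this restructuring of examples for training and prediction.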
Related papers
- In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever.
In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives [0.0]
BERT has revolutionized the NLP field by enabling transfer learning with large language models.
This article studies how to best use the different embeddings provided by the BERT output layer, and compares language-specific models with multilingual ones.
arXiv Detail & Related papers (2022-01-10T15:05:05Z)
- ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer [19.643512923368743]
We present ConSERT, a Contrastive Framework for Self-Supervised Sentence Representation Transfer.
By making use of unlabeled texts, ConSERT solves the collapse issue of BERT-derived sentence representations.
Experiments on STS datasets demonstrate that ConSERT achieves an 8% relative improvement over the previous state-of-the-art.
arXiv Detail & Related papers (2021-05-25T08:15:01Z)
- Improving BERT with Syntax-aware Local Attention [14.70545694771721]
We propose a syntax-aware local attention, where the attention scopes are based on the distances in the syntactic structure.
We conduct experiments on various single-sentence benchmarks, including sentence classification and sequence labeling tasks.
Our model achieves better performance owing to more focused attention over syntactically relevant words.
arXiv Detail & Related papers (2020-12-30T13:29:58Z)
- Table Search Using a Deep Contextualized Language Model [20.041167804194707]
In this paper, we use the deep contextualized language model BERT for the task of ad hoc table retrieval.
We propose an approach that incorporates features from prior literature on table retrieval and jointly trains them with BERT.
arXiv Detail & Related papers (2020-05-19T04:18:04Z)
- Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce [83.72476966339103]
Cross-lingual information retrieval is a new task in cross-border e-commerce.
We propose a novel cross-lingual matching network (CLMN) with the enhancement of context-dependent cross-lingual mapping.
Experimental results indicate that our proposed CLMN yields impressive results on the challenging task.
arXiv Detail & Related papers (2020-05-17T08:10:51Z)
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
- Cross-lingual Information Retrieval with BERT [8.052497255948046]
We explore the use of the popular bidirectional language model, BERT, to model and learn the relevance between English queries and foreign-language documents.
A deep relevance matching model based on BERT is introduced and trained by finetuning a pretrained multilingual BERT model with weak supervision.
Experimental results of the retrieval of Lithuanian documents against short English queries show that our model is effective and outperforms the competitive baseline approaches.
arXiv Detail & Related papers (2020-04-24T23:32:13Z)
- Incorporating BERT into Neural Machine Translation [251.54280200353674]
We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.