Lex-BERT: Enhancing BERT based NER with lexicons
- URL: http://arxiv.org/abs/2101.00396v1
- Date: Sat, 2 Jan 2021 07:43:21 GMT
- Title: Lex-BERT: Enhancing BERT based NER with lexicons
- Authors: Wei Zhu, Daniel Cheung
- Abstract summary: We present Lex-BERT, which incorporates lexicon information into Chinese BERT for named entity recognition tasks.
Our model does not introduce any new parameters and is more efficient than FLAT.
- Score: 1.6884834576352221
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we present Lex-BERT, which incorporates lexicon
information into Chinese BERT for named entity recognition (NER) tasks in a
natural manner. Instead of using word embeddings and a newly designed
transformer layer as in FLAT, we identify the boundary of words in the
sentences using special tokens, and the modified sentence will be encoded
directly by BERT. Our model does not introduce any new parameters and is more
efficient than FLAT. In addition, we do not require any word embeddings
accompanying the lexicon collection. Experiments on Ontonotes and ZhCrossNER
show that our model outperforms FLAT and other baselines.
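The abstract names the mechanism but not the exact marker scheme. Below is a minimal sketch of the idea, assuming lexicon-matched words are wrapped in boundary markers and the augmented sentence is fed to an unmodified Chinese BERT via the HuggingFace transformers API; the reuse of BERT's reserved [unused*] vocabulary entries as markers (so that no new parameters are introduced), the toy lexicon, and the greedy longest-match are illustrative assumptions, not the paper's exact method.

```python
# Sketch: mark lexicon-matched word boundaries with special tokens and let an
# unchanged Chinese BERT encode the modified sentence.
# Assumptions: [unused1]/[unused2] serve as start/end markers (they already
# exist in the vocabulary, so no new embeddings are added), and the lexicon
# below is a toy example.
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")

START, END = "[unused1]", "[unused2]"
# Registering existing vocabulary entries as special tokens only keeps the
# tokenizer from splitting them; the vocabulary size is unchanged.
tokenizer.add_special_tokens({"additional_special_tokens": [START, END]})

def mark_boundaries(sentence: str, lexicon: set) -> str:
    """Wrap every lexicon match in boundary markers (greedy, longest-first)."""
    words = sorted(lexicon, key=len, reverse=True)
    out, i = [], 0
    while i < len(sentence):
        match = next((w for w in words if sentence.startswith(w, i)), None)
        if match:
            out.append(f"{START}{match}{END}")
            i += len(match)
        else:
            out.append(sentence[i])
            i += 1
    return "".join(out)

lexicon = {"北京", "天安门"}                       # toy lexicon
marked = mark_boundaries("我去北京天安门", lexicon)
inputs = tokenizer(marked, return_tensors="pt")
hidden = model(**inputs).last_hidden_state        # encoded directly by BERT
```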
Related papers
- SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words [59.142185753887645]
In this work, we propose a continued pre-training method for text simplification.
We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words.
We obtain SimpleBERT, which surpasses BERT in both lexical simplification and sentence simplification tasks.
arXiv Detail & Related papers (2022-04-16T11:28:01Z) - FiNER: Financial Numeric Entity Recognition for XBRL Tagging [29.99876910165977]
We introduce XBRL tagging as a new entity extraction task for the financial domain.
We release FiNER-139, a dataset of 1.1M sentences with gold tags.
We show that subword fragmentation of numeric expressions harms BERT's performance (see the tokenizer sketch after this list).
arXiv Detail & Related papers (2022-03-12T16:43:57Z) - MarkBERT: Marking Word Boundaries Improves Chinese BERT [67.53732128091747]
MarkBERT keeps the vocabulary being Chinese characters and inserts boundary markers between contiguous words.
Compared to previous word-based BERT models, MarkBERT achieves better accuracy on text classification, keyword recognition, and semantic similarity tasks.
arXiv Detail & Related papers (2022-03-12T08:43:06Z) - Pretraining without Wordpieces: Learning Over a Vocabulary of Millions of Words [50.11559460111882]
We explore the possibility of developing a BERT-style pretrained model over a vocabulary of words instead of wordpieces.
Results show that, compared to standard wordpiece-based BERT, WordBERT makes significant improvements on cloze tests and machine reading comprehension.
Since the pipeline is language-independent, we train WordBERT for Chinese and obtain significant gains on five natural language understanding datasets.
arXiv Detail & Related papers (2022-02-24T15:15:48Z) - DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling [49.3379730319246]
We propose DyLex, a plug-in lexicon incorporation approach for BERT based sequence labeling tasks.
We adopt word-agnostic tag embeddings to avoid re-training the representation while updating the lexicon.
Finally, we introduce a col-wise attention based knowledge fusion mechanism to guarantee the pluggability of the proposed framework.
arXiv Detail & Related papers (2021-09-18T03:15:49Z) - Charformer: Fast Character Transformers via Gradient-based Subword Tokenization [50.16128796194463]
We propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.
We introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters.
We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level.
arXiv Detail & Related papers (2021-06-23T22:24:14Z) - Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter [15.336753753889035]
Existing methods solely fuse lexicon features via a shallow, randomly initialized sequence layer and do not integrate them into the bottom layers of BERT.
In this paper, we propose Lexicon Enhanced BERT (LEBERT) for Chinese sequence labelling.
Compared with existing methods, our model achieves deep lexicon knowledge fusion at the lower layers of BERT.
arXiv Detail & Related papers (2021-05-15T06:13:39Z) - Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks [4.955649816620742]
This paper explores sentence embedding models for BERT and ALBERT.
We take a modified BERT network with siamese and triplet network structures, called Sentence-BERT (SBERT), and replace BERT with ALBERT to create Sentence-ALBERT (SALBERT).
arXiv Detail & Related papers (2021-01-26T09:14:06Z) - CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters [14.956626084281638]
We propose a new variant of BERT that drops the wordpiece system altogether and uses a Character-CNN module instead to represent entire words by consulting their characters.
We show that this new model improves the performance of BERT on a variety of medical domain tasks while at the same time producing robust, word-level and open-vocabulary representations.
arXiv Detail & Related papers (2020-10-20T15:58:53Z) - Incorporating BERT into Neural Machine Translation [251.54280200353674]
We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)
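The FiNER entry above attributes part of the task's difficulty to wordpiece fragmentation of numeric expressions. The short check below only illustrates that effect with a stock bert-base-uncased tokenizer; the example strings are invented here, and this is not the paper's experimental setup.

```python
# Illustration: a stock BERT wordpiece tokenizer splits numeric amounts into
# several pieces (digits and punctuation), while common words usually remain
# a single piece. This is the fragmentation effect described in the FiNER
# summary above.
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

for text in ["9,323,000", "135.74", "revenue"]:
    pieces = tokenizer.tokenize(text)
    print(f"{text!r}: {pieces} ({len(pieces)} wordpieces)")
# A figure such as "9,323,000" comes back as multiple wordpieces (the digit
# groups and commas are separated), whereas "revenue" maps to a single
# vocabulary entry.
```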