A More Efficient Chinese Named Entity Recognition base on BERT and
Syntactic Analysis
- URL: http://arxiv.org/abs/2101.11423v1
- Date: Mon, 11 Jan 2021 15:33:39 GMT
- Title: A More Efficient Chinese Named Entity Recognition base on BERT and
Syntactic Analysis
- Authors: Xiao Fu and Guijun Zhang
- Abstract summary: This paper first uses the Stanford natural language processing (NLP) toolkit to annotate large-scale untagged data; then a new NLP model, the g-BERT model, is designed to compress the Bidirectional Encoder Representations from Transformers (BERT) model in order to reduce computational cost.
The experimental results show that the computational cost of the g-BERT model is reduced by 60% and its performance improves by 2%, reaching a test F1 of 96.5, compared with the BERT model.
- Score: 9.769870656657522
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new named entity recognition (NER) method that effectively makes
use of the results of part-of-speech (POS) tagging, Chinese word segmentation (CWS)
and parsing while avoiding NER errors caused by POS tagging errors. This paper first
uses the Stanford natural language processing (NLP) toolkit to annotate large-scale
untagged data so as to reduce the dependence on tagged data; then a new NLP model,
the g-BERT model, is designed to compress the Bidirectional Encoder Representations
from Transformers (BERT) model in order to reduce computational cost; finally, the
model is evaluated on a Chinese NER dataset. The experimental results show that the
computational cost of the g-BERT model is reduced by 60% and its performance improves
by 2%, reaching a test F1 of 96.5, compared with the BERT model.
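As a rough illustration of the first step, silver-standard annotation of untagged Chinese text could look like the sketch below; it assumes Stanza's Chinese pipeline stands in for the unspecified "Stanford NLP tool".
```python
# Minimal sketch (assumption: Stanza's Chinese pipeline stands in for the
# paper's unspecified "Stanford NLP tool"): segment and POS-tag untagged
# Chinese text to produce silver annotations for later NER training.
import stanza

stanza.download("zh")  # fetch the Chinese models once
nlp = stanza.Pipeline("zh", processors="tokenize,pos")

def silver_annotate(raw_text):
    """Return (word, universal POS tag) pairs for every sentence in raw_text."""
    doc = nlp(raw_text)
    return [(word.text, word.upos)
            for sentence in doc.sentences
            for word in sentence.words]

print(silver_annotate("清华大学位于北京。"))
```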
Related papers
- Rethinking Masked Language Modeling for Chinese Spelling Correction [70.85829000570203]
We study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model.
We find that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns.
We demonstrate that a very simple strategy, randomly masking 20% of the non-error tokens from the input sequence during fine-tuning, is sufficient for learning a much better language model without sacrificing the error model.
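A minimal sketch of that masking strategy, assuming a standard BERT tokenizer and that the error positions of each training sentence are known:
```python
# Mask 20% of the tokens that are NOT spelling errors, leaving error tokens
# intact. Tokenizer choice is an illustrative assumption.
import random
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")

def mask_non_error_tokens(token_ids, error_positions, rate=0.2):
    """Replace a random fraction of non-error tokens with [MASK]."""
    masked = list(token_ids)
    candidates = [i for i in range(len(masked)) if i not in error_positions]
    for i in random.sample(candidates, k=int(len(candidates) * rate)):
        masked[i] = tokenizer.mask_token_id
    return masked
```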
arXiv Detail & Related papers (2023-05-28T13:19:12Z)
- Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
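A toy illustration of label projection via translation of a marked-up sequence (an illustrative scheme, not necessarily CROP's exact pipeline): entities are wrapped in markers before translation and read back out afterwards.
```python
# Hypothetical marker-based label projection; any MT system could translate
# the marked sentence in between the two steps.
import re

def mark_entities(tokens, spans):
    """spans: list of (start, end, label) over the token list."""
    out = list(tokens)
    for start, end, label in sorted(spans, reverse=True):
        out.insert(end, f"</{label}>")
        out.insert(start, f"<{label}>")
    return " ".join(out)

def extract_entities(translated_sentence):
    """Recover (entity text, label) pairs from a marked translation."""
    return [(text.strip(), label)
            for label, text in re.findall(r"<(\w+)>(.*?)</\1>", translated_sentence)]

marked = mark_entities(["Obama", "visited", "Berlin"], [(0, 1, "PER"), (2, 3, "LOC")])
# ... translate `marked` with a multilingual model, then:
print(extract_entities("<PER> Obama </PER> besuchte <LOC> Berlin </LOC>"))
```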
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- WCL-BBCD: A Contrastive Learning and Knowledge Graph Approach to Named Entity Recognition [15.446770390648874]
We propose a novel named entity recognition model, WCL-BBCD (Word Contrastive Learning with BERT-BiLSTM-CRF-DBpedia).
The model first trains on sentence pairs in the text, calculates the similarity between words in the sentence pairs using cosine similarity, and fine-tunes the BERT model used for the named entity recognition task based on that similarity.
Finally, the recognition results are corrected using prior knowledge such as knowledge graphs, so as to alleviate the low recognition rate caused by word abbreviations.
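A minimal sketch of the word-level cosine-similarity step, assuming mean-pooled BERT hidden states as word representations (the model name and pooling choice are assumptions):
```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")

def word_similarity(word_a, word_b):
    """Cosine similarity between two words' mean-pooled BERT embeddings."""
    embeddings = []
    for word in (word_a, word_b):
        inputs = tokenizer(word, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
        embeddings.append(hidden.mean(dim=1).squeeze(0))
    return torch.cosine_similarity(embeddings[0], embeddings[1], dim=0).item()
```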
arXiv Detail & Related papers (2022-03-14T08:29:58Z)
- Semantic Similarity Computing Model Based on Multi Model Fine-Grained Nonlinear Fusion [30.71123144365683]
This paper proposes a novel model based on multi-model nonlinear fusion to grasp the meaning of a text from a global perspective.
The model uses the part-of-speech-based Jaccard coefficient, Term Frequency-Inverse Document Frequency (TF-IDF) and a word2vec-CNN algorithm to measure the similarity of sentences.
Experimental results show that the matching rate of the sentence similarity calculation method based on multi-model nonlinear fusion is 84%, and the F1 value of the model is 75%.
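A minimal sketch of two of the similarity signals named above, with a placeholder linear combination standing in for the paper's learned nonlinear fusion:
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def jaccard(sent_a, sent_b):
    """Token-set Jaccard overlap between two whitespace-tokenized sentences."""
    a, b = set(sent_a.split()), set(sent_b.split())
    return len(a & b) / len(a | b) if a | b else 0.0

def tfidf_cosine(sent_a, sent_b):
    """Cosine similarity of TF-IDF vectors fitted on the sentence pair."""
    matrix = TfidfVectorizer().fit_transform([sent_a, sent_b])
    return float(cosine_similarity(matrix[0], matrix[1])[0, 0])

def fused_similarity(sent_a, sent_b, w=0.5):
    # Placeholder fusion weight; the paper learns a nonlinear combination.
    return w * jaccard(sent_a, sent_b) + (1 - w) * tfidf_cosine(sent_a, sent_b)
```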
arXiv Detail & Related papers (2022-02-05T03:12:37Z)
- Deploying a BERT-based Query-Title Relevance Classifier in a Production System: a View from the Trenches [3.1219977244201056]
The Bidirectional Encoder Representations from Transformers (BERT) model has radically improved the performance of many Natural Language Processing (NLP) tasks.
It is challenging to scale BERT for low-latency and high-throughput industrial use cases due to its enormous size.
We successfully optimize a Query-Title Relevance (QTR) classifier for deployment via a compact model, which we name BERT Bidirectional Long Short-Term Memory (BertBiLSTM).
BertBiLSTM exceeds the off-the-shelf BERT model's performance in terms of accuracy and efficiency for the aforementioned real-world production task.
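A hedged sketch of a compact BERT-plus-BiLSTM relevance classifier in the spirit of BertBiLSTM; the layer sizes, frozen encoder and binary output are assumptions, not the deployed architecture:
```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BertBiLSTMClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", hidden=128):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, 2)  # relevant / not relevant

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():  # keep the heavy encoder frozen in this sketch
            states = self.encoder(input_ids,
                                  attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(states)
        return self.classifier(lstm_out[:, 0])  # pool at the [CLS] position
```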
arXiv Detail & Related papers (2021-08-23T14:28:23Z)
- BinaryBERT: Pushing the Limit of BERT Quantization [74.65543496761553]
We propose BinaryBERT, which pushes BERT quantization to the limit with weight binarization.
We find that a binary BERT is harder to train directly than a ternary counterpart due to its complex and irregular loss landscape.
Empirical results show that BinaryBERT has negligible performance drop compared to the full-precision BERT-base.
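A toy post-hoc weight binarization, replacing each weight matrix W with alpha * sign(W) where alpha is the mean absolute weight; the paper's actual training recipe for binary BERT is not reproduced here.
```python
import torch

def binarize_weight(weight: torch.Tensor) -> torch.Tensor:
    """Return a binarized copy: per-tensor scale times the sign pattern."""
    alpha = weight.abs().mean()
    return alpha * torch.sign(weight)

def binarize_linear_layers(model: torch.nn.Module) -> None:
    """Binarize every nn.Linear weight in the model in place."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            module.weight.data = binarize_weight(module.weight.data)
```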
arXiv Detail & Related papers (2020-12-31T16:34:54Z)
- Fine-Tuning BERT for Sentiment Analysis of Vietnamese Reviews [0.0]
Experimental results on two datasets show that models using BERT slightly outperform other models using GloVe and FastText.
Our proposed BERT fine-tuning method produces a model with better performance than the original BERT fine-tuning method.
arXiv Detail & Related papers (2020-11-20T14:45:46Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
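A minimal sketch of the retrieval step, assuming fine-tuned-model [CLS] vectors as the representations (the dimensionality and neighbour count are illustrative):
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_index(train_representations):
    """train_representations: (n_train, dim) array of [CLS] vectors."""
    return NearestNeighbors(n_neighbors=5, metric="cosine").fit(train_representations)

def explain(index, test_representation):
    """Return indices of the training examples closest to the test input."""
    _, neighbor_ids = index.kneighbors(test_representation.reshape(1, -1))
    return neighbor_ids[0]

train_vectors = np.random.rand(100, 768)  # stand-in for real [CLS] vectors
index = build_index(train_vectors)
print(explain(index, np.random.rand(768)))
```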
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling [4.525267347429154]
We train a Transformer-based neural model conditioned on the BERT language model.
In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size.
The results of our models are compared to a baseline and the state-of-the-art models on the CNN/Daily Mail dataset.
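A minimal sketch of chunk-wise processing in the spirit of BERT-windowing; the window size and stride are illustrative assumptions.
```python
def window_tokens(token_ids, window_size=512, stride=256):
    """Yield overlapping chunks of at most `window_size` tokens."""
    for start in range(0, max(len(token_ids) - stride, 1), stride):
        yield token_ids[start:start + window_size]

# Each window would be encoded separately and the representations merged
# (e.g. averaged over the overlapping regions) before decoding the summary.
```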
arXiv Detail & Related papers (2020-03-29T14:00:17Z)
- Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields comparable or better results than state-of-the-art, zero-shot cross-lingual transfer methods.
arXiv Detail & Related papers (2020-01-30T16:58:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.