Positional Attention for Efficient BERT-Based Named Entity Recognition
- URL: http://arxiv.org/abs/2505.01868v1
- Date: Sat, 03 May 2025 17:17:05 GMT
- Title: Positional Attention for Efficient BERT-Based Named Entity Recognition
- Authors: Mo Sun, Siheng Xiong, Yuankai Cai, Bowen Zuo,
- Abstract summary: We present a framework for Named Entity Recognition (NER) leveraging the Bidirectional Representations from Transformers (BERT) model in natural language processing (NLP)<n>We propose a cost-efficient approach that integrates positional attention mechanisms into the entity recognition process and enables effective customization using pre-trained parameters.<n>This work contributes to the field by offering a practical solution for reducing the training cost of BERT-based NER systems while maintaining high accuracy.
- Score: 1.2345322051083512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a framework for Named Entity Recognition (NER) leveraging the Bidirectional Encoder Representations from Transformers (BERT) model in natural language processing (NLP). NER is a fundamental task in NLP with broad applicability across downstream applications. While BERT has established itself as a state-of-the-art model for entity recognition, fine-tuning it from scratch for each new application is computationally expensive and time-consuming. To address this, we propose a cost-efficient approach that integrates positional attention mechanisms into the entity recognition process and enables effective customization using pre-trained parameters. The framework is evaluated on a Kaggle dataset derived from the Groningen Meaning Bank corpus and achieves strong performance with fewer training epochs. This work contributes to the field by offering a practical solution for reducing the training cost of BERT-based NER systems while maintaining high accuracy.
Related papers
- On Significance of Subword tokenization for Low Resource and Efficient
Named Entity Recognition: A case study in Marathi [1.6383036433216434]
We focus on NER for low-resource language and present our case study in the context of the Indian language Marathi.
We propose a hybrid approach for efficient NER by integrating a BERT-based subword tokenizer into vanilla CNN/LSTM models.
We show that this simple approach of replacing a traditional word-based tokenizer with a BERT-tokenizer brings the accuracy of vanilla single-layer models closer to that of deep pre-trained models like BERT.
arXiv Detail & Related papers (2023-12-03T06:53:53Z) - DPBERT: Efficient Inference for BERT based on Dynamic Planning [11.680840266488884]
Existing input-adaptive inference methods fail to take full advantage of the structure of BERT.
We propose Dynamic Planning in BERT, a novel fine-tuning strategy that can accelerate the inference process of BERT.
Our method reduces latency to 75% while maintaining 98% accuracy, yielding a better accuracy-speed trade-off compared to state-of-the-art input-adaptive methods.
arXiv Detail & Related papers (2023-07-26T07:18:50Z) - Global Pointer: Novel Efficient Span-based Approach for Named Entity
Recognition [7.226094340165499]
Named entity recognition (NER) task aims at identifying entities from a piece of text that belong to predefined semantic types.
The state-of-the-art solutions for flat entities NER commonly suffer from capturing the fine-grained semantic information in underlying texts.
We propose a novel span-based NER framework, namely Global Pointer (GP), that leverages the relative positions through a multiplicative attention mechanism.
arXiv Detail & Related papers (2022-08-05T09:19:46Z) - BiBERT: Accurate Fully Binarized BERT [69.35727280997617]
BiBERT is an accurate fully binarized BERT to eliminate the performance bottlenecks.
Our method yields impressive 56.3 times and 31.2 times saving on FLOPs and model size.
arXiv Detail & Related papers (2022-03-12T09:46:13Z) - Distantly-Supervised Named Entity Recognition with Noise-Robust Learning
and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z) - AutoTriggER: Label-Efficient and Robust Named Entity Recognition with
Auxiliary Trigger Extraction [54.20039200180071]
We present a novel framework to improve NER performance by automatically generating and leveraging entity triggers''
Our framework leverages post-hoc explanation to generate rationales and strengthens a model's prior knowledge using an embedding technique.
AutoTriggER shows strong label-efficiency, is capable of generalizing to unseen entities, and outperforms the RoBERTa-CRF baseline by nearly 0.5 F1 points on average.
arXiv Detail & Related papers (2021-09-10T08:11:56Z) - Coarse-to-Fine Pre-training for Named Entity Recognition [26.00489191164784]
We propose a NER-specific pre-training framework to in-ject coarse-to-fine automatically mined entityknowledge into pre-trained models.
Our framework achieves significant improvements against several pre-trained base-lines, establishing the new state-of-the-art per-formance on three benchmarks.
arXiv Detail & Related papers (2020-10-16T07:39:20Z) - BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant
Supervision [49.42215511723874]
We propose a new computational framework -- BOND -- to improve the prediction performance of NER models.
Specifically, we propose a two-stage training algorithm: In the first stage, we adapt the pre-trained language model to the NER tasks using the distant labels.
In the second stage, we drop the distant labels, and propose a self-training approach to further improve the model performance.
arXiv Detail & Related papers (2020-06-28T04:55:39Z) - MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive.
ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator.
We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z) - Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation [84.64004917951547]
Fine-tuning pre-trained language models like BERT has become an effective way in NLP.
In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation.
arXiv Detail & Related papers (2020-02-24T16:17:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.