Related papers: Positional Attention for Efficient BERT-Based Named Entity Recognition

Positional Attention for Efficient BERT-Based Named Entity Recognition

URL: http://arxiv.org/abs/2505.01868v1
Date: Sat, 03 May 2025 17:17:05 GMT
Title: Positional Attention for Efficient BERT-Based Named Entity Recognition
Authors: Mo Sun, Siheng Xiong, Yuankai Cai, Bowen Zuo
Abstract summary: We present a framework for Named Entity Recognition (NER) leveraging the Bidirectional Encoder Representations from Transformers (BERT) model in natural language processing (NLP). We propose a cost-efficient approach that integrates positional attention mechanisms into the entity recognition process and enables effective customization using pre-trained parameters. This work contributes to the field by offering a practical solution for reducing the training cost of BERT-based NER systems while maintaining high accuracy.
Score: 1.2345322051083512
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents a framework for Named Entity Recognition (NER) leveraging the Bidirectional Encoder Representations from Transformers (BERT) model in natural language processing (NLP). NER is a fundamental task in NLP with broad applicability across downstream applications. While BERT has established itself as a state-of-the-art model for entity recognition, fine-tuning it from scratch for each new application is computationally expensive and time-consuming. To address this, we propose a cost-efficient approach that integrates positional attention mechanisms into the entity recognition process and enables effective customization using pre-trained parameters. The framework is evaluated on a Kaggle dataset derived from the Groningen Meaning Bank corpus and achieves strong performance with fewer training epochs. This work contributes to the field by offering a practical solution for reducing the training cost of BERT-based NER systems while maintaining high accuracy.

Related papers

A Low-Cost Real-Time Framework for Industrial Action Recognition Using Foundation Models [8.654703129948901]
Action recognition in industrial environments faces persistent challenges due to high deployment costs, poor cross-scenario generalization, and limited real-time performance. We propose a low-cost real-time framework for industrial action recognition using foundation models, denoted as LRIAR, to enhance recognition accuracy and transferability.
arXiv Detail & Related papers (2024-03-13T11:11:59Z)
On Significance of Subword tokenization for Low Resource and Efficient Named Entity Recognition: A case study in Marathi [1.6383036433216434]
We focus on NER for low-resource language and present our case study in the context of the Indian language Marathi. We propose a hybrid approach for efficient NER by integrating a BERT-based subword tokenizer into vanilla CNN/LSTM models. We show that this simple approach of replacing a traditional word-based tokenizer with a BERT-tokenizer brings the accuracy of vanilla single-layer models closer to that of deep pre-trained models like BERT.
arXiv Detail & Related papers (2023-12-03T06:53:53Z)
DPBERT: Efficient Inference for BERT based on Dynamic Planning [11.680840266488884]
Existing input-adaptive inference methods fail to take full advantage of the structure of BERT. We propose Dynamic Planning in BERT, a novel fine-tuning strategy that can accelerate the inference process of BERT. Our method reduces latency to 75% while maintaining 98% accuracy, yielding a better accuracy-speed trade-off compared to state-of-the-art input-adaptive methods.
arXiv Detail & Related papers (2023-07-26T07:18:50Z)
Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition [7.226094340165499]
Named entity recognition (NER) task aims at identifying entities from a piece of text that belong to predefined semantic types. The state-of-the-art solutions for flat entities NER commonly suffer from capturing the fine-grained semantic information in underlying texts. We propose a novel span-based NER framework, namely Global Pointer (GP), that leverages the relative positions through a multiplicative attention mechanism.
arXiv Detail & Related papers (2022-08-05T09:19:46Z)
BiBERT: Accurate Fully Binarized BERT [69.35727280997617]
BiBERT is an accurate fully binarized BERT to eliminate the performance bottlenecks. Our method yields impressive 56.3 times and 31.2 times saving on FLOPs and model size.
arXiv Detail & Related papers (2022-03-12T09:46:13Z)
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data. We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step. Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
AutoTriggER: Label-Efficient and Robust Named Entity Recognition with Auxiliary Trigger Extraction [54.20039200180071]
We present a novel framework to improve NER performance by automatically generating and leveraging "entity triggers". Our framework leverages post-hoc explanation to generate rationales and strengthens a model's prior knowledge using an embedding technique. AutoTriggER shows strong label-efficiency, is capable of generalizing to unseen entities, and outperforms the RoBERTa-CRF baseline by nearly 0.5 F1 points on average.
arXiv Detail & Related papers (2021-09-10T08:11:56Z)
Coarse-to-Fine Pre-training for Named Entity Recognition [26.00489191164784]
We propose a NER-specific pre-training framework to inject coarse-to-fine automatically mined entity knowledge into pre-trained models. Our framework achieves significant improvements against several pre-trained baselines, establishing the new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2020-10-16T07:39:20Z)
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision [49.42215511723874]
We propose a new computational framework -- BOND -- to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm: In the first stage, we adapt the pre-trained language model to the NER tasks using the distant labels. In the second stage, we drop the distant labels, and propose a self-training approach to further improve the model performance.
arXiv Detail & Related papers (2020-06-28T04:55:39Z)
MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive. ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator. We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z)
Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation [84.64004917951547]
Fine-tuning pre-trained language models like BERT has become an effective way in NLP. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation.
arXiv Detail & Related papers (2020-02-24T16:17:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.