myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging
- URL: http://arxiv.org/abs/2504.04038v1
- Date: Sat, 05 Apr 2025 03:13:33 GMT
- Title: myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging
- Authors: Kaung Lwin Thant, Kwankamol Nongpong, Ye Kyaw Thu, Thura Aung, Khaing Hsu Wai, Thazin Myint Oo
- Abstract summary: We introduce myNER, a novel word-level NER corpus featuring a 7-tag annotation scheme. We also conduct a comprehensive evaluation of NER models, including Conditional Random Fields (CRF), Bidirectional LSTM (BiLSTM)-CRF, and their combinations with fastText embeddings. Experiments reveal the effectiveness of contextualized word embeddings and the impact of joint training with POS tagging.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named Entity Recognition (NER) involves identifying and categorizing named entities within textual data. Despite its significance, NER research has often overlooked low-resource languages like Myanmar (Burmese), primarily due to the lack of publicly available annotated datasets. To address this, we introduce myNER, a novel word-level NER corpus featuring a 7-tag annotation scheme, enriched with Part-of-Speech (POS) tagging to provide additional syntactic information. Alongside the corpus, we conduct a comprehensive evaluation of NER models, including Conditional Random Fields (CRF), Bidirectional LSTM (BiLSTM)-CRF, and their combinations with fastText embeddings in different settings. Our experiments reveal the effectiveness of contextualized word embeddings and the impact of joint training with POS tagging, demonstrating significant performance improvements across models. The traditional CRF joint-task model with fastText embeddings as a feature achieved the best result, with an accuracy of 0.9818, a weighted F1 score of 0.9811, and a macro F1 score of 0.7429. The BiLSTM-CRF model performed best with fine-tuned fastText embeddings, reaching an accuracy of 0.9791, a weighted F1 score of 0.9776, and a macro F1 score of 0.7395.
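The abstract reports weighted F1 scores near 0.98 alongside macro F1 scores near 0.74. A gap like this typically arises when one frequent tag dominates the token count while rarer entity tags are harder to predict. The following minimal Python sketch illustrates the effect on toy data; the tag names `O` and `LOC`, the token counts, and the helper `f1_scores` are hypothetical and are not taken from the myNER corpus or the paper's evaluation code.

```python
from collections import Counter


def f1_scores(gold, pred):
    """Return (per_class_f1, macro_f1, weighted_f1) for two equal-length label lists."""
    labels = sorted(set(gold) | set(pred))
    support = Counter(gold)  # number of gold tokens per label
    per_class = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro F1: unweighted mean over labels, so rare tags count as much as frequent ones
    macro = sum(per_class.values()) / len(labels)
    # Weighted F1: mean over labels weighted by gold support, so frequent tags dominate
    weighted = sum(per_class[l] * support[l] for l in labels) / len(gold)
    return per_class, macro, weighted


# Hypothetical toy data: 96 non-entity tokens and 4 location tokens.
gold = ["O"] * 96 + ["LOC"] * 4
# The toy "model" tags every O correctly but finds only 1 of the 4 LOC tokens.
pred = ["O"] * 96 + ["O"] * 3 + ["LOC"]

per_class, macro, weighted = f1_scores(gold, pred)
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)

print(f"accuracy    = {accuracy:.4f}")
print(f"weighted F1 = {weighted:.4f}")
print(f"macro F1    = {macro:.4f}")
```

In this toy setting, accuracy and weighted F1 stay high because the frequent tag is almost always right, while macro F1 drops because the rare tag's low score counts equally in the average.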
Related papers
- Binary Token-Level Classification with DeBERTa for All-Type MWE Identification: A Lightweight Approach with Linguistic Enhancement [1.8429656136522097]
We present a comprehensive approach for multiword expression (MWE) identification that combines binary token-level classification, linguistic feature integration, and data augmentation. Our DeBERTa-v3-large model achieves 69.8% F1 on the CoAM dataset, surpassing the best results (Qwen-72B, 57.8% F1) on this dataset by 12 points while using 165x fewer parameters.
arXiv Detail & Related papers (2026-01-27T08:42:54Z) - Using Large Language Model for End-to-End Chinese ASR and NER [35.876792804001646]
We present an encoder-decoder architecture that incorporates speech features through cross-attention.
We compare these two approaches using Chinese automatic speech recognition (ASR) and named entity recognition (NER) tasks.
Our experiments reveal that encoder-decoder architecture outperforms decoder-only architecture with a short context.
arXiv Detail & Related papers (2024-01-21T03:15:05Z) - Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z) - RGAT: A Deeper Look into Syntactic Dependency Information for Coreference Resolution [8.017036537163008]
We propose an end-to-end resolution that combines pre-trained BERT with a Syntactic Relation Graph Attention Network (RGAT).
In particular, the RGAT model is first proposed, then used to understand the syntactic dependency graph and learn better task-specific syntactic embeddings.
An integrated architecture incorporating BERT embeddings and syntactic embeddings is constructed to generate blending representations for the downstream task.
arXiv Detail & Related papers (2023-09-10T09:46:38Z) - A Global Context Mechanism for Sequence Labeling [3.237003512894164]
Global sentence information is crucial for sequence labeling tasks, where each word in a sentence must be assigned a label. Previous work has proposed various RNN variants to integrate global sentence information into word representations. We introduce a simple yet effective mechanism that addresses these limitations. Our approach efficiently supplements global sentence information for both BiLSTM and transformer-based models.
arXiv Detail & Related papers (2023-05-31T15:05:25Z) - Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [49.15931834209624]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world. We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique. By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate on the overall cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z) - BanglaCoNER: Towards Robust Bangla Complex Named Entity Recognition [0.0]
We present the winning solution of Bangla Complex Named Entity Recognition Challenge.
The dataset consisted of 15300 sentences for training and 800 sentences for validation, in the .conll format.
Our findings also demonstrate the efficacy of Deep Learning models such as BanglaBERT for NER in Bangla language.
arXiv Detail & Related papers (2023-03-16T13:31:31Z) - Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z) - ConNER: Consistency Training for Cross-lingual Named Entity Recognition [96.84391089120847]
Cross-lingual named entity recognition suffers from data scarcity in the target languages.
We propose ConNER as a novel consistency training framework for cross-lingual NER.
arXiv Detail & Related papers (2022-11-17T07:57:54Z) - CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z) - NEAR: Named Entity and Attribute Recognition of clinical concepts [2.4278445972594525]
This research aims to contribute to the area of detecting entities and their corresponding attributes by modelling the NER task as a supervised, multi-label tagging problem.
We propose 3 architectures to achieve this multi-label entity tagging: BiLSTM n-CRF, BiLSTM-CRF-Smax-TF and BiLSTM n-CRF-TF.
Our different models obtain best NER F1 scores of 0.894 and 0.808 on the i2b2 2010/VA and i2b2 2012 datasets respectively.
arXiv Detail & Related papers (2022-08-30T01:46:11Z) - Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification [78.83284164605473]
Funnelling (Fun) is a recently proposed method for cross-lingual text classification.
We describe Generalized Funnelling (gFun) as a generalization of Fun.
We show that gFun substantially improves over Fun and over state-of-the-art baselines.
arXiv Detail & Related papers (2021-09-17T23:33:04Z) - Recognizing Chinese Judicial Named Entity using BiLSTM-CRF [10.676125626144142]
We propose a deep learning-based method named BiLSTM-CRF, which consists of bi-directional long short-term memory (BiLSTM) and conditional random fields (CRF).
To validate our method, we perform experiments on judgment documents covering commutation, parole, and temporary service outside prison, which are acquired from China Judgments Online.
Experimental results show an accuracy of 0.876, a recall of 0.856, and an F1 score of 0.855, which suggests the superiority of the proposed BiLSTM-CRF with Adam.
arXiv Detail & Related papers (2020-05-31T08:13:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.