Neural Text Classification and Stacked Heterogeneous Embeddings for
Named Entity Recognition in SMM4H 2021
- URL: http://arxiv.org/abs/2106.05823v2
- Date: Fri, 11 Jun 2021 13:23:15 GMT
- Authors: Usama Yaseen, Stefan Langer
- Abstract summary: We addressed Named Entity Recognition (NER) and Text Classification.
To address NER we explored BiLSTM-CRF with Stacked Heterogeneous Embeddings and linguistic features.
Our proposed approaches can be generalized to different languages and we have shown their effectiveness for English and Spanish.
- Score: 1.195496689595016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents our findings from participating in the SMM4H Shared Task
2021. We addressed Named Entity Recognition (NER) and Text Classification. To
address NER we explored BiLSTM-CRF with Stacked Heterogeneous Embeddings and
linguistic features. We investigated various machine learning algorithms
(logistic regression, Support Vector Machine (SVM) and Neural Networks) to
address text classification. Our proposed approaches can be generalized to
different languages and we have shown their effectiveness for English and
Spanish. Our text classification submissions (team:MIC-NLP) have achieved
competitive performance with F1-scores of $0.46$ and $0.90$ on ADE
Classification (Task 1a) and Profession Classification (Task 7a) respectively.
In the case of NER, our submissions achieved F1-scores of $0.50$ and $0.82$ on ADE
Span Detection (Task 1b) and Profession Span detection (Task 7b) respectively.
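The span detection results above are reported as F1-scores. As a rough illustration of how such a score can be computed, the sketch below uses strict matching, where a predicted span counts only if its offsets and label exactly match a gold span. The matching rule, the `span_f1` helper, and the example spans are assumptions made for illustration; the shared task's official evaluation may differ (for example, by also crediting overlapping spans).

```python
from typing import Set, Tuple

# A span is (start_offset, end_offset, label). This representation is an
# assumption for illustration, not the shared task's data format.
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict span matching."""
    # A prediction is correct only if offsets and label all match a gold span.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: three gold spans, two predictions, one exact match.
gold = {(0, 8, "ADE"), (15, 23, "ADE"), (30, 36, "ADE")}
predicted = {(0, 8, "ADE"), (15, 20, "ADE")}
precision, recall, f1 = span_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this example the second prediction has the wrong end offset, so it earns no credit under strict matching. A relaxed variant would instead count any prediction that overlaps a gold span.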
Related papers
- myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging [0.0]
We introduce myNER, a novel word-level NER corpus featuring a 7-tag annotation scheme.
We also conduct a comprehensive evaluation of NER models, including Conditional Random Fields (CRF), Bidirectional LSTM (BiLSTM)-CRF, and their combinations with fastText embeddings.
Experiments reveal the effectiveness of contextualized word embeddings and the impact of joint training with POS tagging.
arXiv Detail & Related papers (2025-04-05T03:13:33Z)
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- Enhancing Pashto Text Classification using Language Processing Techniques for Single And Multi-Label Analysis [0.0]
This study aims to establish an automated classification system for Pashto text.
The study achieved an average testing accuracy rate of 94%.
The use of pre-trained language representation models, such as DistilBERT, showed promising results.
arXiv Detail & Related papers (2023-05-04T23:11:31Z)
- Neural Coreference Resolution based on Reinforcement Learning [53.73316523766183]
Coreference resolution systems need to solve two subtasks.
One task is to detect all of the potential mentions, the other is to learn the linking of an antecedent for each possible mention.
We propose a reinforcement learning actor-critic-based neural coreference resolution system.
arXiv Detail & Related papers (2022-12-18T07:36:35Z)
- NEAR: Named Entity and Attribute Recognition of clinical concepts [2.4278445972594525]
This research aims to contribute to the area of detecting entities and their corresponding attributes by modelling the NER task as a supervised, multi-label tagging problem.
We propose 3 architectures to achieve this multi-label entity tagging: BiLSTM n-CRF, BiLSTM-CRF-Smax-TF and BiLSTM n-CRF-TF.
Our different models obtain best NER F1 scores of 0.894 and 0.808 on the i2b2 2010/VA and i2b2 2012 datasets respectively.
arXiv Detail & Related papers (2022-08-30T01:46:11Z)
- RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is a report of the organizers on the first competition of argumentation analysis systems dealing with Russian language texts.
A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared.
The system that won first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
- Detecting Handwritten Mathematical Terms with Sensor Based Data [71.84852429039881]
We propose a solution to the UbiComp 2021 Challenge by Stabilo in which handwritten mathematical terms are supposed to be automatically classified.
The input data set contains data of different writers, with label strings constructed from a total of 15 different possible characters.
arXiv Detail & Related papers (2021-09-12T19:33:34Z)
- An Attention Ensemble Approach for Efficient Text Classification of Indian Languages [0.0]
This paper focuses on the coarse-grained technical domain identification of short text documents in Marathi, a Devanagari script-based Indian language.
A hybrid CNN-BiLSTM attention ensemble model is proposed that competently combines the intermediate sentence representations generated by the convolutional neural network and the bidirectional long short-term memory, leading to efficient text classification.
Experimental results show that the proposed model outperforms various baseline machine learning and deep learning models in the given task, giving the best validation accuracy of 89.57% and f1-score of 0.8875.
arXiv Detail & Related papers (2021-02-20T07:31:38Z)
- Arabic Speech Recognition by End-to-End, Modular Systems and Human [56.96327247226586]
We perform a comprehensive benchmarking for end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition.
For ASR the end-to-end work led to 12.5%, 27.5%, 23.8% WER; a new performance milestone for the MGB2, MGB3, and MGB5 challenges respectively.
Our results suggest that human performance in the Arabic language is still considerably better than the machine with an absolute WER gap of 3.6% on average.
arXiv Detail & Related papers (2021-01-21T05:55:29Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
- Relation Detection for Indonesian Language using Deep Neural Network -- Support Vector Machine [0.0]
We employ a neural network to detect relations between two named entities in the Indonesian language.
We used features such as word embeddings, position embeddings, POS-tag embeddings, and character embeddings.
The best result is 0.8083 on F1-Score using Convolutional Layer as front-part and SVM as back-part.
arXiv Detail & Related papers (2020-09-12T01:45:08Z)
- NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection [11.98821166621488]
We describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019.
Our system achieves promising results, especially by combining the different techniques, and reaches up to 88.6% F1 in the competition.
arXiv Detail & Related papers (2020-07-02T11:17:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.