Related papers: Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021

Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021

URL: http://arxiv.org/abs/2106.05823v2
Date: Fri, 11 Jun 2021 13:23:15 GMT
Title: Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021
Authors: Usama Yaseen, Stefan Langer
Abstract summary: We addressed Named Entity Recognition (NER) and Text Classification. To address NER we explored BiLSTM-CRF with Stacked Heterogeneous Embeddings and linguistic features. Our proposed approaches can be generalized to different languages and we have shown its effectiveness for English and Spanish.
Score: 1.195496689595016
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper presents our findings from participating in the SMM4H Shared Task 2021. We addressed Named Entity Recognition (NER) and Text Classification. To address NER we explored BiLSTM-CRF with Stacked Heterogeneous Embeddings and linguistic features. We investigated various machine learning algorithms (logistic regression, Support Vector Machine (SVM) and Neural Networks) to address text classification. Our proposed approaches can be generalized to different languages and we have shown its effectiveness for English and Spanish. Our text classification submissions (team:MIC-NLP) have achieved competitive performance with F1-score of $0.46$ and $0.90$ on ADE Classification (Task 1a) and Profession Classification (Task 7a) respectively. In the case of NER, our submissions scored F1-score of $0.50$ and $0.82$ on ADE Span Detection (Task 1b) and Profession Span detection (Task 7b) respectively.

Related papers

myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging [0.0]
We introduce myNER, a novel word-level NER corpus featuring a 7-tag annotation scheme. We also conduct a comprehensive evaluation of NER models, including Conditional Random Fields (CRF), Bidirectional LSTM (BiLSTM)-CRF, and their combinations with fastText embeddings. Experiments reveal the effectiveness of contextualized word embeddings and the impact of joint training with POS tagging.
arXiv Detail & Related papers (2025-04-05T03:13:33Z)
Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types. We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
Enhancing Pashto Text Classification using Language Processing Techniques for Single And Multi-Label Analysis [0.0]
This study aims to establish an automated classification system for Pashto text. The study achieved an average testing accuracy rate of 94%. The use of pre-trained language representation models, such as DistilBERT, showed promising results.
arXiv Detail & Related papers (2023-05-04T23:11:31Z)
Neural Coreference Resolution based on Reinforcement Learning [53.73316523766183]
Coreference resolution systems need to solve two subtasks. One task is to detect all of the potential mentions, the other is to learn the linking of an antecedent for each possible mention. We propose a reinforcement learning actor-critic-based neural coreference resolution system.
arXiv Detail & Related papers (2022-12-18T07:36:35Z)
NEAR: Named Entity and Attribute Recognition of clinical concepts [2.4278445972594525]
This research aims to contribute to the area of detecting entities and their corresponding attributes by modelling the NER task as a supervised, multi-label tagging problem. We propose 3 architectures to achieve this multi-label entity tagging: BiLSTM n-CRF, BiLSTM-CRF-Smax-TF and BiLSTM n-CRF-TF. Our different models obtain best NER F1 scores of 0. 894 and 0.808 on the i2b2 2010/VA and i2b2 2012 datasets respectively.
arXiv Detail & Related papers (2022-08-30T01:46:11Z)
RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is a report of the organizers on the first competition of argumentation analysis systems dealing with Russian language texts. A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared. The system that won the first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
Detecting Handwritten Mathematical Terms with Sensor Based Data [71.84852429039881]
We propose a solution to the UbiComp 2021 Challenge by Stabilo in which handwritten mathematical terms are supposed to be automatically classified. The input data set contains data of different writers, with label strings constructed from a total of 15 different possible characters.
arXiv Detail & Related papers (2021-09-12T19:33:34Z)
An Attention Ensemble Approach for Efficient Text Classification of Indian Languages [0.0]
This paper focuses on the coarse-grained technical domain identification of short text documents in Marathi, a Devanagari script-based Indian language. A hybrid CNN-BiLSTM attention ensemble model is proposed that competently combines the intermediate sentence representations generated by the convolutional neural network and the bidirectional long short-term memory, leading to efficient text classification. Experimental results show that the proposed model outperforms various baseline machine learning and deep learning models in the given task, giving the best validation accuracy of 89.57% and f1-score of 0.8875.
arXiv Detail & Related papers (2021-02-20T07:31:38Z)
Arabic Speech Recognition by End-to-End, Modular Systems and Human [56.96327247226586]
We perform a comprehensive benchmarking for end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition. For ASR the end-to-end work led to 12.5%, 27.5%, 23.8% WER; a new performance milestone for the MGB2, MGB3, and MGB5 challenges respectively. Our results suggest that human performance in the Arabic language is still considerably better than the machine with an absolute WER gap of 3.6% on average.
arXiv Detail & Related papers (2021-01-21T05:55:29Z)
Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR) AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities. Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
Relation Detection for Indonesian Language using Deep Neural Network -- Support Vector Machine [0.0]
We employ neural network to do relation detection between two named entities for Indonesian Language. We used feature such as word embedding, position embedding, POS-Tag embedding, and character embedding. The best result is 0.8083 on F1-Score using Convolutional Layer as front-part and SVM as back-part.
arXiv Detail & Related papers (2020-09-12T01:45:08Z)
NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection [11.98821166621488]
We describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019. Our system achieves promising results, especially by combining the different techniques, and reaches up to 88.6% F1 in the competition.
arXiv Detail & Related papers (2020-07-02T11:17:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.