NEAR: Named Entity and Attribute Recognition of clinical concepts
- URL: http://arxiv.org/abs/2208.13949v1
- Date: Tue, 30 Aug 2022 01:46:11 GMT
- Title: NEAR: Named Entity and Attribute Recognition of clinical concepts
- Authors: Namrata Nath, Sang-Heon Lee, Ivan Lee
- Abstract summary: This research aims to contribute to the area of detecting entities and their corresponding attributes by modelling the NER task as a supervised, multi-label tagging problem.
We propose 3 architectures to achieve this multi-label entity tagging: BiLSTM n-CRF, BiLSTM-CRF-Smax-TF and BiLSTM n-CRF-TF.
Our different models obtain best NER F1 scores of 0.894 and 0.808 on the i2b2 2010/VA and i2b2 2012 datasets respectively.
- Score: 2.4278445972594525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Named Entity Recognition (NER) or the extraction of concepts from clinical
text is the task of identifying entities in text and slotting them into
categories such as problems, treatments, tests, clinical departments,
occurrences (such as admission and discharge) and others. NER forms a critical
component of processing and leveraging unstructured data from Electronic Health
Records (EHR). While identifying the spans and categories of concepts is itself
a challenging task, these entities could also have attributes such as negation
that pivot their meanings implied to the consumers of the named entities. There
has been little research dedicated to identifying the entities and their
qualifying attributes together. This research hopes to contribute to the area
of detecting entities and their corresponding attributes by modelling the NER
task as a supervised, multi-label tagging problem with each of the attributes
assigned tagging sequence labels. In this paper, we propose 3 architectures to
achieve this multi-label entity tagging: BiLSTM n-CRF, BiLSTM-CRF-Smax-TF and
BiLSTM n-CRF-TF. We evaluate these methods on the 2010 i2b2/VA and the i2b2
2012 shared task datasets. Our different models obtain best NER F1 scores of
0.894 and 0.808 on the i2b2 2010/VA and i2b2 2012 datasets respectively. The highest
span-based micro-averaged F1 polarity scores obtained were 0.832 and 0.836 on the
i2b2 2010/VA and i2b2 2012 datasets respectively, and the highest
macro-averaged F1 polarity scores obtained were 0.924 and 0.888 respectively.
The modality studies conducted on the i2b2 2012 dataset revealed high scores of
0.818 and 0.501 for span-based micro-averaged F1 and macro-averaged F1
respectively.
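As a rough illustration of the formulation the abstract describes, the sketch below gives each token one tag sequence per label type (entity and polarity) and scores the polarity spans with micro- and macro-averaged F1. It is a minimal sketch under stated assumptions, not the authors' implementation: the sentence, the label names (`problem`, `test`, `POS`, `NEG`), the helper names, and the exact-match span scoring are all hypothetical choices made for the example.

```python
from collections import defaultdict

def bio_spans(tags):
    """Return the set of (start, end, label) spans in a BIO sequence (end exclusive).

    Stray I- tags that do not continue an open span are treated like O.
    """
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        if start is not None and tag.startswith("I-") and tag[2:] == label:
            continue  # still inside the current span
        if start is not None:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def span_f1(gold, pred):
    """Exact-match span F1: micro pools counts, macro averages per-label F1."""
    counts = defaultdict(lambda: [0, 0, 0])  # label -> [tp, fp, fn]
    for span in pred:
        counts[span[2]][0 if span in gold else 1] += 1
    for span in gold - pred:
        counts[span[2]][2] += 1
    micro = f1(*(sum(c[i] for c in counts.values()) for i in range(3)))
    macro = sum(f1(*c) for c in counts.values()) / len(counts) if counts else 0.0
    return micro, macro

# Hypothetical sentence with one tag sequence per label type (multi-label tagging).
tokens        = ["No", "evidence", "of", "pneumonia", "after", "chest", "x-ray"]
entity_gold   = ["O", "O", "O", "B-problem", "O", "B-test", "I-test"]
polarity_gold = ["O", "O", "O", "B-NEG", "O", "B-POS", "I-POS"]
polarity_pred = ["O", "O", "O", "B-POS", "O", "B-POS", "I-POS"]  # negation missed

entity_spans = bio_spans(entity_gold)
micro, macro = span_f1(bio_spans(polarity_gold), bio_spans(polarity_pred))
print(f"micro-F1={micro:.3f} macro-F1={macro:.3f}")
```

In this toy case the single missed negation lowers the macro average more than the micro average, because macro-averaging weights the rarer `NEG` label equally with `POS`. This is one reason the two averages can diverge.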
Related papers
- myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging [0.0]
We introduce myNER, a novel word-level NER corpus featuring a 7-tag annotation scheme.
We also conduct a comprehensive evaluation of NER models, including Conditional Random Fields (CRF), Bidirectional LSTM (BiLSTM)-CRF, and their combinations with fastText embeddings.
Experiments reveal the effectiveness of contextualized word embeddings and the impact of joint training with POS tagging.
arXiv Detail & Related papers (2025-04-05T03:13:33Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- A Federated Learning Framework for Stenosis Detection [70.27581181445329]
This study explores the use of Federated Learning (FL) for stenosis detection in coronary angiography (CA) images.
Two heterogeneous datasets from two institutions were considered: dataset 1 includes 1219 images from 200 patients, which we acquired at the Ospedale Riuniti of Ancona (Italy);
dataset 2 includes 7492 sequential images from 90 patients from a previous study available in the literature.
arXiv Detail & Related papers (2023-10-30T11:13:40Z) - EMAHA-DB1: A New Upper Limb sEMG Dataset for Classification of
Activities of Daily Living [8.854624631197941]
The dataset is acquired from 25 able-bodied subjects while performing 22 activities.
The state-of-the-art classification accuracy on five FAABOS categories is 83.21%.
The developed dataset can be used as a benchmark for various classification methods.
arXiv Detail & Related papers (2023-01-09T13:20:45Z) - Exploring the Value of Pre-trained Language Models for Clinical Named
Entity Recognition [6.917786124918387]
We compare Transformer models that are trained from scratch to fine-tuned BERT-based LLMs.
We examine the impact of an additional CRF layer on such models to encourage contextual learning.
arXiv Detail & Related papers (2022-10-23T16:27:31Z) - Deeper Clinical Document Understanding Using Relation Extraction [0.0]
We propose a text mining framework comprising Named Entity Recognition (NER) and Relation Extraction (RE) models.
We introduce two new RE model architectures: an accuracy-optimized one based on BioBERT and a speed-optimized one utilizing crafted features over a Fully Connected Neural Network (FCNN).
We show two practical applications of this framework -- for building a biomedical knowledge graph and for improving the accuracy of mapping entities to clinical codes.
arXiv Detail & Related papers (2021-12-25T17:14:13Z) - QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query
Attribute Value Extraction [57.56700153507383]
This paper proposes a unified query attribute value extraction system in e-commerce search named QUEACO.
For the NER phase, QUEACO adopts a novel teacher-student network, where a teacher network that is trained on the strongly-labeled data generates pseudo-labels.
For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface form attribute values from queries into canonical forms from products.
arXiv Detail & Related papers (2021-08-19T03:24:23Z) - Neural Text Classification and Stacked Heterogeneous Embeddings for
Named Entity Recognition in SMM4H 2021 [1.195496689595016]
We addressed Named Entity Recognition (NER) and Text Classification.
To address NER we explored BiLSTM-CRF with Stacked Heterogeneous Embeddings and linguistic features.
Our proposed approaches can be generalized to different languages, and we have shown their effectiveness for English and Spanish.
arXiv Detail & Related papers (2021-06-10T15:43:21Z) - Deep learning-based COVID-19 pneumonia classification using chest CT
images: model generalizability [54.86482395312936]
Deep learning (DL) classification models were trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries.
We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split.
The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better.
arXiv Detail & Related papers (2021-02-18T21:14:52Z) - EEG-Inception: An Accurate and Robust End-to-End Neural Network for
EEG-based Motor Imagery Classification [123.93460670568554]
This paper proposes a novel convolutional neural network (CNN) architecture for accurate and robust EEG-based motor imagery (MI) classification.
The proposed CNN model, namely EEG-Inception, is built on the backbone of the Inception-Time network.
The proposed network is an end-to-end classification, as it takes the raw EEG signals as the input and does not require complex EEG signal-preprocessing.
arXiv Detail & Related papers (2021-01-24T19:03:10Z) - Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled
Learning and Conditional Generation with Extra Data [77.31213472792088]
The scarcity of class-labeled data is a ubiquitous bottleneck in many machine learning problems.
We address this problem by leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data.
We present a novel training framework to jointly target both PU classification and conditional generation when exposed to extra data.
arXiv Detail & Related papers (2020-06-14T08:27:40Z) - Adaptive Name Entity Recognition under Highly Unbalanced Data [5.575448433529451]
We present our experiments on a neural architecture composed of a Conditional Random Field (CRF) layer stacked on top of a Bi-directional LSTM (BI-LSTM) layer for solving NER tasks.
We introduce an add-on classification model that splits sentences into two sets, Weak and Strong classes, and then design a pair of Bi-LSTM-CRF models to optimize performance on each set.
arXiv Detail & Related papers (2020-03-10T06:56:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.