A Marker-based Neural Network System for Extracting Social Determinants
of Health
- URL: http://arxiv.org/abs/2212.12800v1
- Date: Sat, 24 Dec 2022 18:40:23 GMT
- Title: A Marker-based Neural Network System for Extracting Social Determinants
of Health
- Authors: Xingmeng Zhao and Anthony Rios
- Abstract summary: The impact of social determinants of health (SDoH) on patients' healthcare quality and the disparity is well-known.
Many SDoH items are not coded in structured forms in electronic health records.
We explore a multi-stage pipeline involving named entity recognition (NER), relation classification (RC), and text classification methods to extract SDoH information from clinical notes automatically.
- Score: 12.6970199179668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objective. The impact of social determinants of health (SDoH) on patients'
healthcare quality and the disparity is well-known. Many SDoH items are not
coded in structured forms in electronic health records. These items are often
captured in free-text clinical notes, but there are limited methods for
automatically extracting them. We explore a multi-stage pipeline involving
named entity recognition (NER), relation classification (RC), and text
classification methods to extract SDoH information from clinical notes
automatically.
Materials and Methods. The study uses the N2C2 Shared Task data, which was
collected from two sources of clinical notes: MIMIC-III and University of
Washington Harborview Medical Centers. It contains 4480 social history sections
with full annotation for twelve SDoHs. In order to handle the issue of
overlapping entities, we developed a novel marker-based NER model. We used it
in a multi-stage pipeline to extract SDoH information from clinical notes.
Results. Our marker-based system outperformed the state-of-the-art span-based
models at handling overlapping entities based on the overall Micro-F1 score
performance. It also achieved state-of-the-art performance compared to the
shared task methods.
Conclusion. The major finding of this study is that the multi-stage pipeline
effectively extracts SDoH information from clinical notes. This approach can
potentially improve the understanding and tracking of SDoHs in clinical
settings. However, error propagation may be an issue, and further research is
needed to improve the extraction of entities with complex semantic meanings and
low-resource entities using external knowledge.
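The abstract does not spell out how the markers are built, so the following is only a minimal sketch of one common marker-based formulation, not the authors' implementation. It assumes that each candidate span is wrapped in typed marker tokens and that every span gets its own marked copy of the sentence, so two overlapping spans never compete for the same token labels. The names `insert_markers` and `build_marked_inputs`, the `<Label>` marker format, and the example labels are hypothetical.

```python
from typing import List, Tuple

# (start token index, end token index (exclusive), label); labels are illustrative only.
Span = Tuple[int, int, str]


def insert_markers(tokens: List[str], span: Span) -> List[str]:
    """Return a copy of `tokens` with typed markers wrapped around one span."""
    start, end, label = span
    if not (0 <= start < end <= len(tokens)):
        raise ValueError(f"span {span} is outside the sentence")
    return (
        tokens[:start]
        + [f"<{label}>"]       # hypothetical opening marker token
        + tokens[start:end]
        + [f"</{label}>"]      # hypothetical closing marker token
        + tokens[end:]
    )


def build_marked_inputs(tokens: List[str], spans: List[Span]) -> List[List[str]]:
    """Build one marked sentence per span, so overlapping spans stay separate."""
    return [insert_markers(tokens, span) for span in spans]


tokens = "She drinks two beers daily".split()
# Two overlapping spans: "two beers" and "beers".
spans = [(2, 4, "Amount"), (3, 4, "Type")]
marked = build_marked_inputs(tokens, spans)
for sentence in marked:
    print(" ".join(sentence))
```

Each marked sentence could then be passed to a downstream classifier or tagger in a later pipeline stage. Because the spans are handled in separate copies, the overlap between "two beers" and "beers" does not need a nested labeling scheme.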
Related papers
- Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study [2.0884301753594334]
This study performs a comparative analysis of various natural language models for medical text classification.
BERT outperforms Bi-LSTM models by up to 28% and the baseline BERT model by up to 16% for recall of the minority classes.
arXiv Detail & Related papers (2024-08-30T10:28:49Z)
- SNOBERT: A Benchmark for clinical notes entity linking in the SNOMED CT clinical terminology [43.89160296332471]
We propose a method for linking text spans in clinical notes to specific concepts in the SNOMED CT using BERT-based models.
The method consists of two stages: candidate selection and candidate matching. The models were trained on one of the largest publicly available datasets of labeled clinical notes.
arXiv Detail & Related papers (2024-05-25T08:00:44Z)
- Multi-task Explainable Skin Lesion Classification [54.76511683427566]
We propose a few-shot-based approach for skin lesions that generalizes well with few labelled data.
The proposed approach comprises a fusion of a segmentation network that acts as an attention module and classification network.
arXiv Detail & Related papers (2023-10-11T05:49:47Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling [0.0]
Expert System and Attention for Labelling (ESAL) is a novel model for retrieving features from medical records.
We use mixture of experts and pre-trained BERT to retrieve the semantics of different categories.
In our experiment, ESAL significantly improved the performance of Medical Information Classification.
arXiv Detail & Related papers (2022-11-28T16:49:13Z)
- Nested Named Entity Recognition from Medical Texts: An Adaptive Shared Network Architecture with Attentive CRF [53.55504611255664]
We propose a novel method, referred to as ASAC, to solve the dilemma caused by the nested phenomenon.
The proposed method contains two key modules: the adaptive shared (AS) part and the attentive conditional random field (ACRF) module.
Our model could learn better entity representations by capturing the implicit distinctions and relationships between different categories of entities.
arXiv Detail & Related papers (2022-11-09T09:23:56Z)
- Ontology-Driven and Weakly Supervised Rare Disease Identification from Clinical Notes [13.096008602034086]
Rare diseases are challenging to identify because few cases are available for machine learning and data annotation requires domain experts.
We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT).
The weakly supervised approach is proposed to learn a phenotype confirmation model to improve Text-to-UMLS linking, without annotated data from domain experts.
arXiv Detail & Related papers (2022-05-11T17:38:24Z)
- Neural networks for Anatomical Therapeutic Chemical (ATC) [83.73971067918333]
We propose combining multiple multi-label classifiers trained on distinct sets of features, including sets extracted from a Bidirectional Long Short-Term Memory Network (BiLSTM)
Experiments demonstrate the power of this approach, which is shown to outperform the best methods reported in the literature.
arXiv Detail & Related papers (2021-01-22T19:49:47Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
- Generating SOAP Notes from Doctor-Patient Conversations Using Modular Summarization Techniques [43.13248746968624]
We introduce the first complete pipelines to leverage deep summarization models to generate SOAP notes.
We propose Cluster2Sent, an algorithm that extracts important utterances relevant to each summary section.
Our results speak to the benefits of structuring summaries into sections and annotating supporting evidence when constructing summarization corpora.
arXiv Detail & Related papers (2020-05-04T19:10:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.