Mining Adverse Drug Reactions from Unstructured Mediums at Scale
- URL: http://arxiv.org/abs/2201.01405v2
- Date: Thu, 6 Jan 2022 01:47:09 GMT
- Title: Mining Adverse Drug Reactions from Unstructured Mediums at Scale
- Authors: Hasham Ul Haq, Veysel Kocaman, David Talby
- Abstract summary: Adverse drug reactions / events (ADR/ADE) have a major impact on patient health and health care costs.
Most ADRs are not reported via formal channels, but they are often documented in unstructured conversations.
We propose a natural language processing (NLP) solution that detects ADRs in such unstructured free-text conversations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adverse drug reactions / events (ADR/ADE) have a major impact on patient
health and health care costs. Detecting ADRs as early as possible and sharing
them with regulators, pharma companies, and healthcare providers can prevent
morbidity and save many lives. While most ADRs are not reported via formal
channels, they are often documented in a variety of unstructured conversations
such as social media posts by patients, customer support call transcripts, or
CRM notes of meetings between healthcare providers and pharma sales reps. In
this paper, we propose a natural language processing (NLP) solution that
detects ADRs in such unstructured free-text conversations, which improves on
previous work in three ways. First, a new Named Entity Recognition (NER) model
obtains new state-of-the-art accuracy for ADR and Drug entity extraction on the
ADE, CADEC, and SMM4H benchmark datasets (91.75%, 78.76%, and 83.41% F1 scores
respectively). Second, two new Relation Extraction (RE) models, one based on
BioBERT and the other using crafted features over a Fully Connected Neural
Network (FCNN), are shown to perform on par with existing state-of-the-art
models, and to outperform them when trained with a supplementary
clinician-annotated RE dataset. Third, a new text classification model, for
deciding if a conversation includes an ADR, obtains new state-of-the-art
accuracy on the CADEC dataset (86.69% F1 score). The complete solution is
implemented as a unified NLP pipeline in a production-grade library built on
top of Apache Spark, making it natively scalable and able to process millions
of batch or streaming records on commodity clusters.
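The F1 scores quoted in the abstract combine precision and recall over extracted entities. The sketch below shows one common way to compute an entity-level F1 score, assuming strict matching (a predicted span counts only if its offsets and label both match a gold span exactly). This summary does not state the paper's evaluation protocol, so the matching rule, the function name `entity_f1`, and the example sentence and spans are all illustrative assumptions.

```python
from typing import Set, Tuple

# A span is (start_char, end_char, label); this representation is an assumption.
Span = Tuple[int, int, str]


def entity_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict span-and-label matching."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sentence: "I got a headache after taking aspirin"
gold = {(8, 16, "ADE"), (30, 37, "DRUG")}
# The predicted DRUG span has a wrong left boundary ("taking aspirin").
pred = {(8, 16, "ADE"), (23, 37, "DRUG")}

precision, recall, f1 = entity_f1(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under strict matching the boundary error costs both a false positive and a false negative; a relaxed (overlap-based) rule would score the same prediction higher, which is why reported F1 values are only comparable when the matching rule is the same.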
Related papers
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
With a reasonable prompt, LLMs can use their generative capability to correct even tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- KESDT: knowledge enhanced shallow and deep Transformer for detecting adverse drug reactions [14.095117843726511]
We propose the Knowledge Enhanced Shallow and Deep Transformer(KESDT) model for ADR detection.
To cope with the first issue, we incorporate the domain keywords into the Transformer model through shallow fusion.
To overcome the scarcity of annotated data, we integrate the synonym sets into the Transformer model through deep fusion.
arXiv Detail & Related papers (2023-08-18T06:10:11Z)
- ADRNet: A Generalized Collaborative Filtering Framework Combining Clinical and Non-Clinical Data for Adverse Drug Reaction Prediction [49.56476929112382]
Adverse drug reaction (ADR) prediction plays a crucial role in both health care and drug discovery.
We propose ADRNet, a generalized collaborative filtering framework combining clinical and non-clinical data for drug-ADR prediction.
arXiv Detail & Related papers (2023-08-03T11:28:12Z)
- A Marker-based Neural Network System for Extracting Social Determinants of Health [12.6970199179668]
The impact of social determinants of health (SDoH) on patients' healthcare quality and disparities is well known.
Many SDoH items are not coded in structured forms in electronic health records.
We explore a multi-stage pipeline involving named entity recognition (NER), relation classification (RC), and text classification methods to extract SDoH information from clinical notes automatically.
arXiv Detail & Related papers (2022-12-24T18:40:23Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- Deeper Clinical Document Understanding Using Relation Extraction [0.0]
We propose a text mining framework comprising Named Entity Recognition (NER) and Relation Extraction (RE) models.
We introduce two new RE model architectures -- an accuracy-optimized one based on BioBERT and a speed-optimized one utilizing crafted features over a Fully Connected Neural Network (FCNN).
We show two practical applications of this framework -- for building a biomedical knowledge graph and for improving the accuracy of mapping entities to clinical codes.
arXiv Detail & Related papers (2021-12-25T17:14:13Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
- Drug-disease Graph: Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data [21.700743167418963]
We develop a novel graph-based framework for ADR signal detection using healthcare claims data.
We apply Graph Neural Network to predict ADR signals, using labels from the Side Effect Resource database.
arXiv Detail & Related papers (2020-04-01T13:01:02Z)
- DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction [67.91606509226132]
Clinical trials are essential for drug development but often suffer from expensive, inaccurate and insufficient patient recruitment.
DeepEnroll is a cross-modal inference learning model to jointly encode enrollment criteria (tabular data) into a shared latent space for matching inference.
arXiv Detail & Related papers (2020-01-22T17:51:25Z)