Extracting COVID-19 Diagnoses and Symptoms From Clinical Text: A New
Annotated Corpus and Neural Event Extraction Framework
- URL: http://arxiv.org/abs/2012.00974v2
- Date: Wed, 10 Mar 2021 21:36:43 GMT
- Title: Extracting COVID-19 Diagnoses and Symptoms From Clinical Text: A New
Annotated Corpus and Neural Event Extraction Framework
- Authors: Kevin Lybarger, Mari Ostendorf, Matthew Thompson, Meliha Yetisgen
- Abstract summary: This work presents a new clinical corpus, referred to as the COVID-19 Annotated Clinical Text (CACT) Corpus.
It comprises 1,472 notes with detailed annotations characterizing COVID-19 diagnoses, testing, and clinical presentation.
We introduce a span-based event extraction model that jointly extracts all annotated phenomena.
In a secondary use application, we explored the prediction of COVID-19 test results using structured patient data and automatically extracted symptom information.
- Score: 14.226438210255676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coronavirus disease 2019 (COVID-19) is a global pandemic. Although much has
been learned about the novel coronavirus since its emergence, there are many
open questions related to tracking its spread, describing symptomology,
predicting the severity of infection, and forecasting healthcare utilization.
Free-text clinical notes contain critical information for resolving these
questions. Data-driven, automatic information extraction models are needed to
use this text-encoded information in large-scale studies. This work presents a
new clinical corpus, referred to as the COVID-19 Annotated Clinical Text (CACT)
Corpus, which comprises 1,472 notes with detailed annotations characterizing
COVID-19 diagnoses, testing, and clinical presentation. We introduce a
span-based event extraction model that jointly extracts all annotated
phenomena, achieving high performance in identifying COVID-19 and symptom
events with associated assertion values (0.83-0.97 F1 for events and 0.73-0.79
F1 for assertions). In a secondary use application, we explored the prediction
of COVID-19 test results using structured patient data (e.g. vital signs and
laboratory results) and automatically extracted symptom information. The
automatically extracted symptoms improve prediction performance, beyond
structured data alone.
Related papers
- Domain Adaptation Using Pseudo Labels for COVID-19 Detection [19.844531606142496]
We present a two-stage framework that leverages pseudo labels for domain adaptation to enhance the detection of COVID-19 from CT scans.
By utilizing annotated data from one domain and non-annotated data from another, the model overcomes the challenge of data scarcity and variability.
Experimental results on the COV19-CT-DB database showcase the model's potential to achieve high diagnostic precision.
arXiv Detail & Related papers (2024-03-18T06:07:45Z)
- The pitfalls of using open data to develop deep learning solutions for
COVID-19 detection in chest X-rays [64.02097860085202]
Deep learning models have been developed to identify COVID-19 from chest X-rays.
Results have been exceptional when training and testing on open-source data.
Data analysis and model evaluations show that the popular open-source dataset COVIDx is not representative of the real clinical problem.
arXiv Detail & Related papers (2021-09-14T10:59:11Z)
- Clinical Utility of the Automatic Phenotype Annotation in Unstructured
Clinical Notes: ICU Use Cases [11.22817749252584]
We propose the automatic annotation of phenotypes from clinical notes as a method to capture essential information to predict outcomes in the Intensive Care Unit.
We demonstrate and validate our approach by conducting experiments on the prediction of in-hospital mortality, physiological decompensation, and length of stay in the ICU setting.
arXiv Detail & Related papers (2021-07-24T17:55:55Z)
- COVIDx-US -- An open-access benchmark dataset of ultrasound imaging data
for AI-driven COVID-19 analytics [116.6248556979572]
COVIDx-US is an open-access benchmark dataset of COVID-19 related ultrasound imaging data.
It consists of 93 lung ultrasound videos and 10,774 processed images of patients infected with SARS-CoV-2 pneumonia, non-SARS-CoV-2 pneumonia, as well as healthy control cases.
arXiv Detail & Related papers (2021-03-18T03:31:33Z)
- HINT: Hierarchical Interaction Network for Trial Outcome Prediction
Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment.
In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
- Improving Clinical Document Understanding on COVID-19 Research with
Spark NLP [0.0]
Following the global COVID-19 pandemic, the number of scientific papers studying the virus has grown massively.
We present a clinical text mining system that improves on previous efforts in three ways.
First, it can recognize over 100 different entity types including social determinants of health, anatomy, risk factors, and adverse events.
Second, the text processing pipeline includes assertion status detection, to distinguish between clinical facts that are present, absent, conditional, or about someone other than the patient.
arXiv Detail & Related papers (2020-12-07T19:17:05Z)
- Classification supporting COVID-19 diagnostics based on patient survey
data [82.41449972618423]
Logistic regression and XGBoost classifiers that allow for effective screening of patients for COVID-19 were generated.
The obtained classification models provided the basis for the DECODE service (decode.polsl.pl), which can serve as support in screening patients with COVID-19 disease.
The data set consists of more than 3,000 examples and is based on questionnaires collected at a hospital in Poland.
arXiv Detail & Related papers (2020-11-24T17:44:01Z)
- An efficient representation of chronological events in medical texts [9.118144540451514]
We proposed a systematic methodology for learning from chronological events available in clinical notes.
The proposed path signature framework creates a non-parametric hierarchical representation of sequential events of any type.
The methodology was developed and externally validated using the largest secondary care mental health EHR data set in the UK.
arXiv Detail & Related papers (2020-10-16T14:54:29Z)
- Integrative Analysis for COVID-19 Patient Outcome Prediction [53.11258640541513]
We combine radiomics of lung opacities and non-imaging features from demographic data, vital signs, and laboratory findings to predict need for intensive care unit admission.
Our methods may also be applied to other lung diseases including but not limited to community acquired pneumonia.
arXiv Detail & Related papers (2020-07-20T19:08:50Z)
- COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to
identify and normalize COVID-19 signs and symptoms to OMOP common data model [15.475106287218727]
This study aims at adapting the CLAMP natural language processing tool to build COVID-19 SignSym.
COVID-19 SignSym can extract COVID-19 signs/symptoms and their 8 attributes from clinical text.
A hybrid approach of combining deep learning-based models, curated lexicons, and pattern-based rules was applied to build the COVID-19 SignSym.
arXiv Detail & Related papers (2020-07-13T15:57:26Z)
- Diagnosis of Coronavirus Disease 2019 (COVID-19) with Structured Latent
Multi-View Representation Learning [48.05232274463484]
Recently, the outbreak of Coronavirus Disease 2019 (COVID-19) has spread rapidly across the world.
Due to the large number of affected patients and the heavy workload for doctors, computer-aided diagnosis with machine learning algorithms is urgently needed.
In this study, we propose to conduct the diagnosis of COVID-19 with a series of features extracted from CT images.
arXiv Detail & Related papers (2020-05-06T15:19:15Z)