The Medical Scribe: Corpus Development and Model Performance Analyses
- URL: http://arxiv.org/abs/2003.11531v1
- Date: Thu, 12 Mar 2020 03:10:25 GMT
- Title: The Medical Scribe: Corpus Development and Model Performance Analyses
- Authors: Izhak Shafran, Nan Du, Linh Tran, Amanda Perry, Lauren Keyes, Mark
Knichel, Ashley Domin, Lei Huang, Yuhui Chen, Gang Li, Mingqiu Wang, Laurent
El Shafey, Hagen Soltau, and Justin S. Paul
- Abstract summary: Motivated by the goal of assisting clinical note generation, we developed an annotation scheme to extract relevant clinical concepts.
We used this annotation scheme to label a corpus of about 6k clinical encounters.
The labeled corpus was used to train a state-of-the-art tagging model.
- Score: 19.837396601641117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a growing interest in creating tools to assist in clinical note
generation using the audio of provider-patient encounters. Motivated by this
goal and with the help of providers and medical scribes, we developed an
annotation scheme to extract relevant clinical concepts. We used this
annotation scheme to label a corpus of about 6k clinical encounters. This was
used to train a state-of-the-art tagging model. We report ontologies, labeling
results, model performances, and detailed analyses of the results. Our results
show that the entities related to medications can be extracted with a
relatively high accuracy of 0.90 F-score, followed by symptoms at 0.72 F-score,
and conditions at 0.57 F-score. In our task, we not only identify where the
symptoms are mentioned but also map them to canonical forms as they appear in
the clinical notes. Of the different types of errors, in about 19-38% of the
cases, we find that the model output was correct, and about 17-32% of the
errors do not impact the clinical note. Taken together, the models developed in
this work are more useful than the F-scores reflect, making it a promising
approach for practical applications.
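The F-scores quoted in the abstract combine precision and recall into a single number. The following is a minimal sketch of that arithmetic, assuming exact-match scoring over (start, end, label) spans; the abstract does not state the paper's matching criteria, and the function name `span_f_score` and the example spans are hypothetical.

```python
def span_f_score(gold_spans, predicted_spans):
    """Return (precision, recall, F1) for exact-match entity spans.

    Assumption: a prediction counts as correct only if its
    (start, end, label) triple appears verbatim in the gold set.
    """
    gold = set(gold_spans)
    predicted = set(predicted_spans)
    true_positives = len(gold & predicted)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (start_token, end_token, label)
gold = [(0, 2, "medication"), (5, 6, "symptom"), (9, 11, "condition")]
pred = [(0, 2, "medication"), (5, 6, "symptom"),
        (12, 13, "condition"), (14, 15, "symptom")]

precision, recall, f1 = span_f_score(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because the F-score is a harmonic mean, it sits closer to the lower of precision and recall, which is one reason a strict exact-match score can understate how useful a model's output is in practice.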
Related papers
- A Federated Learning Framework for Stenosis Detection [70.27581181445329]
This study explores the use of Federated Learning (FL) for stenosis detection in coronary angiography images (CA).
Two heterogeneous datasets from two institutions were considered: dataset 1 includes 1219 images from 200 patients, which we acquired at the Ospedale Riuniti of Ancona (Italy);
dataset 2 includes 7492 sequential images from 90 patients from a previous study available in the literature.
(2023-10-30T11:13:40Z)
- CORAL: Expert-Curated medical Oncology Reports to Advance Language Model Inference [2.1067045507411195]
Large language models (LLMs) have recently exhibited impressive performance on various medical natural language processing tasks.
We developed a detailed schema for annotating textual oncology information, encompassing patient characteristics, tumor characteristics, tests, treatments, and temporality.
The GPT-4 model exhibited overall best performance, with an average BLEU score of 0.73, an average ROUGE score of 0.72, an exact-match F1-score of 0.51, and an average accuracy of 68% on complex tasks.
(2023-08-07T18:03:10Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
(2023-03-23T17:17:46Z)
- This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text [56.32427751440426]
In clinical practice such models must not only be accurate, but provide doctors with interpretable and helpful results.
We introduce ProtoPatient, a novel method based on prototypical networks and label-wise attention.
We evaluate the model on two publicly available clinical datasets and show that it outperforms existing baselines.
(2022-10-16T10:12:07Z)
- Learning to diagnose common thorax diseases on chest radiographs from radiology reports in Vietnamese [0.33598755777055367]
We propose a data collecting and annotation pipeline that extracts information from Vietnamese radiology reports to provide accurate labels for chest X-ray (CXR) images.
This can benefit Vietnamese radiologists and clinicians by annotating data that closely match their endemic diagnosis categories which may vary from country to country.
(2022-09-11T06:06:03Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to a low-resource, real-world challenge: de-identification of code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
(2022-04-10T21:46:52Z)
- Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation [56.25869366777579]
In recent years, machine learning models have rapidly become better at generating clinical consultation notes.
We present an extensive human evaluation study where 5 clinicians listen to 57 mock consultations, write their own notes, post-edit a number of automatically generated notes, and extract all the errors.
We find that a simple, character-based Levenshtein distance metric performs on par if not better than common model-based metrics like BertScore.
(2022-04-01T14:04:16Z)
- Assessment of contextualised representations in detecting outcome phrases in clinical trials [14.584741378279316]
We introduce "EBM-COMET", a dataset in which 300 PubMed abstracts are expertly annotated for clinical outcomes.
To extract outcomes, we fine-tune a variety of pre-trained contextualized representations.
We observe our best model (BioBERT) achieve 81.5% F1, 81.3% sensitivity and 98.0% specificity.
(2022-02-13T15:08:00Z)
- TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials [0.0]
We introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients).
We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms.
We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment.
(2021-12-15T15:36:57Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
(2021-11-30T15:52:04Z)
- MIMICause: Defining, identifying and predicting types of causal relationships between biomedical concepts from clinical notes [0.0]
We propose annotation guidelines, develop an annotated corpus and provide baseline scores to identify types and direction of causal relations between a pair of biomedical concepts in clinical notes.
We annotate a total of 2714 de-identified examples sampled from the 2018 n2c2 shared task dataset and train four different language model based architectures.
The high inter-annotator agreement for clinical text shows the quality of our annotation guidelines while the provided baseline F1 score sets the direction for future research towards understanding narratives in clinical texts.
(2021-10-14T00:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.