Semi-self-supervised Automated ICD Coding
- URL: http://arxiv.org/abs/2205.10088v1
- Date: Fri, 20 May 2022 11:12:54 GMT
- Title: Semi-self-supervised Automated ICD Coding
- Authors: Hlynur D. Hlynsson, Steindór Ellertsson, Jón F. Daðason, Emil L. Sigurdsson, Hrafn Loftsson
- Abstract summary: Clinical Text Notes (CTNs) contain physicians' reasoning process, written in an unstructured free text format, as they examine and interview patients.
This paper presents a method of augmenting a sparsely annotated dataset of Icelandic CTNs with a machine-learned imputation in a semi-self-supervised manner.
We train a neural network on a small set of annotated CTNs and use it to extract clinical features from a set of un-annotated CTNs.
- Score: 2.449909275410288
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clinical Text Notes (CTNs) contain physicians' reasoning process, written in
an unstructured free text format, as they examine and interview patients. In
recent years, several studies have been published that provide evidence for the
utility of machine learning for predicting doctors' diagnoses from CTNs, a task
known as ICD coding. Data annotation is time-consuming, particularly when a
degree of specialization is needed, as is the case for medical data. This paper
presents a method of augmenting a sparsely annotated dataset of Icelandic CTNs
with a machine-learned imputation in a semi-self-supervised manner. We train a
neural network on a small set of annotated CTNs and use it to extract clinical
features from a set of un-annotated CTNs. These clinical features consist of
answers to about a thousand potential questions that a physician might find the
answers to during a consultation of a patient. The features are then used to
train a classifier for the diagnosis of certain types of diseases. We report
the results of an evaluation of this data augmentation method over three tiers
of data availability to the physician. Our data augmentation method shows a
significant positive effect, which is diminished when clinical features from the
examination of the patient and diagnostics are made available. We recommend our
method for augmenting scarce datasets for systems that make decisions based on
clinical features that do not include examinations or tests.
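The three stages described in the abstract (train a feature extractor on the annotated notes, impute features for the un-annotated notes, then train a diagnosis classifier on the features) can be sketched as below. This is only a toy illustration under loud assumptions, not the authors' implementation: a bag-of-words perceptron stands in for the neural network, the English notes, the three feature names, and the binary diagnosis label are invented, and the un-annotated notes are assumed to carry a diagnosis label while lacking feature annotations.

```python
from collections import defaultdict

BIAS = "<bias>"  # hypothetical key holding the bias weight


def train_perceptron(examples, epochs=10):
    """Fit a binary perceptron on (active_keys, bool_label) pairs."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for keys, label in examples:
            score = weights[BIAS] + sum(weights[k] for k in keys)
            if (score > 0) != label:  # update only on mistakes
                step = 1.0 if label else -1.0
                weights[BIAS] += step
                for k in keys:
                    weights[k] += step
    return dict(weights)


def predict(weights, keys):
    """Return True when the summed weights of the active keys are positive."""
    return weights.get(BIAS, 0.0) + sum(weights.get(k, 0.0) for k in keys) > 0


def tokenize(note):
    return note.lower().split()


def train_extractor(annotated, feature_names):
    """Stage 1: one binary model per clinical feature, annotated notes only."""
    return {
        name: train_perceptron(
            [(tokenize(note), name in feats) for note, feats, _ in annotated]
        )
        for name in feature_names
    }


def extract(extractor, note):
    """Return the set of clinical features the extractor predicts for a note."""
    toks = tokenize(note)
    return {name for name, w in extractor.items() if predict(w, toks)}


# Invented toy data: (note, annotated features, diagnosis label).
FEATURES = ["cough", "fever", "headache"]
annotated = [
    ("fever and cough since monday", {"fever", "cough"}, True),
    ("headache after work", {"headache"}, False),
    ("cough at night", {"cough"}, True),
    ("fever with headache", {"fever", "headache"}, False),
]
# Assumption: un-annotated notes have a diagnosis but no feature annotations.
unannotated = [("cough and fever today", True), ("headache after lunch", False)]

extractor = train_extractor(annotated, FEATURES)

# Stage 2: impute clinical features for the un-annotated notes.
imputed = [(extract(extractor, note), dx) for note, dx in unannotated]

# Stage 3: train the diagnosis classifier on gold plus imputed features.
rows = [(sorted(feats), dx) for _, feats, dx in annotated]
rows += [(sorted(feats), dx) for feats, dx in imputed]
diagnoser = train_perceptron(rows)


def diagnose(note):
    """Extract features from a new note, then classify from the features."""
    return predict(diagnoser, extract(extractor, note))
```

Calling `diagnose("cough at night again")` runs a new note through both models in sequence. The point of the design is that the diagnosis classifier never sees raw text, only clinical features, so imputed features can enlarge its training set without further manual annotation.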
Related papers
- Clinical Evaluation of Medical Image Synthesis: A Case Study in Wireless Capsule Endoscopy [63.39037092484374]
This study focuses on the clinical evaluation of medical Synthetic Data Generation using Artificial Intelligence (AI) models.
The paper contributes by a) presenting a protocol for the systematic evaluation of synthetic images by medical experts and b) applying it to assess TIDE-II, a novel variational autoencoder-based model for high-resolution WCE image synthesis.
The results show that TIDE-II generates clinically relevant WCE images, helping to address data scarcity and enhance diagnostic tools.
arXiv Detail & Related papers (2024-10-31T19:48:50Z)
- CBIDR: A novel method for information retrieval combining image and data by means of TOPSIS applied to medical diagnosis [1.8416014644193066]
We propose a novel method named CBIDR, which leverages both medical images and patient clinical data, combining them through the ranking algorithm TOPSIS.
Experiments achieved an accuracy of 97.44% in Top-1 and 100% in Top-5, showing the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-09-26T16:04:36Z)
- A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning [11.817595076396925]
Diagnostic Captioning (DC) automatically generates a diagnostic text from one or more medical images of a patient.
We propose a new data-driven guided decoding method that incorporates medical information into the beam search of the diagnostic text generation process.
We evaluate the proposed method on two medical datasets using four DC systems that range from generic image-to-text systems with CNN encoders to pre-trained Large Language Models.
arXiv Detail & Related papers (2024-06-20T10:08:17Z)
- Multimodal Pretraining of Medical Time Series and Notes [45.89025874396911]
Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data.
We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes.
In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
arXiv Detail & Related papers (2023-12-11T21:53:40Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients' clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
- Clinical Evidence Engine: Proof-of-Concept For A Clinical-Domain-Agnostic Decision Support Infrastructure [26.565616653685115]
We present a proof-of-concept system to demonstrate the technical and design feasibility of this approach across three domains.
Leveraging Clinical BioBERT, the system can effectively identify clinical trial reports based on lengthy clinical questions.
We discuss the idea of designing DST explanations not as specific to a DST or an algorithm, but as a domain-agnostic decision support infrastructure.
arXiv Detail & Related papers (2021-10-31T23:21:25Z)
- How to Leverage Multimodal EHR Data for Better Medical Predictions? [13.401754962583771]
The complexity of electronic health records (EHR) data is a challenge for the application of deep learning.
In this paper, we first extract the accompanying clinical notes from EHR and propose a method to integrate these data.
The results on two medical prediction tasks show that our fused model with different data outperforms the state-of-the-art method.
arXiv Detail & Related papers (2021-10-29T13:26:05Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.