A Systematic Literature Review of Automated ICD Coding and
Classification Systems using Discharge Summaries
- URL: http://arxiv.org/abs/2107.10652v1
- Date: Mon, 12 Jul 2021 03:55:17 GMT
- Title: A Systematic Literature Review of Automated ICD Coding and
Classification Systems using Discharge Summaries
- Authors: Rajvir Kaur, Jeewani Anupama Ginige and Oliver Obst
- Abstract summary: Codification of free-text clinical narratives has long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research.
Assigning codes is currently a manual process that is very expensive, time-consuming and error-prone.
This systematic literature review provides a comprehensive overview of automated clinical coding systems.
- Score: 5.156484100374058
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Codification of free-text clinical narratives has long been recognised to be
beneficial for secondary uses such as funding, insurance claim processing and
research. Assigning codes is currently a manual process that is
very expensive, time-consuming and error-prone. In recent years, many
researchers have studied the use of Natural Language Processing (NLP) and related
Machine Learning (ML) and Deep Learning (DL) methods and techniques to address
the problem of manual coding of clinical narratives and to assist human coders
in assigning clinical codes more accurately and efficiently. This systematic
literature review provides a comprehensive overview of automated clinical
coding systems that utilise NLP, ML and DL methods and techniques
to assign ICD codes to discharge summaries. We followed the Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and
conducted a comprehensive search of publications from January 2010 to December
2020 in four academic databases: PubMed, ScienceDirect, the Association for
Computing Machinery (ACM) Digital Library, and the Association for Computational
Linguistics (ACL) Anthology. We reviewed 7,556 publications; 38 met the
inclusion criteria. This review identified datasets containing discharge
summaries, NLP techniques along with other data extraction processes, and
different feature extraction and embedding techniques. Different evaluation
metrics are used to measure the performance of classification methods.
Lastly, future research directions are provided to scholars who are interested
in automated ICD code assignment. Efforts are still required to improve ICD
code prediction accuracy and the availability of large-scale de-identified clinical
corpora with the latest version of the classification system. This can be a
platform to guide and share knowledge with the less experienced coders and
researchers.
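The abstract mentions evaluation metrics without naming them. As an illustration only, the sketch below computes micro- and macro-averaged F1, two metrics commonly used for multi-label code assignment. Their use here is an assumption, not something the abstract states. The function names and the example ICD-10 codes are hypothetical and chosen for illustration.

```python
from typing import List, Set, Tuple


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def micro_macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float]:
    """Micro- and macro-averaged F1 over every code seen in gold or pred.

    gold[i] and pred[i] are the true and predicted code sets for document i.
    """
    codes = sorted(set().union(*gold, *pred))
    total_tp = total_fp = total_fn = 0
    per_code_f1 = []
    for code in codes:
        tp = sum(1 for g, p in zip(gold, pred) if code in g and code in p)
        fp = sum(1 for g, p in zip(gold, pred) if code not in g and code in p)
        fn = sum(1 for g, p in zip(gold, pred) if code in g and code not in p)
        total_tp += tp
        total_fp += fp
        total_fn += fn
        per_code_f1.append(prf(tp, fp, fn)[2])
    # Micro: pool counts across codes, so frequent codes dominate.
    micro = prf(total_tp, total_fp, total_fn)[2]
    # Macro: average per-code F1, so rare codes weigh as much as frequent ones.
    macro = sum(per_code_f1) / len(per_code_f1) if per_code_f1 else 0.0
    return micro, macro


if __name__ == "__main__":
    # Hypothetical gold and predicted code sets for three discharge summaries.
    gold = [{"I10", "E11.9"}, {"J18.9"}, {"I10"}]
    pred = [{"I10"}, {"J18.9", "I10"}, {"I10", "E11.9"}]
    micro, macro = micro_macro_f1(gold, pred)
    print(f"micro-F1={micro:.3f} macro-F1={macro:.3f}")
```

Micro-averaging pools counts over all codes, while macro-averaging gives each code equal weight. The gap between the two is therefore a rough indicator of how a system handles rare codes.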
Related papers
- Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification [22.323705343864336]
We propose a novel approach for ICD indexing that adopts three ideas.
We use a multi-level deep dilated residual convolution encoder to aggregate the information from the clinical notes.
We formalize the task of ICD classification with auxiliary knowledge of the medical records.
arXiv Detail & Related papers (2024-05-29T13:44:07Z)
- CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning [56.782963838838036]
We propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations.
Our approach employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations.
arXiv Detail & Related papers (2024-02-24T03:25:28Z)
- Automatic Coding at Scale: Design and Deployment of a Nationwide System for Normalizing Referrals in the Chilean Public Healthcare System [0.0]
We propose a two-step system for automatically coding diseases in referrals from the Chilean public healthcare system.
Specifically, our model uses a state-of-the-art NER model for recognizing disease mentions and a search engine system based on for assigning the most relevant codes associated with these disease mentions.
Our system obtained a MAP score of 0.63 for the subcategory level and 0.83 for the category level, close to the best-performing models in the literature.
arXiv Detail & Related papers (2023-07-09T16:19:35Z)
- PyTrial: Machine Learning Software and Benchmark for Clinical Trial Applications [49.69824178329405]
PyTrial provides benchmarks and open-source implementations of a series of machine learning algorithms for clinical trial design and operations.
We thoroughly investigate 34 ML algorithms for clinical trials across 6 different tasks, including patient outcome prediction, trial site selection, trial outcome prediction, patient-trial matching, trial similarity search, and synthetic data generation.
PyTrial defines each task through a simple four-step process: data loading, model specification, model training, and model evaluation, all achievable with just a few lines of code.
arXiv Detail & Related papers (2023-06-06T21:19:03Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes? [53.45585591262433]
We present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD.
We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than xRAC-KD whereas xRAC-KD has potential advantages in production deployment scenarios.
arXiv Detail & Related papers (2022-10-28T04:06:07Z)
- GrabQC: Graph based Query Contextualization for automated ICD coding [16.096824533334352]
We propose GrabQC, a Graph based Query Contextualization method that automatically extracts queries from the clinical text.
We perform experiments on two datasets of clinical text in three different setups to assert the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-14T10:27:25Z)
- Classifying Unstructured Clinical Notes via Automatic Weak Supervision [17.45660355026785]
We introduce a general weakly-supervised text classification framework that learns from class-label descriptions only.
We leverage the linguistic domain knowledge stored within pre-trained language models and the data programming framework to assign code labels to texts.
arXiv Detail & Related papers (2022-06-24T05:55:49Z)
- Active learning for medical code assignment [55.99831806138029]
We demonstrate the effectiveness of Active Learning (AL) in multi-label text classification in the clinical domain.
We apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset.
Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set.
arXiv Detail & Related papers (2021-04-12T18:11:17Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)