Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review
- URL: http://arxiv.org/abs/2412.18043v1
- Date: Mon, 23 Dec 2024 23:39:05 GMT
- Title: Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review
- Authors: Yidong Gan, Maciej Rybinski, Ben Hachey, Jonathan K. Kummerfeld
- Abstract summary: This position paper aims to align AI coding research more closely with practical challenges of clinical coding.
Based on our analysis, we offer eight specific recommendations, suggesting ways to improve current evaluation methods.
- Score: 14.381199039813675
- Abstract: Clinical coding is crucial for healthcare billing and data analysis. Manual clinical coding is labour-intensive and error-prone, which has motivated research towards full automation of the process. However, our analysis, based on US English electronic health records and automated coding research using these records, shows that widely used evaluation methods are not aligned with real clinical contexts. For example, evaluations that focus on the top 50 most common codes are an oversimplification, as there are thousands of codes used in practice. This position paper aims to align AI coding research more closely with practical challenges of clinical coding. Based on our analysis, we offer eight specific recommendations, suggesting ways to improve current evaluation methods. Additionally, we propose new AI-based methods beyond automated coding, suggesting alternative approaches to assist clinical coders in their workflows.
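The abstract's point about top-50 evaluations can be illustrated with a small coverage calculation. The following is a minimal sketch, not the authors' analysis: the function name `top_k_coverage`, the toy records, and the choice of k are all hypothetical, and records are assumed to be non-empty lists of code strings. It measures how many code assignments, and how many whole records, fall inside the k most frequent codes.

```python
from collections import Counter


def top_k_coverage(records, k):
    """Measure how much of a coded dataset the k most frequent codes cover.

    records: non-empty list of records, each a list of code strings (assumed format).
    Returns (share of code assignments covered, share of records fully covered).
    """
    # Count how often each code is assigned across all records.
    counts = Counter(code for codes in records for code in codes)
    top = {code for code, _ in counts.most_common(k)}

    total_assignments = sum(counts.values())
    covered_assignments = sum(counts[code] for code in top)

    # A record is fully covered only if every one of its codes is in the top-k set.
    fully_covered = sum(1 for codes in records if set(codes) <= top)

    return covered_assignments / total_assignments, fully_covered / len(records)


# Hypothetical toy data: four records with ICD-10-style codes.
records = [
    ["I10", "E11.9"],
    ["I10"],
    ["I10", "J18.9", "N17.9"],
    ["E11.9", "Z79.4"],
]

assignment_share, record_share = top_k_coverage(records, k=2)
print(f"assignments covered: {assignment_share:.3f}")
print(f"records fully covered: {record_share:.3f}")
```

Reporting both numbers separates two effects: frequent codes may account for a large share of individual assignments while still leaving many records with at least one code outside the top-k set.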
Related papers
- Large Language Models in the Clinic: A Comprehensive Benchmark [63.21278434331952]
We build a benchmark ClinicBench to better understand large language models (LLMs) in the clinic.
We first collect eleven existing datasets covering diverse clinical language generation, understanding, and reasoning tasks.
We then construct six novel datasets and clinical tasks that are complex but common in real-world practice.
We conduct an extensive evaluation of twenty-two LLMs under both zero-shot and few-shot settings.
arXiv Detail & Related papers (2024-04-25T15:51:06Z)
- CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning [56.782963838838036]
We propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations.
Our approach employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations.
arXiv Detail & Related papers (2024-02-24T03:25:28Z)
- Automated Clinical Coding for Outpatient Departments [14.923343535929515]
This paper is the first to investigate how well state-of-the-art deep learning-based clinical coding approaches work in the outpatient setting at hospital scale.
We collect a large outpatient dataset comprising over 7 million notes documenting over half a million patients.
We adapt four state-of-the-art clinical coding approaches to this setting and evaluate their potential to assist coders.
arXiv Detail & Related papers (2023-12-21T02:28:29Z)
- Automated clinical coding using off-the-shelf large language models [10.365958121087305]
The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders.
Efforts towards automated ICD coding are dominated by supervised deep learning models.
In this work, we leverage off-the-shelf pre-trained generative large language models to develop a practical solution.
arXiv Detail & Related papers (2023-10-10T11:56:48Z)
- PyTrial: Machine Learning Software and Benchmark for Clinical Trial Applications [49.69824178329405]
PyTrial provides benchmarks and open-source implementations of a series of machine learning algorithms for clinical trial design and operations.
We thoroughly investigate 34 ML algorithms for clinical trials across 6 different tasks, including patient outcome prediction, trial site selection, trial outcome prediction, patient-trial matching, trial similarity search, and synthetic data generation.
PyTrial defines each task through a simple four-step process: data loading, model specification, model training, and model evaluation, all achievable with just a few lines of code.
arXiv Detail & Related papers (2023-06-06T21:19:03Z)
- GrabQC: Graph based Query Contextualization for automated ICD coding [16.096824533334352]
We propose GrabQC, a Graph-based Query Contextualization method that automatically extracts queries from the clinical text.
We perform experiments on two datasets of clinical text in three different setups to assert the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-14T10:27:25Z)
- Automated Clinical Coding: What, Why, and Where We Are? [17.086212195006894]
Clinical coding could potentially be supported by an automated system to improve the efficiency and accuracy of the process.
Our research reveals the gaps between the current deep learning-based approach applied to clinical coding and the need for explainability and consistency in real-world practice.
There is much to achieve to develop and deploy an AI-based automated system to support coding in the next five years and beyond.
arXiv Detail & Related papers (2022-03-21T16:17:38Z)
- A Systematic Literature Review of Automated ICD Coding and Classification Systems using Discharge Summaries [5.156484100374058]
Codification of free-text clinical narratives has long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research.
Code assignment is currently a manual process that is very expensive, time-consuming and error-prone.
This systematic literature review provides a comprehensive overview of automated clinical coding systems.
arXiv Detail & Related papers (2021-07-12T03:55:17Z)
- Active learning for medical code assignment [55.99831806138029]
We demonstrate the effectiveness of Active Learning (AL) in multi-label text classification in the clinical domain.
We apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset.
Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set.
arXiv Detail & Related papers (2021-04-12T18:11:17Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
- Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)