Code Like Humans: A Multi-Agent Solution for Medical Coding
- URL: http://arxiv.org/abs/2509.05378v3
- Date: Wed, 08 Oct 2025 05:07:35 GMT
- Title: Code Like Humans: A Multi-Agent Solution for Medical Coding
- Authors: Andreas Motzfeldt, Joakim Edin, Casper L. Christensen, Christian Hardmeier, Lars Maaløe, Anna Rogers,
- Abstract summary: We introduce Code Like Humans: a new agentic framework for medical coding with large language models.<n>It implements official coding guidelines for human experts, and it is the first solution that can support the full ICD-10 coding system.<n>It achieves the best performance to date on rare diagnosis codes.
- Score: 13.1324232915195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In medical coding, experts map unstructured clinical notes to alphanumeric codes for diagnoses and procedures. We introduce Code Like Humans: a new agentic framework for medical coding with large language models. It implements official coding guidelines for human experts, and it is the first solution that can support the full ICD-10 coding system (+70K labels). It achieves the best performance to date on rare diagnosis codes (fine-tuned discriminative classifiers retain an advantage for high-frequency codes, to which they are limited). Towards future work, we also contribute an analysis of system performance and identify its `blind spots' (codes that are systematically undercoded).
Related papers
- MedDCR: Learning to Design Agentic Workflows for Medical Coding [55.51674334874892]
Medical coding converts free-text clinical notes into standardized diagnostic and procedural codes.<n>We present MedDCR, a closed-loop framework that treats design as a learning problem.<n>On benchmark datasets, MedDCR outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-11-17T13:30:51Z) - Assisted morbidity coding: the SISCO.web use case for identifying the main diagnosis in Hospital Discharge Records [0.0]
The paper aims to present the SISCO.web approach designed to support physicians in filling in Hospital Discharge Records with proper diagnoses and procedures codes.<n>The web service leverages NLP algorithms, specific coding rules, as well as ad hoc decision trees to identify the main condition.
arXiv Detail & Related papers (2024-12-11T16:08:25Z) - RedCode: Risky Code Execution and Generation Benchmark for Code Agents [50.81206098588923]
RedCode is a benchmark for risky code execution and generation.
RedCode-Exec provides challenging prompts that could lead to risky code execution.
RedCode-Gen provides 160 prompts with function signatures and docstrings as input to assess whether code agents will follow instructions.
arXiv Detail & Related papers (2024-11-12T13:30:06Z) - CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation.<n>CodeIP is a novel multi-bit watermarking technique that inserts additional information to preserve provenance details.<n>Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z) - A Novel ICD Coding Method Based on Associated and Hierarchical Code Description Distillation [6.524062529847299]
ICD coding is a challenging multilabel text classification problem due to noisy medical document inputs.
Recent advancements in automated ICD coding have enhanced performance by integrating additional data and knowledge bases with the encoding of medical notes and codes.
We propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment.
arXiv Detail & Related papers (2024-04-17T07:26:23Z) - InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models [56.723509505549536]
InfiBench is the first large-scale freeform question-answering (QA) benchmark for code to our knowledge.
It comprises 234 carefully selected high-quality Stack Overflow questions that span across 15 programming languages.
We conduct a systematic evaluation for over 100 latest code LLMs on InfiBench, leading to a series of novel and insightful findings.
arXiv Detail & Related papers (2024-03-11T02:06:30Z) - Automated clinical coding using off-the-shelf large language models [10.365958121087305]
The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders.
Efforts towards automated ICD coding are dominated by supervised deep learning models.
In this work, we leverage off-the-shelf pre-trained generative large language models to develop a practical solution.
arXiv Detail & Related papers (2023-10-10T11:56:48Z) - A Two-Stage Decoder for Efficient ICD Coding [10.634394331433322]
We propose a two-stage decoding mechanism to predict ICD codes.
At first, we predict the parent code and then predict the child code based on the previous prediction.
Experiments on the public MIMIC-III data set show that our model performs well in single-model settings.
arXiv Detail & Related papers (2023-05-27T17:25:13Z) - Medical Codes Prediction from Clinical Notes: From Human Coders to
Machines [0.21320960069210473]
Prediction of medical codes from clinical notes is a practical and essential need for every healthcare delivery organization.
The biggest challenge is directly identifying appropriate medical codes from several thousands of high-dimensional codes from unstructured free-text clinical notes.
Recent studies have shown the state-of-the-art code prediction results of full-fledged deep learning-based methods.
arXiv Detail & Related papers (2022-10-30T14:24:13Z) - Can Current Explainability Help Provide References in Clinical Notes to
Support Humans Annotate Medical Codes? [53.45585591262433]
We present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD.
We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than xRAC-KD whereas xRAC-KD has potential advantages in production deployment scenarios.
arXiv Detail & Related papers (2022-10-28T04:06:07Z) - Revisiting Code Search in a Two-Stage Paradigm [67.02322603435628]
TOSS is a two-stage fusion code search framework.
It first uses IR-based and bi-encoder models to efficiently recall a small number of top-k code candidates.
It then uses fine-grained cross-encoders for finer ranking.
arXiv Detail & Related papers (2022-08-24T02:34:27Z) - A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z) - Inheritance-guided Hierarchical Assignment for Clinical Automatic
Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.