Improving Drug Identification in Overdose Death Surveillance using Large Language Models
- URL: http://arxiv.org/abs/2507.12679v1
- Date: Wed, 16 Jul 2025 23:29:19 GMT
- Title: Improving Drug Identification in Overdose Death Surveillance using Large Language Models
- Authors: Arthur J. Funnell, Panayiotis Petousis, Fabrice Harel-Canada, Ruby Romero, Alex A. T. Bui, Adam Koncsol, Hritika Chaturvedi, Chelsea Shover, David Goodman-Meza
- Abstract summary: The rising rate of drug-related deaths in the United States, largely driven by fentanyl, requires timely and accurate surveillance. Critical overdose data are often buried in free-text coroner reports, leading to delays and information loss when coded into ICD-10 classifications. Natural language processing models may automate and enhance overdose surveillance, but prior applications have been limited.
- Score: 1.8239746935427605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rising rate of drug-related deaths in the United States, largely driven by fentanyl, requires timely and accurate surveillance. However, critical overdose data are often buried in free-text coroner reports, leading to delays and information loss when coded into International Classification of Diseases, 10th Revision (ICD-10) classifications. Natural language processing (NLP) models may automate and enhance overdose surveillance, but prior applications have been limited. A dataset of 35,433 death records from multiple U.S. jurisdictions in 2020 was used for model training and internal testing. External validation was conducted using a novel separate dataset of 3,335 records from 2023-2024. Multiple NLP approaches were evaluated for classifying specific drug involvement from unstructured death certificate text. These included traditional single- and multi-label classifiers, as well as fine-tuned encoder-only language models such as Bidirectional Encoder Representations from Transformers (BERT) and BioClinicalBERT, and contemporary decoder-only large language models such as Qwen 3 and Llama 3. Model performance was assessed using macro-averaged F1 scores, and 95% confidence intervals were calculated to quantify uncertainty. Fine-tuned BioClinicalBERT models achieved near-perfect performance, with macro F1 scores >=0.998 on the internal test set. External validation confirmed robustness (macro F1=0.966), outperforming conventional machine learning, general-domain BERT models, and various decoder-only large language models. NLP models, particularly fine-tuned clinical variants like BioClinicalBERT, offer a highly accurate and scalable solution for overdose death classification from free-text reports. These methods can significantly accelerate surveillance workflows, overcoming the limitations of manual ICD-10 coding and supporting near real-time detection of emerging substance use trends.
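The abstract reports macro-averaged F1 scores with 95% confidence intervals but does not say how the intervals were computed. The following is a minimal sketch of one common approach, a percentile bootstrap over records, and is not the authors' implementation. The function names, the three drug labels, and the toy records are all hypothetical and chosen only for illustration.

```python
import random


def f1_for_label(y_true, y_pred, label):
    """Binary F1 for one label; each record is a set of drug labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
    denom = 2 * tp + fp + fn
    # Convention (an assumption): F1 is 0.0 when the label never occurs.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


def bootstrap_ci(y_true, y_pred, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for macro F1, resampling records."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx], labels)
        )
    scores.sort()
    lower = scores[int((alpha / 2) * n_boot)]
    upper = scores[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy multi-label records (hypothetical data, not from the paper).
labels = ["fentanyl", "heroin", "cocaine"]
y_true = [{"fentanyl"}, {"fentanyl", "cocaine"}, {"heroin"},
          set(), {"cocaine"}, {"fentanyl", "heroin"}]
y_pred = [{"fentanyl"}, {"fentanyl"}, {"heroin"},
          set(), {"cocaine"}, {"fentanyl", "heroin"}]

score = macro_f1(y_true, y_pred, labels)
ci_low, ci_high = bootstrap_ci(y_true, y_pred, labels)
print(f"macro F1 = {score:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Macro averaging gives every drug label equal weight regardless of how often it appears, so a rarely mentioned substance affects the score as much as fentanyl does. With only six toy records the interval is very wide; it narrows as the number of records grows.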
Related papers
- SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validation in COPD Reporting [11.789239660318337]
COPD is a major chronic respiratory disease with persistent airflow limitation. Current AI models for COPD diagnosis are limited to outputting classification results. We propose SpiroLLM, the first multimodal large language model that can understand spirograms.
arXiv Detail & Related papers (2025-07-22T01:44:12Z)
- CRTRE: Causal Rule Generation with Target Trial Emulation Framework [47.2836994469923]
We introduce a novel method called causal rule generation with target trial emulation framework (CRTRE).
CRTRE applies randomized trial design principles to estimate the causal effect of association rules.
We then incorporate these association rules into downstream applications such as predicting disease onset.
arXiv Detail & Related papers (2024-11-10T02:40:06Z)
- Multi-stream deep learning framework to predict mild cognitive impairment with Rey Complex Figure Test [10.324611550865926]
We developed a multi-stream deep learning framework that integrates two distinct processing streams.
The proposed multi-stream model demonstrated superior performance over baseline models in external validation.
Our model has practical implications for clinical settings, where it could serve as a cost-effective tool for early screening.
arXiv Detail & Related papers (2024-09-04T17:08:04Z)
- Improving ICD coding using Chapter based Named Entities and Attentional Models [0.0]
We introduce an enhanced approach to ICD coding that improves F1 scores by using chapter-based named entities and attentional models.
This method categorizes discharge summaries into ICD-9 Chapters and develops attentional models with chapter-specific data.
For categorization, we use Chapter-IV to de-bias and influence key entities and weights without neural networks.
arXiv Detail & Related papers (2024-07-24T12:34:23Z)
- Deep Omni-supervised Learning for Rib Fracture Detection from Chest Radiology Images [41.62893318123283]
Deep learning (DL)-based rib fracture detection has shown promise of playing an important role in preventing mortality and improving patient outcome.
DL-based object detection models require a huge number of bounding box annotations.
Annotating medical data is time-consuming and demands expertise, which makes obtaining a large number of fine-grained annotations infeasible.
We present a novel omni-supervised object detection network, ORF-Netv2, to leverage as much available supervision as possible.
arXiv Detail & Related papers (2023-06-23T05:36:03Z)
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD on data from Brazilian hospitals.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to a low-resource, real-world challenge: de-identification of code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- Secondary Use of Clinical Problem List Entries for Neural Network-Based Disease Code Assignment [1.3190581566723918]
We explore automated coding of 50-character-long clinical problem list entries using the International Classification of Diseases (ICD-10).
A fastText baseline reached a macro-averaged F1-score of 0.83, followed by a character-level LSTM with a macro-averaged F1-score of 0.84.
A neural network activation analysis together with an investigation of the false positives and false negatives unveiled inconsistent manual coding as a main limiting factor.
arXiv Detail & Related papers (2021-12-27T16:11:05Z)
- Multiple Organ Failure Prediction with Classifier-Guided Generative Adversarial Imputation Networks [4.040013871160853]
Multiple organ failure (MOF) is a severe syndrome with a high mortality rate among Intensive Care Unit (ICU) patients.
Applying machine learning models to electronic health records is a challenge due to the pervasiveness of missing values.
arXiv Detail & Related papers (2021-06-22T15:49:01Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.