Related papers: Disease Entity Recognition and Normalization is Improved with Large Language Model Derived Synthetic Normalized Mentions

URL: http://arxiv.org/abs/2410.07951v1
Date: Thu, 10 Oct 2024 14:18:34 GMT
Title: Disease Entity Recognition and Normalization is Improved with Large Language Model Derived Synthetic Normalized Mentions
Authors: Kuleen Sasse, Shinjitha Vadlakonda, Richard E. Kennedy, John D. Osborne
Abstract summary: Large Language Model (LLM) generation of synthetic training examples could improve performance in these information extraction tasks. We measured overall and Out of Distribution (OOD) performance for Disease Entity Recognition (DER) and Disease Entity Normalization (DEN). Our synthetic data yielded a substantial improvement for DEN: in all 3 training corpora, the top-1 accuracy of both SapBERT and KrissBERT improved by 3-9 points in overall performance and by 20-55 points on OOD data.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Background: Machine learning methods for clinical named entity recognition and entity normalization systems can utilize both labeled corpora and Knowledge Graphs (KGs) for learning. However, infrequently occurring concepts may have few mentions in training corpora and lack detailed descriptions or synonyms, even in large KGs. For Disease Entity Recognition (DER) and Disease Entity Normalization (DEN), this can result in fewer high quality training examples relative to the number of known diseases. Large Language Model (LLM) generation of synthetic training examples could improve performance in these information extraction tasks. Methods: We fine-tuned a LLaMa-2 13B Chat LLM to generate a synthetic corpus containing normalized mentions of concepts from the Unified Medical Language System (UMLS) Disease Semantic Group. We measured overall and Out of Distribution (OOD) performance for DER and DEN, with and without synthetic data augmentation. We evaluated performance on 3 different disease corpora using 4 different data augmentation strategies, assessed using BioBERT for DER and SapBERT and KrissBERT for DEN. Results: Our synthetic data yielded a substantial improvement for DEN: in all 3 training corpora, the top-1 accuracy of both SapBERT and KrissBERT improved by 3-9 points in overall performance and by 20-55 points on OOD data. A small improvement (1-2 points) was also seen for DER in overall performance, but only one dataset showed OOD improvement. Conclusion: LLM generation of normalized disease mentions can improve DEN relative to normalization approaches that do not utilize LLMs to augment data with synthetic mentions. Ablation studies indicate that performance gains for DEN were only partially attributable to improvements in OOD performance. The same approach has only a limited ability to improve DER. We make our software and dataset publicly available.
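The abstract reports top-1 accuracy both overall and on Out of Distribution (OOD) data. The following is a minimal sketch of how such a split could be computed, not the authors' evaluation code. It assumes that a mention counts as OOD when its gold concept ID never appears in the training corpus; that definition, the function names, and the concept IDs in the example are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def top1_accuracy(gold: List[str], predicted: List[str]) -> Optional[float]:
    """Fraction of mentions whose top-ranked concept ID equals the gold ID."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return None  # no mentions to score
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


def overall_and_ood_accuracy(
    train_concepts: Set[str],
    gold: List[str],
    predicted: List[str],
) -> Dict[str, Optional[float]]:
    """Top-1 accuracy on all test mentions and on the OOD subset.

    Assumption (not from the paper): a test mention is OOD when its gold
    concept ID is absent from the set of concept IDs seen in training.
    """
    overall = top1_accuracy(gold, predicted)
    ood_pairs = [(g, p) for g, p in zip(gold, predicted) if g not in train_concepts]
    ood_gold = [g for g, _ in ood_pairs]
    ood_pred = [p for _, p in ood_pairs]
    return {"overall": overall, "ood": top1_accuracy(ood_gold, ood_pred)}


# Hypothetical concept IDs, for illustration only.
scores = overall_and_ood_accuracy(
    train_concepts={"C1", "C2"},
    gold=["C1", "C2", "C3", "C4"],
    predicted=["C1", "C2", "C9", "C4"],
)
print(scores)
```

Under this definition, the OOD score is computed only over mentions of concepts the normalizer never saw during training, which is why it can move by a much larger margin than the overall score.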

Related papers

A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine [59.78991974851707]
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis. Most medical LLMs are trained on data from a single institution, which faces limitations in generalizability and safety in heterogeneous systems. We introduce the model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications.
arXiv Detail & Related papers (2026-01-29T18:48:21Z)
CoCoLIT: ControlNet-Conditioned Latent Image Translation for MRI to Amyloid PET Synthesis [2.333160549379721]
High dimensionality and structural complexity of 3D neuroimaging data pose challenges for MRI-to-PET translation. We present CoCoLIT, a diffusion-based latent generative framework that incorporates three main innovations. We evaluate CoCoLIT's performance on publicly available datasets and find that our model significantly outperforms state-of-the-art methods on both image-based and amyloid-related metrics.
arXiv Detail & Related papers (2025-08-02T09:58:30Z)
Prompt-Guided Latent Diffusion with Predictive Class Conditioning for 3D Prostate MRI Generation [1.6508709227918446]
Latent diffusion models (LDM) could alleviate data scarcity challenges affecting machine learning development for medical imaging. We propose a novel LDM conditioning approach to address these limitations. Our method achieves a 3D FID score of 0.025 on a size-limited 3D prostate MRI dataset.
arXiv Detail & Related papers (2025-06-11T23:12:48Z)
A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models [7.923208324118286]
We study patterns in the performance of OpenAI LLMs across a diverse sampling of biomedical relation extraction tasks. We found zero-shot performance to be close to that of fine-tuned methods.
arXiv Detail & Related papers (2025-04-05T07:08:54Z)
Fine-Tuning LLMs on Small Medical Datasets: Text Classification and Normalization Effectiveness on Cardiology reports and Discharge records [0.07499722271664144]
We investigate the effectiveness of fine-tuning large language models (LLMs) on small medical datasets for text classification and named entity recognition tasks. Our experiments show that fine-tuning improves performance on both tasks, with notable gains observed with as few as 200-300 training examples.
arXiv Detail & Related papers (2025-03-27T10:35:56Z)
Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search [59.75749613951193]
We propose Data Influence-oriented Tree Search (DITS) to guide both tree search and data selection. By leveraging influence scores, we effectively identify the most impactful data for system improvement. We derive influence score estimation methods tailored for non-differentiable metrics.
arXiv Detail & Related papers (2025-02-02T23:20:16Z)
A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation [5.011091042850546]
Adapting foundation models for medical image analysis requires finetuning them on a considerable amount of data. Collecting task-specific medical data for such finetuning at a central location raises many privacy concerns. Although Federated learning (FL) provides an effective means for training on private decentralized data, communication costs in federating large foundation models can quickly become a significant bottleneck.
arXiv Detail & Related papers (2024-07-31T16:48:06Z)
Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources [13.750202656564907]
Adverse event (AE) extraction is crucial for monitoring and analyzing the safety profiles of immunizations. This study aims to evaluate the effectiveness of large language models (LLMs) and traditional deep learning models in AE extraction.
arXiv Detail & Related papers (2024-06-26T03:56:21Z)
Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning [5.438725298163702]
Contrastive Self-Supervised Learning (SSL) offers a potential solution to labeled data scarcity. We propose uncovering the optimal augmentations for applying contrastive learning in 1D phonocardiogram (PCG) classification. We demonstrate that depending on its training distribution, the effectiveness of a fully-supervised model can degrade up to 32%, while SSL models only lose up to 10% or even improve in some cases.
arXiv Detail & Related papers (2023-12-01T11:06:00Z)
Contrast Everything: A Hierarchical Contrastive Framework for Medical Time-Series [12.469204999759965]
We present COMET, an innovative hierarchical framework that leverages data consistencies at all inherent levels in medical time series. Our meticulously designed model systematically captures data consistency from four potential levels: observation, sample, trial, and patient levels. We compare COMET against six baselines using three diverse datasets, which include ECG signals for myocardial infarction and EEG signals for Alzheimer's and Parkinson's diseases.
arXiv Detail & Related papers (2023-10-21T13:59:31Z)
The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation. We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare. Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach. Our approach is easy to integrate into any hybrid model and requires no external training data. Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods [0.21079694661943607]
We compare three methods for negation detection in Dutch clinical notes. We found that both the biLSTM and RoBERTa models consistently outperform the rule-based model in terms of F1 score, precision and recall.
arXiv Detail & Related papers (2022-09-01T14:00:13Z)
BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model. We show that applying WEAVER in a sequential manner results in similar word embedding distributions as doing a combined training on all data at once.
arXiv Detail & Related papers (2022-02-21T10:34:41Z)
Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU). Recent works have shown that using extra data and labels can improve the OOD detection performance. This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG). By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
Adversarial Feature Hallucination Networks for Few-Shot Learning [84.31660118264514]
Adversarial Feature Hallucination Networks (AFHN) is based on conditional Wasserstein Generative Adversarial networks (cWGAN). Two novel regularizers are incorporated into AFHN to encourage discriminability and diversity of the synthesized features.
arXiv Detail & Related papers (2020-03-30T02:43:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
