Ontology-based Semantic Similarity Measures for Clustering Medical Concepts in Drug Safety
- URL: http://arxiv.org/abs/2503.20737v2
- Date: Fri, 11 Apr 2025 18:03:25 GMT
- Title: Ontology-based Semantic Similarity Measures for Clustering Medical Concepts in Drug Safety
- Authors: Jeffery L Painter, François Haguinet, Gregory E Powell, Andrew Bate
- Abstract summary: Six semantic similarity measures (SSMs) were evaluated for clustering MedDRA Preferred Terms (PTs) in drug safety data. We found that intrinsic information content (IC)-based measures, especially INTRINSIC-LIN and SOKAL, consistently yield better clustering accuracy. Our findings highlight the promise of IC-based SSMs in enhancing pharmacovigilance by improving early signal detection and reducing manual review.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic similarity measures (SSMs) are widely used in biomedical research but remain underutilized in pharmacovigilance. This study evaluates six ontology-based SSMs for clustering MedDRA Preferred Terms (PTs) in drug safety data. Using the Unified Medical Language System (UMLS), we assess each method's ability to group PTs around medically meaningful centroids. A high-throughput framework was developed, with a Java API and Python and R interfaces supporting large-scale similarity computations. Results show that while path-based methods perform moderately, with F1 scores of 0.36 for WUPALMER and 0.28 for LCH, intrinsic information content (IC)-based measures, especially INTRINSIC-LIN and SOKAL, consistently yield better clustering accuracy (F1 score of 0.403). Validated against expert review and Standardised MedDRA Queries (SMQs), our results highlight the promise of IC-based SSMs in enhancing pharmacovigilance workflows by improving early signal detection and reducing manual review.
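For readers unfamiliar with the measures compared above, the sketch below shows how an intrinsic information content (IC) Lin similarity can be computed over a tiny is-a hierarchy. It is a minimal illustration only, assuming Seco-style intrinsic IC, IC(c) = 1 - log(hypo(c)+1)/log(N), and Lin's ratio 2*IC(LCS)/(IC(c1)+IC(c2)); the toy concepts and function names are hypothetical and are not the paper's Java/Python/R API, which operates over MedDRA via the UMLS.

```python
import math

# Toy is-a hierarchy (child -> parents). Hypothetical stand-ins for
# MedDRA Preferred Terms; the paper's framework traverses MedDRA via UMLS.
PARENTS = {
    "nausea": {"gi_symptom"},
    "vomiting": {"gi_symptom"},
    "gi_symptom": {"symptom"},
    "headache": {"neuro_symptom"},
    "neuro_symptom": {"symptom"},
    "symptom": set(),  # root concept
}

def ancestors(concept):
    """All concepts subsuming `concept`, including itself."""
    seen, stack = set(), [concept]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen

def hyponym_count(concept):
    """Number of concepts strictly subsumed by `concept`."""
    return sum(1 for c in PARENTS if c != concept and concept in ancestors(c))

def intrinsic_ic(concept):
    """Seco-style intrinsic IC: leaves score 1.0, the root scores 0.0."""
    return 1.0 - math.log(hyponym_count(concept) + 1) / math.log(len(PARENTS))

def intrinsic_lin(c1, c2):
    """Lin similarity using the most informative common ancestor (MICA)."""
    ic_mica = max(intrinsic_ic(c) for c in ancestors(c1) & ancestors(c2))
    denom = intrinsic_ic(c1) + intrinsic_ic(c2)
    return 2.0 * ic_mica / denom if denom else 0.0

print(intrinsic_lin("nausea", "vomiting"))   # siblings under gi_symptom -> ~0.39
print(intrinsic_lin("nausea", "headache"))   # share only the root -> 0.0
```

Pairwise scores like these would then feed the clustering step the abstract describes (grouping PTs around medically meaningful centroids); path-based alternatives such as WUPALMER and LCH instead derive the score from taxonomy depth and shortest-path length rather than IC.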
Related papers
- LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs [4.262074310505135]
This paper explores prompt-based medical entity recognition using large language models (LLMs). GPT-4o with a prompt ensemble achieved the highest classification performance, with an F1-score of 0.95 and recall of 0.98. The ensemble method improved reliability by aggregating outputs through embedding-based similarity and majority voting.
arXiv Detail & Related papers (2025-05-13T16:11:29Z) - Early Detection of Multidrug Resistance Using Multivariate Time Series Analysis and Interpretable Patient-Similarity Representations [8.062368743143388]
Multidrug Resistance (MDR) is a critical global health issue, causing increased hospital stays, healthcare costs, and mortality.
This study proposes an interpretable Machine Learning framework for MDR prediction, aiming for both accurate inference and enhanced explainability.
arXiv Detail & Related papers (2025-04-24T16:19:13Z) - Med-CoDE: Medical Critique based Disagreement Evaluation Framework [72.42301910238861]
The reliability and accuracy of large language models (LLMs) in medical contexts remain critical concerns.
Current evaluation methods often lack robustness and fail to provide a comprehensive assessment of LLM performance.
We propose Med-CoDE, a specifically designed evaluation framework for medical LLMs to address these challenges.
arXiv Detail & Related papers (2025-04-21T16:51:11Z) - Bayesian dynamic borrowing considering semantic similarity between outcomes for disproportionality analysis in FAERS [0.0]
We present a Bayesian dynamic borrowing (BDB) approach to enhance the quantitative identification of adverse events (AEs) in spontaneous reporting systems (SRSs).
The method embeds a robust meta-analytic predictive (MAP) prior within a Bayesian hierarchical model and incorporates semantic similarity measures (SSMs).
Using data from the FDA Adverse Event Reporting System (FAERS) between 2015 and 2019, we evaluate this approach against standard Information Component (IC) analysis and IC with borrowing at the MedDRA high-level group term (HLGT) level.
arXiv Detail & Related papers (2025-04-16T13:06:24Z) - Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases [48.87360916431396]
We introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases, annotated with reasoning references. We propose a framework encompassing three critical stages: examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey. Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking.
arXiv Detail & Related papers (2025-03-06T18:35:39Z) - An LLM-Powered Agent for Physiological Data Analysis: A Case Study on PPG-based Heart Rate Estimation [2.0195680688695594]
We develop an LLM-powered agent for physiological time-series analysis. Built on the OpenCHA framework, our agent features an orchestrator that integrates user interaction, data sources, and analytical tools. Results demonstrate that our agent significantly outperforms benchmark models by achieving lower error rates and more reliable HR estimations.
arXiv Detail & Related papers (2025-02-18T13:09:59Z) - MedHallBench: A New Benchmark for Assessing Hallucination in Medical Large Language Models [0.0]
Medical Large Language Models (MLLMs) have demonstrated potential in healthcare applications. Their propensity for hallucinations presents substantial risks to patient care. This paper introduces MedHallBench, a comprehensive benchmark framework for evaluating and mitigating hallucinations in MLLMs.
arXiv Detail & Related papers (2024-12-25T16:51:29Z) - Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z) - A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models [57.88111980149541]
We introduce Asclepius, a novel Med-MLLM benchmark that assesses Med-MLLMs in terms of distinct medical specialties and different diagnostic capacities. Grounded in 3 proposed core principles, Asclepius ensures a comprehensive evaluation by encompassing 15 medical specialties. We also provide an in-depth analysis of 6 Med-MLLMs and compare them with 3 human specialists.
arXiv Detail & Related papers (2024-02-17T08:04:23Z) - Simulation-based Inference for Cardiovascular Models [43.55219268578912]
We use simulation-based inference to solve the inverse problem of mapping waveforms back to plausible physiological parameters. We perform an in-silico uncertainty analysis of five biomarkers of clinical interest. We study the gap between in-vivo and in-silico with the MIMIC-III waveform database.
arXiv Detail & Related papers (2023-07-26T02:34:57Z) - FineEHR: Refine Clinical Note Representations to Improve Mortality Prediction [3.9026461169566673]
Large-scale electronic health records provide machine learning models with an abundance of clinical text and vital sign data.
Despite the emergence of advanced Natural Language Processing (NLP) algorithms for clinical note analysis, the complex textual structure and noise present in raw clinical data have posed significant challenges.
We propose FINEEHR, a system that utilizes two representation learning techniques, namely metric learning and fine-tuning, to refine clinical note embeddings.
arXiv Detail & Related papers (2023-04-24T02:42:52Z) - Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - Auditing Algorithmic Fairness in Machine Learning for Health with Severity-Based LOGAN [70.76142503046782]
We propose auditing machine learning-based (ML) healthcare tools for bias with SLOGAN, an automatic tool for capturing local biases in a clinical prediction task.
SLOGAN adapts an existing tool, LOcal Group biAs detectioN (LOGAN), by contextualizing group bias detection in patient illness severity and past medical history.
On average, SLOGAN identifies larger fairness disparities than LOGAN in over 75% of patient groups while maintaining clustering quality.
arXiv Detail & Related papers (2022-11-16T08:04:12Z) - Multiple Sclerosis Severity Classification From Clinical Text [5.8335613930036265]
We present MS-BERT, the first publicly available transformer model trained on real clinical data other than MIMIC.
Next, we present MSBC, a classifier that applies MS-BERT to generate embeddings and predict EDSS and functional subscores.
Finally, we explore combining MSBC with other models through the use of Snorkel to generate scores for unlabelled consult notes.
arXiv Detail & Related papers (2020-10-29T02:15:23Z) - MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response.
We formalize the prognosis modeling as a multi-modal asynchronous time series classification task.
Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.