Related papers: When Curiosity Signals Danger: Predicting Health Crises Through Online Medication Inquiries

URL: http://arxiv.org/abs/2509.11802v1
Date: Mon, 15 Sep 2025 11:31:25 GMT
Title: When Curiosity Signals Danger: Predicting Health Crises Through Online Medication Inquiries
Authors: Dvora Goncharok, Arbel Shifman, Alexander Apartsin, Yehudit Aperstein
Abstract summary: This study introduces a novel annotated dataset of medication-related questions extracted from online forums. Each entry is manually labelled for criticality based on clinical risk factors. Results highlight the potential of classical and modern methods to support real-time triage and alert systems.
Score: 40.12543056558646
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Online medical forums are a rich and underutilized source of insight into patient concerns, especially regarding medication use. Some of the many questions users pose may signal confusion, misuse, or even the early warning signs of a developing health crisis. Detecting these critical questions that may precede severe adverse events or life-threatening complications is vital for timely intervention and improving patient safety. This study introduces a novel annotated dataset of medication-related questions extracted from online forums. Each entry is manually labelled for criticality based on clinical risk factors. We benchmark the performance of six traditional machine learning classifiers using TF-IDF textual representations, alongside three state-of-the-art large language model (LLM)-based classification approaches that leverage deep contextual understanding. Our results highlight the potential of classical and modern methods to support real-time triage and alert systems in digital health spaces. The curated dataset is made publicly available to encourage further research at the intersection of patient-generated data, natural language processing, and early warning systems for critical health events. The dataset and benchmark are available at: https://github.com/Dvora-coder/LLM-Medication-QA-Risk-Classifier-MediGuard.
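The abstract names TF-IDF textual representations as the input to the traditional classifiers but does not list the classifiers themselves. The following is a minimal, standard-library-only sketch of that general idea, not the authors' pipeline: the example questions, the labels, and the nearest-centroid classifier are all illustrative assumptions standing in for the real dataset and the six benchmarked models.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep alphanumeric runs; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())

def fit_idf(docs):
    # Smoothed inverse document frequency over the training questions.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def normalize(vec):
    # Scale a sparse vector to unit L2 norm (empty vectors stay empty).
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def tfidf(text, idf):
    # Term frequency times IDF, ignoring terms unseen during fitting.
    counts = Counter(t for t in tokenize(text) if t in idf)
    return normalize({t: c * idf[t] for t, c in counts.items()})

def dot(a, b):
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def train(texts, labels):
    # Nearest-centroid stand-in classifier: one unit-norm centroid per label.
    idf = fit_idf(texts)
    sums = {}
    for text, label in zip(texts, labels):
        bucket = sums.setdefault(label, Counter())
        for term, weight in tfidf(text, idf).items():
            bucket[term] += weight
    return idf, {label: normalize(dict(total)) for label, total in sums.items()}

def predict(text, idf, centroids):
    # Pick the label whose centroid has the highest cosine similarity.
    vec = tfidf(text, idf)
    return max(centroids, key=lambda label: dot(vec, centroids[label]))

# Hypothetical toy questions; NOT taken from the published dataset.
texts = [
    "I took a double dose of warfarin and now my gums are bleeding",
    "Can I take ibuprofen with food",
    "My child swallowed several of my blood pressure pills",
    "What time of day is best to take vitamin d",
]
labels = ["critical", "non-critical", "critical", "non-critical"]

idf, centroids = train(texts, labels)
print(predict("I accidentally took a double dose and I am bleeding", idf, centroids))
```

A real benchmark along the lines the abstract describes would replace the toy data with the released dataset and the centroid rule with trained classifiers, and would report held-out metrics rather than a single prediction.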

Related papers

RSD-15K: A Large-Scale User-Level Annotated Dataset for Suicide Risk Detection on Social Media [0.0]
Social media is an important platform for individuals to express emotions and seek help. This paper introduces a large-scale dataset containing 15,000 user-level posts. Compared with existing datasets, this dataset retains complete user posting time sequence information.
arXiv Detail & Related papers (2025-07-14T09:26:26Z)
Privacy-Aware, Public-Aligned: Embedding Risk Detection and Public Values into Scalable Clinical Text De-Identification for Trusted Research Environments [0.0]
We show how direct and indirect identifiers vary by record type, clinical setting, and data flow, and show how changes in documentation practice can degrade model performance over time. Our findings highlight that privacy risk is context-dependent and cumulative, underscoring the need for adaptable, hybrid de-identification approaches.
arXiv Detail & Related papers (2025-06-01T17:45:57Z)
Uncertainty-aware abstention in medical diagnosis based on medical texts [87.88110503208016]
This study addresses the critical issue of reliability for AI-assisted medical diagnosis. We focus on the selection prediction approach that allows the diagnosis system to abstain from providing the decision if it is not confident in the diagnosis. We introduce HUQ-2, a new state-of-the-art method for enhancing reliability in selective prediction tasks.
arXiv Detail & Related papers (2025-02-25T10:15:21Z)
Confidential and Protected Disease Classifier using Fully Homomorphic Encryption [0.09424565541639365]
Many users seek potential causes on platforms like ChatGPT or Bard before consulting a medical professional for their ailment. Despite the convenience of such platforms, sharing personal medical data online poses risks, including the presence of malicious platforms. We propose a novel framework combining FHE and Deep Learning for a secure and private diagnosis system.
arXiv Detail & Related papers (2024-05-05T02:10:00Z)
Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z)
Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients' clinical state. We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability. Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes? [53.45585591262433]
We present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD. We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than xRAC-KD whereas xRAC-KD has potential advantages in production deployment scenarios.
arXiv Detail & Related papers (2022-10-28T04:06:07Z)
Federated Learning for Medical Applications: A Taxonomy, Current Trends, Challenges, and Future Research Directions [9.662980267339375]
We focus on medical applications of federated learning (FL), particularly in the context of global cancer diagnosis. Recent developments in FL have made it possible to train complex machine-learned models in a distributed manner.
arXiv Detail & Related papers (2022-08-05T21:41:15Z)
Classifying Cyber-Risky Clinical Notes by Employing Natural Language Processing [9.77063694539068]
Recently, some states within the United States of America require patients to have open access to their clinical notes. This research investigates methods for identifying security/privacy risks within clinical notes.
arXiv Detail & Related papers (2022-03-24T00:36:59Z)
BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey. An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys. We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.