Interpretable Graph-Language Modeling for Detecting Youth Illicit Drug Use
- URL: http://arxiv.org/abs/2510.15961v1
- Date: Sat, 11 Oct 2025 17:29:50 GMT
- Title: Interpretable Graph-Language Modeling for Detecting Youth Illicit Drug Use
- Authors: Yiyang Li, Zehong Wang, Zhengqing Yuan, Zheyuan Zhang, Keerthiram Murugesan, Chuxu Zhang, Yanfang Ye
- Abstract summary: Illicit drug use among teenagers and young adults (TYAs) remains a pressing public health concern. To detect illicit drug use among TYAs, researchers analyze large-scale surveys such as the Youth Risk Behavior Survey (YRBS) and the National Survey on Drug Use and Health (NSDUH). Existing modeling methods treat survey variables independently, overlooking latent and interconnected structures among them. We propose LAMI, a novel joint graph-language modeling framework for detecting illicit drug use and interpreting behavioral risk factors among TYAs.
- Score: 51.696842592898804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Illicit drug use among teenagers and young adults (TYAs) remains a pressing public health concern, with rising prevalence and long-term impacts on health and well-being. To detect illicit drug use among TYAs, researchers analyze large-scale surveys such as the Youth Risk Behavior Survey (YRBS) and the National Survey on Drug Use and Health (NSDUH), which preserve rich demographic, psychological, and environmental factors related to substance use. However, existing modeling methods treat survey variables independently, overlooking latent and interconnected structures among them. To address this limitation, we propose LAMI (LAtent relation Mining with bi-modal Interpretability), a novel joint graph-language modeling framework for detecting illicit drug use and interpreting behavioral risk factors among TYAs. LAMI represents individual responses as relational graphs, learns latent connections through a specialized graph structure learning layer, and integrates a large language model to generate natural language explanations grounded in both graph structures and survey semantics. Experiments on the YRBS and NSDUH datasets show that LAMI outperforms competitive baselines in predictive accuracy. Interpretability analyses further demonstrate that LAMI reveals meaningful behavioral substructures and psychosocial pathways, such as family dynamics, peer influence, and school-related distress, that align with established risk factors for substance use.
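The abstract describes representing each survey response as a relational graph and learning latent connections among variables. The sketch below is only a toy illustration of that general idea, not LAMI's graph structure learning layer, whose details are not given here. It assumes each survey variable already has a feature vector and links each variable to its most similar neighbor using a fixed cosine similarity rather than a learned one. The variable names, feature values, and the `learn_edges` helper are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def learn_edges(node_features, k=1):
    """Connect each node to its k most similar other nodes.

    Returns a set of undirected edges (i, j) with i < j. This is a fixed
    top-k similarity rule, standing in for a trainable structure learner.
    """
    n = len(node_features)
    edges = set()
    for i in range(n):
        scored = sorted(
            ((cosine(node_features[i], node_features[j]), j)
             for j in range(n) if j != i),
            reverse=True,
        )
        for _, j in scored[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical survey variables with made-up feature vectors.
features = {
    "family_conflict": [1.0, 0.0, 0.2],
    "peer_drug_use": [0.0, 1.0, 0.1],
    "school_distress": [0.9, 0.1, 0.3],
}
names = list(features)
edges = learn_edges([features[name] for name in names], k=1)
named_edges = sorted((names[i], names[j]) for i, j in edges)
print(named_edges)
```

In a trained model the similarity function would be parameterized and optimized jointly with the prediction objective. The fixed rule here only shows how latent links between variables can be turned into an explicit edge set.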
Related papers
- Integrating Genomics into Multimodal EHR Foundation Models [56.31910745104141]
This paper introduces an innovative EHR foundation model that integrates Polygenic Risk Scores (PRS) as a foundational data modality.
The framework aims to learn complex relationships between clinical data and genetic predispositions.
This approach is pivotal for unlocking new insights into disease prediction, proactive health management, risk stratification, and personalized treatment strategies.
arXiv Detail & Related papers (2025-10-24T15:56:40Z)
- Interpretable Machine Learning for Cognitive Aging: Handling Missing Data and Uncovering Social Determinant [28.20784930277189]
Early detection of Alzheimer's disease (AD) is crucial because its neurodegenerative effects are irreversible.
We predict cognitive performance from social determinants of health using the NIH NIA-supported PREPARE Challenge Phase 2 dataset.
arXiv Detail & Related papers (2025-10-13T03:04:10Z)
- Spurious Correlations and Beyond: Understanding and Mitigating Shortcut Learning in SDOH Extraction with Large Language Models [3.3408746880885003]
Large language models (LLMs) have shown promise, but they may rely on superficial cues leading to spurious predictions.
We demonstrate that mentions of alcohol or smoking can falsely induce models to predict current/past drug use where none is present.
We evaluate mitigation strategies, such as prompt engineering and chain-of-thought reasoning, to reduce these false positives.
arXiv Detail & Related papers (2025-05-30T18:11:33Z)
- Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs [33.755845172595365]
Growing evidence suggests that social determinants of health (SDoH) affect individuals' risks of developing Alzheimer's disease (AD) and related dementias.
This study presents a novel, automated framework to mine SDoH knowledge from extensive literature and integrate it with AD-related biological entities.
Our framework shows promise for enhancing knowledge discovery in AD and can be generalized to other SDoH-related research areas.
arXiv Detail & Related papers (2024-10-04T21:39:30Z)
- Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study [61.74571814707054]
We evaluate whether every generated sentence is grounded in retrieved documents or the model's pre-training data.
Across 3 datasets and 4 model families, our findings reveal that a significant fraction of generated sentences are consistently ungrounded.
Our results show that while larger models tend to ground their outputs more effectively, a significant portion of correct answers remains compromised by hallucinations.
arXiv Detail & Related papers (2024-04-10T14:50:10Z)
- PGraphDTA: Improving Drug Target Interaction Prediction using Protein Language Models and Contact Maps [4.590060921188914]
A key aspect of drug discovery involves identifying novel drug-target (DT) interactions.
Protein-ligand interactions exhibit a continuum of binding strengths, known as binding affinity.
We propose novel enhancements to enhance their performance.
arXiv Detail & Related papers (2023-10-06T05:00:25Z)
- Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting [64.80538055623842]
Sociodemographic prompting is a technique that steers the output of prompt-based models towards answers that humans with specific sociodemographic profiles would give.
We show that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks.
arXiv Detail & Related papers (2023-09-13T15:42:06Z)
- SciMON: Scientific Inspiration Machines Optimized for Novelty [68.46036589035539]
We explore and enhance the ability of neural language models to generate novel scientific directions grounded in literature.
We take a dramatic departure with a novel setting in which models use as input background contexts.
We present SciMON, a modeling framework that uses retrieval of "inspirations" from past scientific papers.
arXiv Detail & Related papers (2023-05-23T17:12:08Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We employ association rules as new representations using association rule mining based on the original features to increase model interpretability.
arXiv Detail & Related papers (2022-09-29T17:44:14Z)
- Communicative Subgraph Representation Learning for Multi-Relational Inductive Drug-Gene Interaction Prediction [17.478102754113294]
We propose a novel Communicative Subgraph representation learning for Multi-relational Inductive drug-Gene interactions prediction (CoSMIG).
The model strengthened the relations on the drug-gene graph through a communicative message passing mechanism.
Our method outperformed state-of-the-art baselines in the transductive scenarios and achieved superior performance in the inductive ones.
arXiv Detail & Related papers (2022-05-12T08:53:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.