Uncovering Intervention Opportunities for Suicide Prevention with Language Model Assistants
- URL: http://arxiv.org/abs/2508.18541v2
- Date: Fri, 29 Aug 2025 21:59:37 GMT
- Title: Uncovering Intervention Opportunities for Suicide Prevention with Language Model Assistants
- Authors: Jaspreet Ranjit, Hyundong J. Cho, Claire J. Smerdon, Yoonsoo Nam, Myles Phung, Jonathan May, John R. Blosnich, Swabha Swayamdipta
- Abstract summary: We investigate the value of language models (LMs) as efficient assistants to data annotators and experts. We find that LM predictions match existing data annotations about 85% of the time across 50 NVDRS variables. We introduce a human-in-the-loop algorithm to assist experts in efficiently building and refining guidelines for annotating new variables.
- Score: 29.817116812169132
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Warning: This paper discusses topics of suicide and suicidal ideation, which may be distressing to some readers. The National Violent Death Reporting System (NVDRS) documents information about suicides in the United States, including free text narratives (e.g., circumstances surrounding a suicide). In a demanding public health data pipeline, annotators manually extract structured information from death investigation records following extensive guidelines developed painstakingly by experts. In this work, we facilitate data-driven insights from the NVDRS data to support the development of novel suicide interventions by investigating the value of language models (LMs) as efficient assistants to these (a) data annotators and (b) experts. We find that LM predictions match existing data annotations about 85% of the time across 50 NVDRS variables. In the cases where the LM disagrees with existing annotations, expert review reveals that LM assistants can surface annotation discrepancies 38% of the time. Finally, we introduce a human-in-the-loop algorithm to assist experts in efficiently building and refining guidelines for annotating new variables by allowing them to focus only on providing feedback for incorrect LM predictions. We apply our algorithm to a real-world case study for a new variable that characterizes victim interactions with lawyers and demonstrate that it achieves annotation quality comparable to that of a laborious manual approach. Our findings provide evidence that LMs can serve as effective assistants to public health researchers who handle sensitive data in high-stakes scenarios.
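The abstract describes the human-in-the-loop algorithm only at a high level. A minimal sketch of the loop it implies is given below; the `annotate`, `review`, and `revise` callables are hypothetical stand-ins for the LM annotator, the expert reviewer, and the guideline editor, not the authors' actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Record:
    """A death investigation record with a free-text narrative."""
    id: str
    narrative: str

def refine_guideline(
    guideline: str,
    records: List[Record],
    annotate: Callable[[str, Record], str],             # LM labels a record under a guideline
    review: Callable[[Record, str], Tuple[bool, str]],  # expert: (is_correct, feedback)
    revise: Callable[[str, List[str]], str],            # fold expert feedback into the guideline
    max_rounds: int = 5,
) -> str:
    """Iteratively refine an annotation guideline for a new variable.

    Each round, the LM annotates every record under the current guideline;
    experts review the predictions and give feedback only on the incorrect
    ones, and that feedback is folded back into the guideline.
    """
    for _ in range(max_rounds):
        feedback: List[str] = []
        for record in records:
            prediction = annotate(guideline, record)
            is_correct, note = review(record, prediction)
            if not is_correct:
                feedback.append(note)  # expert effort goes only to LM errors
        if not feedback:
            return guideline  # LM agrees with experts on every record
        guideline = revise(guideline, feedback)
    return guideline
```

The design point the abstract emphasizes is the early exit from review: experts never re-annotate from scratch, they only correct the LM where it is wrong, so expert effort shrinks as the guideline converges.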
Related papers
- Rethinking Suicidal Ideation Detection: A Trustworthy Annotation Framework and Cross-Lingual Model Evaluation [0.0]
Suicidal ideation detection is critical for real-time suicide prevention, yet its progress faces two under-explored challenges. Most available datasets are in English, but even among these, high-quality, human-annotated data remains scarce.
arXiv Detail & Related papers (2025-07-19T16:54:36Z)
- TextSleuth: Towards Explainable Tampered Text Detection [49.88698441048043]
We propose to explain the basis of tampered text detection with natural language via large multimodal models. To fill the data gap for this task, we propose a large-scale, comprehensive dataset, ETTD. Elaborate queries are introduced to generate high-quality anomaly descriptions with GPT4o. To automatically filter out low-quality annotations, we also propose to prompt GPT4o to recognize tampered texts.
arXiv Detail & Related papers (2024-12-19T13:10:03Z)
- DRS: Deep Question Reformulation With Structured Output [133.24623742929776]
Large language models (LLMs) can detect unanswerable questions, but struggle to assist users in reformulating these questions. We propose DRS: Deep Question Reformulation with Structured Output, a novel zero-shot method to assist users in reformulating questions. We show that DRS improves the reformulation accuracy of GPT-3.5 from 23.03% to 70.42%, while also enhancing the performance of open-source models.
arXiv Detail & Related papers (2024-11-27T02:20:44Z)
- Usefulness of LLMs as an Author Checklist Assistant for Scientific Papers: NeurIPS'24 Experiment [59.09144776166979]
Large language models (LLMs) represent a promising, but controversial, tool in aiding scientific peer review.
This study evaluates the usefulness of LLMs in a conference setting as a tool for vetting paper submissions against submission standards.
arXiv Detail & Related papers (2024-11-05T18:58:00Z)
- Investigating Annotator Bias in Large Language Models for Hate Speech Detection [5.589665886212444]
This paper delves into the biases present in Large Language Models (LLMs) when annotating hate speech data.
We analyze annotator biases, focusing on highly vulnerable groups.
We introduce our custom hate speech detection dataset, HateBiasNet, to conduct this research.
arXiv Detail & Related papers (2024-06-17T00:18:31Z)
- Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models [52.368110271614285]
We introduce AdvEval, a novel black-box adversarial framework against NLG evaluators.
AdvEval is specially tailored to generate data that yield strong disagreements between human and victim evaluators.
We conduct experiments on 12 victim evaluators and 11 NLG datasets, spanning tasks including dialogue, summarization, and question evaluation.
arXiv Detail & Related papers (2024-05-23T14:48:15Z)
- Non-Invasive Suicide Risk Prediction Through Speech Analysis [74.8396086718266]
We present a non-invasive, speech-based approach for automatic suicide risk assessment.
We extract three sets of features: wav2vec embeddings, interpretable speech and acoustic features, and deep learning-based spectral representations.
Our most effective speech model achieves a balanced accuracy of 66.2%.
arXiv Detail & Related papers (2024-04-18T12:33:57Z)
- From Narratives to Numbers: Valid Inference Using Language Model Predictions from Verbal Autopsy Narratives [5.730469631341288]
We develop a method for valid statistical inference using outcomes predicted from free-form text with state-of-the-art NLP techniques.
We leverage a suite of NLP techniques for cause-of-death (COD) prediction and, through empirical analysis of verbal autopsy (VA) data, demonstrate the effectiveness of our approach in handling transportability issues.
arXiv Detail & Related papers (2024-04-03T03:53:37Z)
- Uncovering Misattributed Suicide Causes through Annotation Inconsistency Detection in Death Investigation Notes [21.374488755816092]
The National Violent Death Reporting System (NVDRS) data is widely used for discovering the patterns and causes of death.
Recent studies have pointed to annotation inconsistencies within the NVDRS and their potential to produce erroneous suicide-cause attributions.
We present an empirical Natural Language Processing (NLP) approach to detect annotation inconsistencies and adopt a cross-validation-like paradigm to identify problematic instances.
arXiv Detail & Related papers (2024-03-28T14:03:12Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
However, they still struggle with accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
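The self-verification idea in the entry above lends itself to a short sketch: the same LM that extracts a value is queried a second time to cite its evidence and confirm or reject its own output. `call_lm` is a hypothetical stand-in for any text-completion client, and the prompts are illustrative, not the paper's.

```python
from typing import Callable, Optional

def extract_with_self_verification(
    note: str,
    field: str,
    call_lm: Callable[[str], str],  # hypothetical stand-in for an LM completion client
) -> Optional[str]:
    """Two-pass extraction: extract a value, then ask the same LM to
    ground it in the note and confirm or reject it."""
    # Pass 1: extraction.
    value = call_lm(
        f"Extract the {field} from the clinical note below. "
        f"Answer with the value only.\n\nNote:\n{note}"
    )
    # Pass 2: self-verification with provenance.
    verdict = call_lm(
        f"You previously extracted {field} = {value!r} from the note below. "
        "Quote the exact sentence that supports this value, then answer "
        f"YES or NO: is the extraction correct?\n\nNote:\n{note}"
    )
    # Keep the value only if the model can ground and confirm it.
    return value if "YES" in verdict.upper() else None
```

Discarding unconfirmed extractions trades recall for precision, which is the direction usually preferred in clinical and other mission-critical settings.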