Related papers: Machine Unlearning for Speaker-Agnostic Detection of Gender-Based Violence Condition in Speech

URL: http://arxiv.org/abs/2411.18177v2
Date: Fri, 26 Sep 2025 09:32:26 GMT
Title: Machine Unlearning for Speaker-Agnostic Detection of Gender-Based Violence Condition in Speech
Authors: Emma Reyner-Fuentes, Esther Rituerto-Gonzalez, Carmen Pelaez-Moreno
Abstract summary: Gender-based violence is a pervasive public health issue that severely impacts women's mental health. This study introduces a speaker-agnostic approach to detecting the gender-based violence victim condition from speech.
Score: 0.5352699766206809
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Gender-based violence is a pervasive public health issue that severely impacts women's mental health, often leading to conditions such as anxiety, depression, post-traumatic stress disorder, and substance abuse. Identifying the combination of these mental health conditions could therefore point to someone who is a victim of gender-based violence. While speech-based artificial intelligence tools show promise for mental health screening, their performance often deteriorates when encountering speech from previously unseen speakers, a sign that speaker traits may be confounding factors. This study introduces a speaker-agnostic approach to detecting the gender-based violence victim condition from speech, aiming to develop robust artificial intelligence models capable of generalizing across speakers. By employing domain-adversarial training, we reduce the influence of speaker identity on model predictions, achieving a 26.95% relative reduction in speaker identification accuracy while improving gender-based violence victim condition classification accuracy by 6.37% (relative). These results suggest that our models effectively capture paralinguistic biomarkers linked to the gender-based violence victim condition, rather than speaker-specific traits. Additionally, the model's predictions show moderate correlation with pre-clinical post-traumatic stress disorder symptoms, supporting the relevance of speech as a non-invasive tool for mental health monitoring. This work lays the foundation for ethical, privacy-preserving artificial intelligence systems to support clinical screening of gender-based violence survivors.
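Note on the reported figures: the abstract states both results as relative changes, which are not the same as changes in percentage points. The sketch below shows the difference. The function names and the baseline accuracies are hypothetical, chosen only for illustration; the paper's absolute accuracies are not given in this listing.

```python
def relative_change(baseline: float, new: float) -> float:
    """Relative change from baseline to new, expressed in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


def absolute_change(baseline: float, new: float) -> float:
    """Absolute change from baseline to new, in percentage points."""
    return new - baseline


# Hypothetical speaker-identification accuracies (in %), NOT from the paper.
baseline_speaker_acc = 80.0
adversarial_speaker_acc = 60.0

rel = relative_change(baseline_speaker_acc, adversarial_speaker_acc)
pts = absolute_change(baseline_speaker_acc, adversarial_speaker_acc)
print(f"relative change: {rel:.2f}%")         # relative change: -25.00%
print(f"absolute change: {pts:.2f} points")   # absolute change: -20.00 points
```

With these made-up numbers, a drop of 20 percentage points corresponds to a 25% relative reduction, so a relative figure such as 26.95% cannot be read as a percentage-point change without knowing the baseline.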

Related papers

DepFlow: Disentangled Speech Generation to Mitigate Semantic Bias in Depression Detection [54.209716321122194]
We present DepFlow, a depression-conditioned text-to-speech framework. A Depression Acoustic Camouflage learns speaker- and content-invariant depression embeddings through adversarial training. A flow-matching TTS model with FiLM modulation injects these embeddings into synthesis, enabling control over depressive severity. A prototype-based severity mapping mechanism provides smooth and interpretable manipulation across the depression continuum.
arXiv Detail & Related papers (2026-01-01T10:44:38Z)
Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection [9.82676920954754]
We introduce a domain adversarial training approach that explicitly considers gender differences in speech-based depression and PTSD detection. Experimental results show that our method notably improves detection performance, increasing the F1-score by up to 13.29 percentage points compared to the baseline.
arXiv Detail & Related papers (2025-05-06T09:29:14Z)
Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs. Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder. Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z)
CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy [67.23830698947637]
We propose a new benchmark, CBT-BENCH, for the systematic evaluation of cognitive behavioral therapy (CBT) assistance. We include three levels of tasks in CBT-BENCH: I: Basic CBT knowledge acquisition, with the task of multiple-choice questions; II: Cognitive model understanding, with the tasks of cognitive distortion classification, primary core belief classification, and fine-grained core belief classification; III: Therapeutic response generation, with the task of generating responses to patient speech in CBT therapy sessions. Experimental results indicate that while LLMs perform well in reciting CBT knowledge, they fall short in complex real-world scenarios.
arXiv Detail & Related papers (2024-10-17T04:52:57Z)
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders [59.515827458631975]
Mental health disorders are one of the most serious diseases in the world. Privacy concerns limit the accessibility of personalized treatment data. MentalArena is a self-play framework to train language models.
arXiv Detail & Related papers (2024-10-09T13:06:40Z)
AI-Driven Early Mental Health Screening with Limited Data: Analyzing Selfies of Pregnant Women [32.514036618021244]
Major Depressive Disorder and anxiety disorders affect millions globally, contributing significantly to the burden of mental health issues. Early screening is crucial for effective intervention, as timely identification of mental health issues can significantly improve treatment outcomes. This study explores the potential of AI models for ubiquitous depression-anxiety screening given face-centric selfies.
arXiv Detail & Related papers (2024-10-07T19:34:25Z)
Exploring Gender-Specific Speech Patterns in Automatic Suicide Risk Assessment [39.26231968260796]
This study involves a novel dataset comprising speech recordings of 20 patients who read neutral texts. We extract four speech representations encompassing interpretable and deep features. By applying gender-exclusive modelling, features extracted from an emotion fine-tuned wav2vec2.0 model can be utilised to discriminate high- from low- suicide risk with a balanced accuracy of 81%.
arXiv Detail & Related papers (2024-06-26T12:51:28Z)
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks [90.52808174102157]
In safety-critical applications such as medical imaging and autonomous driving, it is imperative to maintain high adversarial robustness to protect against potential adversarial attacks. A notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models. This study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) in the context of standard adversarial attacks.
arXiv Detail & Related papers (2024-05-14T18:05:19Z)
Non-Invasive Suicide Risk Prediction Through Speech Analysis [74.8396086718266]
We present a non-invasive, speech-based approach for automatic suicide risk assessment. We extract three sets of features, including wav2vec, interpretable speech and acoustic features, and deep learning-based spectral representations. Our most effective speech model achieves a balanced accuracy of 66.2%.
arXiv Detail & Related papers (2024-04-18T12:33:57Z)
Empowering Psychotherapy with Large Language Models: Cognitive Distortion Detection through Diagnosis of Thought Prompting [82.64015366154884]
We study the task of cognitive distortion detection and propose the Diagnosis of Thought (DoT) prompting. DoT performs diagnosis on the patient's speech via three stages: subjectivity assessment to separate the facts and the thoughts; contrastive reasoning to elicit the reasoning processes supporting and contradicting the thoughts; and schema analysis to summarize the cognition schemas. Experiments demonstrate that DoT obtains significant improvements over ChatGPT for cognitive distortion detection, while generating high-quality rationales approved by human experts.
arXiv Detail & Related papers (2023-10-11T02:47:21Z)
Beyond Neural-on-Neural Approaches to Speaker Gender Protection [2.741893145546753]
We show the importance of testing gender inference attacks based on speech features. We argue that researchers should use speech features to gain insight into how protective modifications change the speech signal.
arXiv Detail & Related papers (2023-06-30T14:26:49Z)
Harnessing the Power of Hugging Face Transformers for Predicting Mental Health Disorders in Social Networks [0.0]
This study explores how user-generated data can be used to predict mental disorder symptoms. Our study compares four different BERT models of Hugging Face with standard machine learning techniques. New models outperform the previous approach with an accuracy rate of up to 97%.
arXiv Detail & Related papers (2023-06-29T12:25:19Z)
DEPAC: a Corpus for Depression and Anxiety Detection from Speech [3.2154432166999465]
We introduce a novel mental distress analysis audio dataset DEPAC, labeled based on established thresholds on depression and anxiety screening tools. This large dataset comprises multiple speech tasks per individual, as well as relevant demographic information. We present a feature set consisting of hand-curated acoustic and linguistic features, which were found effective in identifying signs of mental illnesses in human speech.
arXiv Detail & Related papers (2023-06-20T12:21:06Z)
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults. Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
Cost-effective Models for Detecting Depression from Speech [4.269268432906194]
Depression is the most common psychological disorder and is considered as a leading cause of disability and suicide worldwide. A system capable of detecting signs of depression in human speech can contribute to ensuring timely and effective mental health care for individuals suffering from the disorder.
arXiv Detail & Related papers (2023-02-18T02:46:21Z)
Bias Reducing Multitask Learning on Mental Health Prediction [18.32551434711739]
There has been an increase in research in developing machine learning models for mental health detection or prediction. In this work, we aim to perform a fairness analysis and implement a multi-task learning based bias mitigation method on anxiety prediction models. Our analysis showed that our anxiety prediction base model introduced some bias with regards to age, income, ethnicity, and whether a participant is born in the U.S. or not.
arXiv Detail & Related papers (2022-08-07T02:28:32Z)
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation [59.41186714127256]
Dysarthric speech reconstruction (DSR) aims to improve the quality of dysarthric speech. Speaker encoder (SE) optimized for speaker verification has been explored to control the speaker identity. We propose a novel multi-task learning strategy, i.e., adversarial speaker adaptation (ASA).
arXiv Detail & Related papers (2022-02-18T08:59:36Z)
Investigation of Data Augmentation Techniques for Disordered Speech Recognition [69.50670302435174]
This paper investigates a set of data augmentation techniques for disordered speech recognition. Both normal and disordered speech were exploited in the augmentation process. The final speaker adapted system constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation produced up to 2.92% absolute word error rate (WER)
arXiv Detail & Related papers (2022-01-14T17:09:22Z)
Protecting gender and identity with disentangled speech representations [49.00162808063399]
We show that protecting gender information in speech is more effective than modelling speaker-identity information. We present a novel way to encode gender information and disentangle two sensitive biometric identifiers.
arXiv Detail & Related papers (2021-04-22T13:31:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.