Promoting the Responsible Development of Speech Datasets for Mental Health and Neurological Disorders Research
- URL: http://arxiv.org/abs/2406.04116v2
- Date: Mon, 17 Feb 2025 12:44:01 GMT
- Title: Promoting the Responsible Development of Speech Datasets for Mental Health and Neurological Disorders Research
- Authors: Eleonora Mancini, Ana Tanevska, Andrea Galassi, Alessio Galatolo, Federico Ruggeri, Paolo Torroni
- Abstract summary: We chart the landscape of available speech datasets for mental health and neurological disorders. We distill it into an actionable checklist focused on ethical concerns to foster more responsible research.
- Score: 10.939564452457896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current research in machine learning and artificial intelligence is largely centered on modeling and performance evaluation, less so on data collection. However, recent research demonstrated that limitations and biases in data may negatively impact trustworthiness and reliability. These aspects are particularly impactful on sensitive domains such as mental health and neurological disorders, where speech data are used to develop AI applications for patients and healthcare providers. In this paper, we chart the landscape of available speech datasets for this domain, to highlight possible pitfalls and opportunities for improvement and promote fairness and diversity. We present a comprehensive list of desiderata for building speech datasets for mental health and neurological disorders and distill it into an actionable checklist focused on ethical concerns to foster more responsible research.
Related papers
- Systematic FAIRness Assessment of Open Voice Biomarker Datasets for Mental Health and Neurodegenerative Diseases [0.0]
Voice biomarkers are promising tools for non-invasive detection and monitoring of mental health and neurodegenerative diseases. We present the first systematic FAIR evaluation of 27 publicly available voice biomarker datasets. Mental health datasets exhibited greater variability in FAIR scores, while neurodegenerative datasets were slightly more consistent.
arXiv Detail & Related papers (2025-08-14T06:55:27Z) - A Comprehensive Review of Datasets for Clinical Mental Health AI Systems [55.67299586253951]
We present the first comprehensive survey of clinical mental health datasets relevant to the training and development of AI-powered clinical assistants. Our survey identifies critical gaps such as a lack of longitudinal data, limited cultural and linguistic representation, inconsistent collection and annotation standards, and a lack of modalities in synthetic data.
arXiv Detail & Related papers (2025-08-13T13:42:35Z) - MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance [13.373260490163709]
MentalChat16K is an English benchmark dataset combining a synthetic mental health counseling dataset and a dataset of anonymized transcripts from interventions between Behavioral Health Coaches and Caregivers of patients in palliative or hospice care.
Covering a diverse range of conditions like depression, anxiety, and grief, this curated dataset is designed to facilitate the development and evaluation of large language models for conversational mental health assistance.
arXiv Detail & Related papers (2025-03-13T20:25:10Z) - Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs.
Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder.
Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z) - A Tutorial on Clinical Speech AI Development: From Data Collection to Model Validation [19.367198670893778]
This tutorial paper provides an overview of the key components required for robust development of clinical speech AI.
The goal is to provide comprehensive guidance on building models whose inputs and outputs link to the more interpretable and clinically meaningful aspects of speech.
arXiv Detail & Related papers (2024-10-29T00:58:15Z) - AI-Driven Healthcare: A Survey on Ensuring Fairness and Mitigating Bias [2.398440840890111]
AI applications have significantly improved diagnostic accuracy, treatment personalization, and patient outcome predictions.
These advancements also introduce substantial ethical and fairness challenges.
These biases can lead to disparities in healthcare delivery, affecting diagnostic accuracy and treatment outcomes across different demographic groups.
arXiv Detail & Related papers (2024-07-29T02:39:17Z) - TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z) - Explainable AI for Mental Disorder Detection via Social Media: A survey and outlook [0.7689629183085726]
We conduct a thorough survey to explore the intersection of data science, artificial intelligence, and mental healthcare.
A significant portion of the population actively engages in online social media platforms, creating a vast repository of personal data.
The paper navigates through traditional diagnostic methods, state-of-the-art data- and AI-driven research studies, and the emergence of explainable AI (XAI) models for mental healthcare.
arXiv Detail & Related papers (2024-06-10T02:51:16Z) - Path-Specific Causal Reasoning for Fairness-aware Cognitive Diagnosis [45.935488572673215]
We design a novel Path-Specific Causal Reasoning Framework (PSCRF) to eliminate sensitive attributes of students.
Extensive experiments over real-world datasets (e.g., PISA dataset) demonstrate the effectiveness of our proposed PSCRF.
arXiv Detail & Related papers (2024-06-05T08:47:30Z) - A Survey of Artificial Intelligence in Gait-Based Neurodegenerative Disease Diagnosis [51.07114445705692]
Neurodegenerative diseases (NDs) traditionally require extensive healthcare resources and human effort for medical diagnosis and monitoring.
As a crucial disease-related motor symptom, human gait can be exploited to characterize different NDs.
Current advances in artificial intelligence (AI) models enable automatic gait analysis for ND identification and classification.
arXiv Detail & Related papers (2024-05-21T06:44:40Z) - Generative AI-Driven Human Digital Twin in IoT-Healthcare: A Comprehensive Survey [53.691704671844406]
The Internet of things (IoT) can significantly enhance the quality of human life, specifically in healthcare.
The human digital twin (HDT) is proposed as an innovative paradigm that can comprehensively characterize the replication of the individual human body.
HDT is envisioned to empower IoT-healthcare beyond the application of healthcare monitoring by acting as a versatile and vivid human digital testbed.
Generative artificial intelligence (GAI) may be a promising solution because it can leverage advanced AI algorithms to automatically create, manipulate, and modify valuable yet diverse data.
arXiv Detail & Related papers (2024-01-22T03:17:41Z) - DEPAC: a Corpus for Depression and Anxiety Detection from Speech [3.2154432166999465]
We introduce a novel mental distress analysis audio dataset DEPAC, labeled based on established thresholds on depression and anxiety screening tools.
This large dataset comprises multiple speech tasks per individual, as well as relevant demographic information.
We present a feature set consisting of hand-curated acoustic and linguistic features, which were found effective in identifying signs of mental illnesses in human speech.
arXiv Detail & Related papers (2023-06-20T12:21:06Z) - An Annotated Dataset for Explainable Interpersonal Risk Factors of Mental Disturbance in Social Media Posts [0.0]
We construct and release a new annotated dataset with human-labelled explanations and classification of Interpersonal Risk Factors (IRF) affecting mental disturbance on social media.
We establish baseline models on our dataset, facilitating future research directions to develop real-time personalized AI models by detecting patterns of Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) in the emotional spectrum of a user's historical social media profile.
arXiv Detail & Related papers (2023-05-30T04:08:40Z) - GDPR Compliant Collection of Therapist-Patient-Dialogues [48.091760741427656]
We elaborate on the challenges we faced in starting our collection of therapist-patient dialogues in a psychiatry clinic under the General Data Protection Regulation (GDPR) of the European Union.
We give an overview of each step in our procedure and point out the potential pitfalls to motivate further research in this field.
arXiv Detail & Related papers (2022-11-22T15:51:10Z) - Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z) - Adaptive cognitive fit: Artificial intelligence augmented management of information facets and representations [62.997667081978825]
Explosive growth in big data technologies and artificial intelligence (AI) applications has led to increasing pervasiveness of information facets.
Information facets, such as equivocality and veracity, can dominate and significantly influence human perceptions of information.
We suggest that artificially intelligent technologies that can adapt information representations to overcome cognitive limitations are necessary.
arXiv Detail & Related papers (2022-04-25T02:47:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.