Holistix: A Dataset for Holistic Wellness Dimensions Analysis in Mental Health Narratives
- URL: http://arxiv.org/abs/2507.09565v2
- Date: Thu, 17 Jul 2025 06:11:39 GMT
- Title: Holistix: A Dataset for Holistic Wellness Dimensions Analysis in Mental Health Narratives
- Authors: Heba Shakeel, Tanvir Ahmad, Chandni Saxena,
- Abstract summary: We introduce a dataset for classifying wellness dimensions in social media user posts, covering six key aspects: physical, emotional, social, intellectual, spiritual, and vocational. The dataset is designed to capture these dimensions in user-generated content, with a comprehensive annotation framework developed under the guidance of domain experts. We evaluate both traditional machine learning models and advanced transformer-based models for this multi-class classification task, with performance assessed using precision, recall, and F1-score, averaged over 10-fold cross-validation.
- Score: 1.0446041735532203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a dataset for classifying wellness dimensions in social media user posts, covering six key aspects: physical, emotional, social, intellectual, spiritual, and vocational. The dataset is designed to capture these dimensions in user-generated content, with a comprehensive annotation framework developed under the guidance of domain experts. This framework allows for the classification of text spans into the appropriate wellness categories. We evaluate both traditional machine learning models and advanced transformer-based models for this multi-class classification task, with performance assessed using precision, recall, and F1-score, averaged over 10-fold cross-validation. Post-hoc explanations are applied to ensure the transparency and interpretability of model decisions. The proposed dataset contributes to region-specific wellness assessments in social media and paves the way for personalized well-being evaluations and early intervention strategies in mental health. We adhere to ethical considerations for constructing and releasing our experiments and dataset publicly on GitHub.
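The evaluation protocol named in the abstract (precision, recall, and F1-score averaged over 10-fold cross-validation) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code. The abstract does not say whether the scores are macro- or weighted-averaged or whether the folds are stratified, so macro averaging and a simple interleaved split are assumptions here. The helper names (`macro_prf`, `kfold_indices`, `cross_validate`) and the majority-class stand-in classifier are hypothetical.

```python
from collections import Counter
from statistics import mean

# The six wellness dimensions named in the abstract.
LABELS = ["physical", "emotional", "social", "intellectual", "spiritual", "vocational"]


def macro_prf(y_true, y_pred, labels=LABELS):
    """Return macro-averaged (precision, recall, F1) over `labels`.

    Assumption: macro averaging. A label with no predictions or no true
    instances contributes 0.0 to the corresponding average.
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return mean(precisions), mean(recalls), mean(f1s)


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx); fold i holds out samples whose index % k == i."""
    for fold in range(k):
        test_idx = [i for i in range(n_samples) if i % k == fold]
        train_idx = [i for i in range(n_samples) if i % k != fold]
        yield train_idx, test_idx


def cross_validate(texts, labels, fit, predict, k=10):
    """Average the per-fold macro precision, recall, and F1 over k folds."""
    fold_scores = []
    for train_idx, test_idx in kfold_indices(len(texts), k):
        model = fit([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        y_true = [labels[i] for i in test_idx]
        y_pred = [predict(model, texts[i]) for i in test_idx]
        fold_scores.append(macro_prf(y_true, y_pred))
    return tuple(mean(column) for column in zip(*fold_scores))


# Hypothetical stand-in classifier: always predicts the most frequent training label.
def fit_majority(train_texts, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, text):
    return model
```

Any model exposing a `fit`/`predict` pair of this shape could be passed to `cross_validate` in place of the majority-class baseline; in practice a library routine with stratified folds would likely be preferable for an imbalanced six-class dataset.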
Related papers
- Contextual Embedding-based Clustering to Identify Topics for Healthcare Service Improvement [3.9726806016869936]
This study explores unsupervised methods to extract meaningful topics from 439 survey responses collected from a healthcare system in Wisconsin, USA. A keyword-based filtering approach was applied to isolate complaint-related feedback using a domain-specific lexicon. To improve coherence and interpretability where data are scarce and consist of short-texts, we propose kBERT.
arXiv Detail & Related papers (2025-04-18T20:38:24Z)
- Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice [2.9775344067885974]
In psychological practices, standardized questionnaires serve as essential tools for assessing mental health through structured, clinically-validated questions. We propose a novel questionnaire-guided screening framework that bridges psychological practice and computational methods. Our approach links unstructured social media content and standardized clinical assessments by retrieving relevant posts for each questionnaire item.
arXiv Detail & Related papers (2025-01-02T00:01:54Z)
- Pitfalls of topology-aware image segmentation [81.19923502845441]
We identify critical pitfalls in model evaluation that include inadequate connectivity choices, overlooked topological artifacts, and inappropriate use of evaluation metrics. We propose a set of actionable recommendations to establish fair and robust evaluation standards for topology-aware medical image segmentation methods.
arXiv Detail & Related papers (2024-12-19T08:11:42Z)
- Diagnosing Medical Datasets with Training Dynamics [0.0]
This study explores the potential of using training dynamics as an automated alternative to human annotation.
The framework used is Data Maps, which classifies data points into categories such as easy-to-learn, hard-to-learn, and ambiguous.
A comprehensive evaluation was conducted to assess the feasibility and transferability of the Data Maps framework to the medical domain.
arXiv Detail & Related papers (2024-11-03T18:37:35Z)
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- WellXplain: Wellness Concept Extraction and Classification in Reddit Posts for Mental Health Analysis [8.430481660019451]
In traditional therapy sessions, professionals manually pinpoint the origins and outcomes of underlying mental challenges.
We introduce an approach to this intricate mental health analysis by framing the identification of wellness dimensions in Reddit content as a wellness concept extraction and categorization challenge.
We've curated a unique dataset named WELLXPLAIN, comprising 3,092 entries and totaling 72,813 words.
arXiv Detail & Related papers (2023-08-25T23:50:05Z)
- MET: Multimodal Perception of Engagement for Telehealth [52.54282887530756]
We present MET, a learning-based algorithm for perceiving a human's level of engagement from videos.
We release a new dataset, MEDICA, for mental health patient engagement detection.
arXiv Detail & Related papers (2020-11-17T15:18:38Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- Jointly Predicting Job Performance, Personality, Cognitive Ability, Affect, and Well-Being [42.67003631848889]
We create a benchmark for predictive analysis of individuals from a perspective that integrates physical and physiological behavior, psychological states and traits, and job performance.
We design data mining techniques as a benchmark and use real, noisy, and incomplete data derived from wearable sensors to predict 19 constructs based on 12 standardized, well-validated tests.
arXiv Detail & Related papers (2020-06-10T14:30:29Z)
- DeepCoDA: personalized interpretability for compositional health data [58.841559626549376]
Interpretability allows the domain-expert to evaluate the model's relevance and reliability.
In the healthcare setting, interpretable models should implicate relevant biological mechanisms independent of technical factors.
We define personalized interpretability as a measure of sample-specific feature attribution.
arXiv Detail & Related papers (2020-06-02T05:14:22Z)
- A Revised Generative Evaluation of Visual Dialogue [80.17353102854405]
We propose a revised evaluation scheme for the VisDial dataset.
We measure consensus between answers generated by the model and a set of relevant answers.
We release these sets and code for the revised evaluation scheme as DenseVisDial.
arXiv Detail & Related papers (2020-04-20T13:26:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.