Recommendations for Bayesian hierarchical model specifications for
case-control studies in mental health
- URL: http://arxiv.org/abs/2011.01725v1
- Date: Tue, 3 Nov 2020 14:19:59 GMT
- Authors: Vincent Valton, Toby Wise, Oliver J. Robinson
- Abstract summary: Researchers must choose whether to assume all subjects are drawn from a common population, or to model them as deriving from separate populations.
We ran systematic simulations on synthetic multi-group behavioural data from a commonly used bandit task.
We found that fitting groups separately provided the most accurate and robust inference across all conditions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical model fitting has become commonplace for case-control studies of
cognition and behaviour in mental health. However, these techniques require us
to formalise assumptions about the data-generating process at the group level,
which may not be known. Specifically, researchers typically must choose whether
to assume all subjects are drawn from a common population, or to model them as
deriving from separate populations. These assumptions have profound
implications for computational psychiatry, as they affect the resulting
inference (latent parameter recovery) and may conflate or mask true group-level
differences. To test these assumptions we ran systematic simulations on
synthetic multi-group behavioural data from a commonly used multi-armed bandit
task (reinforcement learning task). We then examined recovery of group
differences in latent parameter space under the two commonly used generative
modelling assumptions: (1) modelling groups under a common shared group-level
prior (assuming all participants are generated from a common distribution, and
are likely to share common characteristics); (2) modelling separate groups
based on symptomatology or diagnostic labels, resulting in separate group-level
priors. We evaluated the robustness of these approaches to variations in data
quality and prior specifications on a variety of metrics. We found that fitting
groups separately (assumption 2) provided the most accurate and robust
inference across all conditions. Our results suggest that when dealing with
data from multiple clinical groups, researchers should analyse patient and
control groups separately as it provides the most accurate and robust recovery
of the parameters of interest.
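
The contrast between the two assumptions can be illustrated with a deliberately simplified sketch. It does not reproduce the paper's simulations: instead of fitting a reinforcement learning model to bandit-task choices, it assumes each subject already has a noisy Gaussian estimate of a single latent parameter, assumes the estimation noise variance is known, and uses a moment-based (empirical Bayes) group-level prior. All names and numbers (`shrink`, `fit_prior`, the group means, the noise level) are hypothetical choices made for illustration. Under these assumptions, a shared prior pulls both groups toward the grand mean and attenuates the recovered group difference, whereas separate priors shrink each subject toward their own group mean.

```python
import numpy as np


def shrink(estimates, noise_var, prior_mean, prior_var):
    """Posterior means under a normal-normal model with known variances."""
    w = prior_var / (prior_var + noise_var)  # weight given to the subject's own data
    return w * estimates + (1.0 - w) * prior_mean


def fit_prior(estimates, noise_var):
    """Moment-based estimate of a group-level mean and variance (illustrative only)."""
    mean = estimates.mean()
    # Observed variance = true between-subject variance + estimation noise.
    var = max(estimates.var(ddof=1) - noise_var, 1e-6)
    return mean, var


rng = np.random.default_rng(0)
n_per_group = 50
noise_var = 1.0  # assumed known per-subject estimation noise (hypothetical value)

# Hypothetical latent parameters: the two groups differ by 1.0 on average.
true_controls = rng.normal(-0.5, 0.5, n_per_group)
true_patients = rng.normal(0.5, 0.5, n_per_group)

# Noisy per-subject estimates, standing in for individual model fits.
obs_controls = true_controls + rng.normal(0.0, np.sqrt(noise_var), n_per_group)
obs_patients = true_patients + rng.normal(0.0, np.sqrt(noise_var), n_per_group)

# Assumption (1): one shared group-level prior for all participants.
pooled = np.concatenate([obs_controls, obs_patients])
shared_mean, shared_var = fit_prior(pooled, noise_var)
shared_controls = shrink(obs_controls, noise_var, shared_mean, shared_var)
shared_patients = shrink(obs_patients, noise_var, shared_mean, shared_var)

# Assumption (2): a separate group-level prior per group.
c_mean, c_var = fit_prior(obs_controls, noise_var)
p_mean, p_var = fit_prior(obs_patients, noise_var)
sep_controls = shrink(obs_controls, noise_var, c_mean, c_var)
sep_patients = shrink(obs_patients, noise_var, p_mean, p_var)

raw_diff = obs_patients.mean() - obs_controls.mean()
shared_diff = shared_patients.mean() - shared_controls.mean()
separate_diff = sep_patients.mean() - sep_controls.mean()

print(f"raw difference:            {raw_diff:.3f}")
print(f"shared-prior difference:   {shared_diff:.3f}")
print(f"separate-prior difference: {separate_diff:.3f}")
```

In this toy setting the separate-prior difference equals the raw difference in group means, while the shared-prior difference is scaled down by the shrinkage weight. This shows only one mechanism by which a common prior could mask a true group difference; it says nothing about the other data-quality and prior-specification conditions the paper evaluates.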
Related papers
- Model-based Clustering of Individuals' Ecological Momentary Assessment
Time-series Data for Improving Forecasting Performance [5.312303275762104]
Additional information from similar individuals is believed likely to enhance these models, leading to better descriptions of individuals.
Two model-based clustering approaches are examined, the first of which uses parameters extracted from personalized models.
The superiority of clustering-based methods is confirmed, indicating that the use of group-based information could effectively enhance overall performance on all individuals' data.
arXiv Detail & Related papers (2023-10-11T13:39:04Z)
- The Role of Subgroup Separability in Group-Fair Medical Image
Classification [18.29079361470428]
We find a relationship between subgroup separability, subgroup disparities, and performance degradation when models are trained on data with systematic bias such as underdiagnosis.
Our findings shed new light on the question of how models become biased, providing important insights for the development of fair medical imaging AI.
arXiv Detail & Related papers (2023-07-06T06:06:47Z)
- Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome
Homogenization? [90.35044668396591]
A recurring theme in machine learning is algorithmic monoculture: the same systems, or systems that share components, are deployed by multiple decision-makers.
We propose the component-sharing hypothesis: if decision-makers share components like training data or specific models, then they will produce more homogeneous outcomes.
We test this hypothesis on algorithmic fairness benchmarks, demonstrating that sharing training data reliably exacerbates homogenization.
We conclude with philosophical analyses of and societal challenges for outcome homogenization, with an eye towards implications for deployed machine learning systems.
arXiv Detail & Related papers (2022-11-25T09:33:11Z)
- Composite Feature Selection using Deep Ensembles [130.72015919510605]
We investigate the problem of discovering groups of predictive features without predefined grouping.
We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups.
We propose a new metric to measure similarity between discovered groups and the ground truth.
arXiv Detail & Related papers (2022-11-01T17:49:40Z)
- Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular
data [81.43750358586072]
We propose Data-IQ, a framework to systematically stratify examples into subgroups with respect to their outcomes.
We experimentally demonstrate the benefits of Data-IQ on four real-world medical datasets.
arXiv Detail & Related papers (2022-10-24T08:57:55Z)
- Data-driven Model Generalizability in Crosslinguistic Low-resource
Morphological Segmentation [4.339613097080119]
In low-resource scenarios, artifacts of the data collection can yield data sets that are outliers, potentially making conclusions about model performance coincidental.
We compare three broad classes of models with different parameterizations, taking data from 11 languages across 6 language families.
The results demonstrate that the extent of model generalization depends on the characteristics of the data set, and does not necessarily rely heavily on the data set size.
arXiv Detail & Related papers (2022-01-05T22:19:10Z)
- Spectral Clustering with Variance Information for Group Structure
Estimation in Panel Data [7.712669451925186]
We first conduct a local analysis which reveals that the variances of the individual coefficient estimates contain useful information for the estimation of group structure.
We then propose a method to estimate unobserved groupings for general panel data models that explicitly account for the variance information.
arXiv Detail & Related papers (2022-01-05T19:16:16Z)
- Selecting the suitable resampling strategy for imbalanced data
classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation [0.8466401378239363]
We propose a generalized affine conditioning framework to learn and account for cohort biases across multi-source datasets.
We show that our cohort bias adaptation method improves performance of the network on pooled datasets.
arXiv Detail & Related papers (2021-08-02T08:32:57Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on
Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from
Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling
Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.