Synthetic Power Analyses: Empirical Evaluation and Application to
Cognitive Neuroimaging
- URL: http://arxiv.org/abs/2210.05835v1
- Date: Tue, 11 Oct 2022 23:33:32 GMT
- Authors: Peiye Zhuang, Bliss Chapman, Ran Li, Oluwasanmi Koyejo
- Abstract summary: We propose a framework for estimating statistical power at various sample sizes.
We empirically explore the performance of synthetic power analysis for sample size selection in cognitive neuroscience experiments.
Our empirical results suggest that synthetic power analysis could be a low-cost alternative to pilot data collection.
- Score: 14.57108653193695
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the experimental sciences, statistical power analyses are often used
before data collection to determine the required sample size. However,
traditional power analyses can be costly when data are difficult or expensive
to collect. We propose synthetic power analyses, a framework for estimating
statistical power at various sample sizes, and empirically explore the
performance of synthetic power analysis for sample size selection in cognitive
neuroscience experiments. To this end, brain imaging data is synthesized using
an implicit generative model conditioned on observed cognitive processes.
Further, we propose a simple procedure for modifying the statistical tests so
that they yield conservative statistics. Our empirical results suggest that synthetic
power analysis could be a low-cost alternative to pilot data collection when
the proposed experiments share cognitive processes with previously conducted
experiments.
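The general idea of estimating power at several sample sizes from synthesized data can be illustrated with a minimal Monte Carlo sketch. This is not the authors' implementation: the function names are hypothetical, the Gaussian sampler `synthetic_sample` is only a stand-in for the paper's implicit generative model, and a plain one-sample z-test with assumed unit variance replaces whatever tests and conservative modification the paper actually uses.

```python
import random
from statistics import NormalDist, mean


def synthetic_sample(n, effect_size, rng):
    """Hypothetical stand-in for a generative model: n draws from N(effect_size, 1)."""
    return [rng.gauss(effect_size, 1.0) for _ in range(n)]


def z_test_rejects(sample, alpha=0.05):
    """Two-sided one-sample z-test of H0: mean = 0, assuming unit variance."""
    z = mean(sample) * len(sample) ** 0.5
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def estimate_power(n, effect_size, n_sims=2000, alpha=0.05, seed=0):
    """Fraction of simulated experiments of size n in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        z_test_rejects(synthetic_sample(n, effect_size, rng), alpha)
        for _ in range(n_sims)
    )
    return rejections / n_sims


def smallest_sufficient_n(sizes, effect_size, target=0.8, **kwargs):
    """Smallest candidate sample size whose estimated power reaches the target."""
    for n in sorted(sizes):
        if estimate_power(n, effect_size, **kwargs) >= target:
            return n
    return None  # no candidate size reached the target power


if __name__ == "__main__":
    # Assumed effect size of 0.5 standard deviations, chosen only for illustration.
    for n in (10, 20, 40, 80):
        print(n, estimate_power(n, effect_size=0.5))
```

In practice the sampler would be replaced by draws from a model fitted to previously collected data, and the estimated power is only as trustworthy as that model.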
Related papers
- Statistical Test for Generated Hypotheses by Diffusion Models [21.378672594642616]
We consider a medical diagnostic task using generated images by diffusion models, and propose a statistical test to quantify its reliability.
Using the proposed method, the statistical reliability of medical image diagnostic results can be quantified in the form of a p-value, allowing for decision-making with a controlled error rate.
arXiv Detail & Related papers (2024-02-19T02:32:45Z)
- The Real Deal Behind the Artificial Appeal: Inferential Utility of Tabular Synthetic Data [40.165159490379146]
We show that the rate of false-positive findings (type 1 error) will be unacceptably high, even when the estimates are unbiased.
Despite the use of a previously proposed correction factor, this problem persists for deep generative models.
arXiv Detail & Related papers (2023-12-13T02:04:41Z)
- Synthetic data generation for a longitudinal cohort study -- Evaluation, method extension and reproduction of published data analysis results [0.32593385688760446]
In the health sector, access to individual-level data is often challenging due to privacy concerns.
A promising alternative is the generation of fully synthetic data.
In this study, we use a state-of-the-art synthetic data generation method.
arXiv Detail & Related papers (2023-05-12T13:13:55Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Online simulator-based experimental design for cognitive model selection [74.76661199843284]
We propose BOSMOS: an approach to experimental design that can select between computational models without tractable likelihoods.
In simulated experiments, we demonstrate that the proposed BOSMOS technique can accurately select models in up to 2 orders of magnitude less time than existing LFI alternatives.
arXiv Detail & Related papers (2023-03-03T21:41:01Z)
- Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning [4.633997895806144]
We consider quantitative analyses of spectral data using laser-induced breakdown spectroscopy.
We address the small size of training data available, and the validation of the predictions during inference on unknown data.
arXiv Detail & Related papers (2022-10-07T18:00:09Z)
- Investigating Bias with a Synthetic Data Generator: Empirical Evidence and Philosophical Interpretation [66.64736150040093]
Machine learning applications are becoming increasingly pervasive in our society.
The risk is that they will systematically spread the bias embedded in data.
We propose to analyze biases by introducing a framework for generating synthetic data with specific types of bias and their combinations.
arXiv Detail & Related papers (2022-09-13T11:18:50Z)
- BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot Detection [63.447493500066045]
This work proposes a data driven learning model for the synthesis of keystroke biometric data.
The proposed method is compared with two statistical approaches based on Universal and User-dependent models.
Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects.
arXiv Detail & Related papers (2022-07-27T09:26:15Z)
- A Kernelised Stein Statistic for Assessing Implicit Generative Models [10.616967871198689]
We propose a principled procedure to assess the quality of a synthetic data generator.
The sample size from the synthetic data generator can be as large as desired, while the size of the observed data, which the generator aims to emulate, is fixed.
arXiv Detail & Related papers (2022-05-31T23:40:21Z)
- With Little Power Comes Great Responsibility [54.96675741328462]
Underpowered experiments make it more difficult to discern the difference between statistical noise and meaningful model improvements.
Small test sets mean that most attempted comparisons to state of the art models will not be adequately powered.
For machine translation, we find that typical test sets of 2000 sentences have approximately 75% power to detect differences of 1 BLEU point.
arXiv Detail & Related papers (2020-10-13T18:00:02Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)