Assessing Robustness of EEG Representations under Data-shifts via Latent
Space and Uncertainty Analysis
- URL: http://arxiv.org/abs/2209.11233v1
- Date: Thu, 22 Sep 2022 19:26:09 GMT
- Title: Assessing Robustness of EEG Representations under Data-shifts via Latent
Space and Uncertainty Analysis
- Authors: Neeraj Wagh, Jionghao Wei, Samarth Rawal, Brent M. Berry, Yogatheesan
Varatharajah
- Abstract summary: We develop diagnostic measures to detect potential pitfalls during deployment without assuming access to external data.
Specifically, we focus on modeling realistic data shifts in electrophysiological signals (EEGs) via data transforms.
We conduct experiments on multiple EEG feature encoders and two clinically relevant downstream tasks using publicly available large-scale clinical EEGs.
- Score: 0.29998889086656577
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The recent availability of large datasets in bio-medicine has inspired the
development of representation learning methods for multiple healthcare
applications. Despite advances in predictive performance, the clinical utility
of such methods is limited when exposed to real-world data. Here we develop
model diagnostic measures to detect potential pitfalls during deployment
without assuming access to external data. Specifically, we focus on modeling
realistic data shifts in electrophysiological signals (EEGs) via data
transforms, and extend the conventional task-based evaluations with analyses of
a) the model's latent space and b) predictive uncertainty under these transforms.
We conduct experiments on multiple EEG feature encoders and two clinically
relevant downstream tasks using publicly available large-scale clinical EEGs.
Within this experimental setting, our results suggest that measures of latent
space integrity and model uncertainty under the proposed data shifts may help
anticipate performance degradation during deployment.
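As a rough illustration of the idea in the abstract, the following minimal sketch applies a synthetic data shift to EEG-like arrays and tracks two diagnostics: how far the embeddings move and how uncertain the predictions become. Everything here is an assumption made for illustration, not the authors' method: the shift (amplitude gain plus Gaussian noise), the toy random-projection encoder, the random linear head, and the two metrics (mean cosine similarity and mean predictive entropy) are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def shift(x, gain=1.0, noise_std=0.0, rng=rng):
    """Hypothetical data shift: amplitude scaling plus additive Gaussian noise."""
    return gain * x + noise_std * rng.standard_normal(x.shape)

def encode(x, w):
    """Toy encoder (assumption): flatten each recording, project, squash with tanh."""
    return np.tanh(x.reshape(len(x), -1) @ w)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_cosine(a, b):
    """Mean cosine similarity between paired rows; a proxy for latent-space drift."""
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / den))

def mean_entropy(p):
    """Mean Shannon entropy of class probabilities; a proxy for predictive uncertainty."""
    return float(np.mean(-(p * np.log(p + 1e-12)).sum(axis=1)))

# Synthetic stand-in for EEG windows: (recordings, channels, samples).
x = rng.standard_normal((64, 4, 128))
w_enc = rng.standard_normal((4 * 128, 16)) / np.sqrt(4 * 128)  # fixed random encoder weights
w_head = rng.standard_normal((16, 2))                          # fixed random two-class head

z_ref = encode(x, w_enc)  # embeddings of the unshifted data

def diagnostics(noise_std):
    """Return (latent similarity to the unshifted embeddings, mean predictive entropy)."""
    z = encode(shift(x, noise_std=noise_std), w_enc)
    return mean_cosine(z_ref, z), mean_entropy(softmax(z @ w_head))

for noise_std in (0.0, 0.5, 2.0):
    sim, ent = diagnostics(noise_std)
    print(f"noise_std={noise_std:.1f}  latent cosine={sim:.3f}  entropy={ent:.3f}")
```

With a trained encoder and real recordings in place of these stand-ins, the same loop could be run over a grid of shift strengths to see whether latent similarity drops or entropy rises before task accuracy does.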
Related papers
- On the challenges of detecting MCI using EEG in the wild [6.505818939553856]
Recent studies have shown promising results in the detection of Mild Cognitive Impairment (MCI) using Electroencephalogram (EEG) data.
We investigate the potential limitations and challenges in developing a robust MCI detection method using two contrasting datasets.
arXiv Detail & Related papers (2025-01-15T15:20:11Z)
- Representation Learning of Lab Values via Masked AutoEncoder [2.785172582119726]
We propose Lab-MAE, a transformer-based masked autoencoder framework for imputation of sequential lab values.
Empirical evaluation on the MIMIC-IV dataset demonstrates that Lab-MAE significantly outperforms the state-of-the-art baselines.
Lab-MAE achieves equitable performance across demographic groups of patients, advancing fairness in clinical predictions.
arXiv Detail & Related papers (2025-01-05T20:26:49Z)
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- Domain-invariant Clinical Representation Learning by Bridging Data Distribution Shift across EMR Datasets [28.59271580918754]
An effective prognostic model could assist physicians in making accurate diagnoses and designing personalized treatment plans.
Limited data collection, insufficient clinical experience, and privacy and ethical concerns restrict data availability.
We present a domain-invariant representation learning method that constructs a transition model between source and target datasets.
arXiv Detail & Related papers (2023-10-11T18:32:21Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Data augmentation for learning predictive models on EEG: a systematic comparison [79.84079335042456]
The use of deep learning for electroencephalography (EEG) classification tasks has grown rapidly in recent years.
Deep learning for EEG classification tasks has been limited by the relatively small size of EEG datasets.
Data augmentation has been a key ingredient to obtain state-of-the-art performances across applications such as computer vision or speech.
arXiv Detail & Related papers (2022-06-29T09:18:15Z)
- A Novel TSK Fuzzy System Incorporating Multi-view Collaborative Transfer Learning for Personalized Epileptic EEG Detection [20.11589208667256]
We propose a TSK fuzzy system-based epilepsy detection algorithm that integrates multi-view collaborative transfer learning.
The proposed method has the potential to detect epileptic EEG signals effectively.
arXiv Detail & Related papers (2021-11-11T12:15:55Z)
- Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on the application of elastic principal graphs, which can simultaneously address the tasks of dimensionality reduction, data visualization, clustering, feature selection, and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.