How You Split Matters: Data Leakage and Subject Characteristics Studies
in Longitudinal Brain MRI Analysis
- URL: http://arxiv.org/abs/2309.00350v1
- Date: Fri, 1 Sep 2023 09:15:06 GMT
- Title: How You Split Matters: Data Leakage and Subject Characteristics Studies
in Longitudinal Brain MRI Analysis
- Authors: Dewinda Julianensi Rumala
- Abstract summary: Deep learning models have revolutionized the field of medical image analysis, offering significant promise for improved diagnostics and patient care.
However, their performance can be misleadingly optimistic due to a hidden pitfall called 'data leakage'.
In this study, we investigate data leakage in 3D medical imaging, specifically using 3D Convolutional Neural Networks (CNNs) for brain MRI analysis.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models have revolutionized the field of medical image analysis,
offering significant promise for improved diagnostics and patient care.
However, their performance can be misleadingly optimistic due to a hidden
pitfall called 'data leakage'. In this study, we investigate data leakage in 3D
medical imaging, specifically using 3D Convolutional Neural Networks (CNNs) for
brain MRI analysis. While 3D CNNs appear less prone to leakage than 2D
counterparts, improper data splitting during cross-validation (CV) can still
pose issues, especially with longitudinal imaging data containing repeated
scans from the same subject. We explore the impact of different data splitting
strategies on model performance for longitudinal brain MRI analysis and
identify potential data leakage concerns. GradCAM visualization helps reveal
shortcuts in CNN models caused by identity confounding, where the model learns
to identify subjects along with diagnostic features. Our findings, consistent
with prior research, underscore the importance of subject-wise splitting and
evaluating our model further on hold-out data from different subjects to ensure
the integrity and reliability of deep learning models in medical image
analysis.
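The subject-wise splitting that the abstract recommends can be illustrated with a minimal sketch. This is not the authors' code: the function name `subject_wise_folds`, the toy subject IDs, and the round-robin fold assignment are all assumptions made for illustration. The idea is to shuffle and partition subjects rather than individual scans, so repeated scans of one person never appear in both the training and the test set.

```python
import random
from collections import defaultdict


def subject_wise_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject spans both sets.

    subject_ids[i] is the subject that scan i belongs to (hypothetical input).
    """
    # Group scan indices by subject so repeated scans stay together.
    scans_by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        scans_by_subject[sid].append(idx)

    # Shuffle subjects (not scans), then deal them round-robin into folds.
    subjects = sorted(scans_by_subject)
    random.Random(seed).shuffle(subjects)
    folds = [subjects[k::n_folds] for k in range(n_folds)]

    for test_subjects in folds:
        held_out = set(test_subjects)
        test_idx = [i for s in test_subjects for i in scans_by_subject[s]]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
        yield train_idx, test_idx


# Toy longitudinal dataset: 6 subjects, 3 scans each (made-up IDs).
subject_ids = [f"sub-{s:02d}" for s in range(6) for _ in range(3)]
splits = list(subject_wise_folds(subject_ids, n_folds=3))
for train_idx, test_idx in splits:
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    # An empty intersection means no subject identity is shared across sets.
    assert train_subjects.isdisjoint(test_subjects)
```

A record-wise split would instead shuffle the scan indices directly, which is what allows the same subject to land on both sides of a fold. scikit-learn's `GroupKFold` provides the same grouping guarantee if that library is available.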
Related papers
- Probabilistic 3D Correspondence Prediction from Sparse Unsegmented Images [1.2179682412409507]
We propose SPI-CorrNet, a unified model that predicts 3D correspondences from sparse imaging data.
Experiments on the LGE MRI left atrium dataset and Abdomen CT-1K liver datasets demonstrate that our technique enhances the accuracy and robustness of sparse image-driven SSM.
arXiv Detail & Related papers (2024-07-02T03:56:20Z)
- AI-based association analysis for medical imaging using latent-space
geometric confounder correction [6.488049546344972]
We introduce an AI method emphasizing semantic feature interpretation and resilience against multiple confounders.
Our approach's merits are tested in three scenarios, including extracting confounder-free features from a 2D synthetic dataset and examining the association between prenatal alcohol exposure and children's facial shapes using 3D mesh data.
Results confirm our method effectively reduces confounder influences, establishing less confounded associations.
arXiv Detail & Related papers (2023-10-03T16:09:07Z)
- Source-Free Collaborative Domain Adaptation via Multi-Perspective
Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
- FAST-AID Brain: Fast and Accurate Segmentation Tool using Artificial
Intelligence Developed for Brain [0.8376091455761259]
A novel deep learning method is proposed for fast and accurate segmentation of the human brain into 132 regions.
The proposed model uses an efficient U-Net-like network and benefits from the intersection points of different views and hierarchical relations.
The proposed method can be applied to brain MRI data including skull or any other artifacts without preprocessing the images or a drop in performance.
arXiv Detail & Related papers (2022-08-30T16:06:07Z)
- Feature robustness and sex differences in medical imaging: a case study
in MRI-based Alzheimer's disease detection [1.7616042687330637]
We compare two classification schemes on the ADNI MRI dataset.
We do not find a strong dependence of model performance for male and female test subjects on the sex composition of the training dataset.
arXiv Detail & Related papers (2022-04-04T17:37:54Z)
- Data and Physics Driven Learning Models for Fast MRI -- Fundamentals and
Methodologies from CNN, GAN to Attention and Transformers [72.047680167969]
This article introduces deep-learning-based, data-driven techniques for fast MRI, including methods based on convolutional neural networks and generative adversarial networks.
We detail the research on coupling physics-driven and data-driven models for MRI acceleration.
Finally, we demonstrate these techniques through a few clinical applications and explain the importance of data harmonisation and explainable models for such fast MRI techniques in multicentre and multi-scanner studies.
arXiv Detail & Related papers (2022-04-01T22:48:08Z)
- 3-Dimensional Deep Learning with Spatial Erasing for Unsupervised
Anomaly Segmentation in Brain MRI [55.97060983868787]
We investigate whether the increased spatial context of MRI volumes, combined with spatial erasing, leads to improved unsupervised anomaly segmentation performance.
We compare a 2D variational autoencoder (VAE) to its 3D counterpart, propose 3D input erasing, and systematically study the impact of the data set size on the performance.
Our best performing 3D VAE with input erasing leads to an average DICE score of 31.40% compared to 25.76% for the 2D VAE.
arXiv Detail & Related papers (2021-09-14T09:17:27Z)
- Automated Model Design and Benchmarking of 3D Deep Learning Models for
COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for 3D DL models for 3D chest CT scan classification.
We also exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results.
arXiv Detail & Related papers (2021-01-14T03:45:01Z)
- Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build a domain-irrelevant latent-space image representation and demonstrate that this method outperforms existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)
- Brain Tumor Segmentation using 3D-CNNs with Uncertainty Estimation [0.0]
This work proposes a 3D encoder-decoder architecture based on V-Net, which is trained with patching techniques to reduce memory consumption.
Uncertainty maps can provide extra information to expert neurologists, useful for detecting when the model is not confident in the provided segmentation.
arXiv Detail & Related papers (2020-09-24T10:50:12Z)
- Interpretation of 3D CNNs for Brain MRI Data Classification [56.895060189929055]
We extend the previous findings in gender differences from diffusion-tensor imaging on T1 brain MRI scans.
We provide the voxel-wise 3D CNN interpretation comparing the results of three interpretation methods.
arXiv Detail & Related papers (2020-06-20T17:56:46Z)