A Penalty Approach for Normalizing Feature Distributions to Build
Confounder-Free Models
- URL: http://arxiv.org/abs/2207.04607v1
- Date: Mon, 11 Jul 2022 04:02:12 GMT
- Title: A Penalty Approach for Normalizing Feature Distributions to Build
Confounder-Free Models
- Authors: Anthony Vento and Qingyu Zhao and Robert Paul and Kilian M. Pohl and
Ehsan Adeli
- Abstract summary: MetaData Normalization (MDN) estimates the linear relationship between the metadata and each feature based on a non-trainable closed-form solution.
We extend the MDN method by applying a Penalty approach (referred to as PMDN).
We show improved model accuracy and greater independence from confounders using PMDN over MDN in a synthetic experiment and on a multi-label, multi-site dataset of magnetic resonance images (MRIs).
- Score: 11.818509522227565
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Translating machine learning algorithms into clinical applications requires
addressing challenges related to interpretability, such as accounting for the
effect of confounding variables (or metadata). Confounding variables affect the
relationship between input training data and target outputs. When we train a
model on such data, confounding variables will bias the distribution of the
learned features. A recent promising solution, MetaData Normalization (MDN),
estimates the linear relationship between the metadata and each feature based
on a non-trainable closed-form solution. However, this estimation is confined
by the sample size of a mini-batch and thereby may cause the approach to be
unstable during training. In this paper, we extend the MDN method by applying a
Penalty approach (referred to as PMDN). We cast the problem into a bi-level
nested optimization problem. We then approximate this optimization problem
using a penalty method so that the linear parameters within the MDN layer are
trainable and learned on all samples. This enables PMDN to be plugged into any
architecture, even those unable to run batch-level operations, such as
transformers and recurrent models. We show improvement in model accuracy and
greater independence from confounders using PMDN over MDN in a synthetic
experiment and a multi-label, multi-site dataset of magnetic resonance images
(MRIs).
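As a rough sketch of the idea described above (not the authors' released implementation), the MDN layer's linear parameters can be made trainable and tied to the least-squares fit through a penalty term added to the task loss. The class name `PMDNLayer` and the penalty weight `mu` mentioned below are hypothetical:

```python
import torch
import torch.nn as nn

class PMDNLayer(nn.Module):
    """Minimal sketch of a penalty-based metadata normalization layer.

    Given features f (batch x d) and metadata M (batch x m), the layer
    outputs the residual f - M @ beta. Unlike MDN, beta is a trainable
    parameter rather than a per-batch closed-form solution, so it is
    effectively learned over all samples.
    """

    def __init__(self, num_metadata: int, num_features: int):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(num_metadata, num_features))

    def forward(self, f: torch.Tensor, M: torch.Tensor):
        residual = f - M @ self.beta
        # Penalty encouraging beta to stay close to the least-squares
        # fit of f on M; detaching f keeps the penalty from reshaping
        # the upstream features (a design choice in this sketch).
        penalty = ((f.detach() - M @ self.beta) ** 2).mean()
        return residual, penalty
```

Training would then minimize `task_loss + mu * penalty` for some penalty weight `mu`, which is one plausible reading of the penalty approximation to the bi-level problem, not the paper's exact formulation.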
Related papers
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
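As a generic illustration of the shared-backbone/multiple-heads design (all names and the averaging rule below are assumptions, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class SharedBackboneEnsemble(nn.Module):
    """Sketch: one shared backbone feeding several prediction heads.

    Head outputs are ensembled here by simple averaging; MEMTL's
    actual combination rule may differ.
    """

    def __init__(self, in_dim: int, hidden: int, out_dim: int, num_heads: int = 3):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            nn.Linear(hidden, out_dim) for _ in range(num_heads)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.backbone(x)
        return torch.stack([head(z) for head in self.heads]).mean(dim=0)
```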
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN).
CMMN filters the signals to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
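A crude frequency-domain sketch of that idea follows (assuming SciPy; the function name and defaults are hypothetical, and `target_psd` simply stands in for the Wasserstein barycenter of the training PSDs):

```python
import numpy as np
from scipy.signal import welch

def spectral_map_to_target(x, target_psd, fs=100.0, nperseg=256):
    """Filter signal x so its PSD approximately matches target_psd.

    Builds a per-frequency gain sqrt(target / source) and applies it
    via FFT. This is a simplified reading of the CMMN idea, not the
    paper's estimator.
    """
    freqs, psd = welch(x, fs=fs, nperseg=nperseg)
    gain = np.sqrt(target_psd / np.maximum(psd, 1e-12))
    X = np.fft.rfft(x)
    f_bins = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Interpolate the per-bin gain onto the FFT frequency grid.
    g = np.interp(f_bins, freqs, gain)
    return np.fft.irfft(X * g, n=len(x))
```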
arXiv Detail & Related papers (2023-05-30T08:24:01Z) - Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion
Models [54.1843419649895]
We propose a solution based on denoising diffusion probabilistic models (DDPMs).
Our motivation for choosing diffusion models over other generative models comes from the flexible internal structure of diffusion models.
Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task.
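One simple reading of "uniting" the models, sketched below, is a weighted fusion of the noise estimates from each sub-task model at every reverse-diffusion step; the function and weighting are assumptions, not the paper's exact rule:

```python
import torch

def fused_noise_estimate(models, x_t, t, weights=None):
    """Sketch: combine per-model DDPM noise predictions at one step.

    Each element of `models` maps (x_t, t) to a predicted noise tensor;
    predictions are fused by a weighted sum. The paper's combination
    rule for multi-modal conditioning may be more involved.
    """
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    preds = [w * m(x_t, t) for w, m in zip(weights, models)]
    return torch.stack(preds).sum(dim=0)
```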
arXiv Detail & Related papers (2022-12-01T18:59:55Z) - ScoreMix: A Scalable Augmentation Strategy for Training GANs with
Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z) - Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks [29.975937981538664]
We introduce a novel approach that allows end-users to exploit pretrained DNN models in their own testing environment without modifying the models.
We present a meta input, an additional input that transforms the distribution of the testing data to align with that of the training data.
As a result, end-users can exploit well-trained models in their own testing environment which can differ from the training environment.
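Roughly, such a meta input can be pictured as a single learnable tensor added to every test sample while the network stays frozen. The sketch below uses a placeholder objective `align_loss`, since the paper's alignment criterion is not reproduced here:

```python
import torch

def learn_meta_input(model, test_loader, align_loss, steps=10, lr=1e-2):
    """Sketch: optimize one additive 'meta input' for a frozen model.

    `test_loader` is assumed to yield unlabeled batches x, and
    `align_loss(outputs)` is a placeholder for whatever criterion
    measures alignment with the training distribution; only the meta
    input is updated.
    """
    model.eval()
    x0 = next(iter(test_loader))
    meta = torch.zeros_like(x0[0:1], requires_grad=True)
    opt = torch.optim.Adam([meta], lr=lr)
    for _ in range(steps):
        for x in test_loader:
            opt.zero_grad()
            loss = align_loss(model(x + meta))
            loss.backward()
            opt.step()
    return meta.detach()
```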
arXiv Detail & Related papers (2022-10-21T02:11:38Z) - Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease
Classification with Incomplete Data [8.536869574065195]
Multi-Modal Mixing Transformer (3MAT) is a disease classification transformer that not only leverages multi-modal data but also handles missing data scenarios.
We propose a novel modality dropout mechanism to ensure an unprecedented level of modality independence and robustness to handle missing data scenarios.
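In spirit, modality dropout randomly blanks out entire modalities during training so the model learns to cope with missing inputs. A minimal version (the names and the zero-filling choice are assumptions, not 3MAT's exact mechanism) could look like this:

```python
import random
import torch

def modality_dropout(modalities, p=0.3):
    """Sketch: randomly zero whole modalities during training.

    `modalities` is a list of batch-first tensors, one per modality.
    At least one modality is always kept.
    """
    out = [m if random.random() > p else torch.zeros_like(m) for m in modalities]
    if all(o is not m for o, m in zip(out, modalities)):
        keep = random.randrange(len(modalities))
        out[keep] = modalities[keep]
    return out
```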
arXiv Detail & Related papers (2022-10-01T11:31:02Z) - Mixing Deep Learning and Multiple Criteria Optimization: An Application
to Distributed Learning with Multiple Datasets [0.0]
The training phase is the most important stage of the machine learning process.
We develop a multiple criteria optimization model in which each criterion measures the distance between the output associated with a specific input and its label.
We propose a scalarization approach to implement this model and present numerical experiments on digit classification using the MNIST dataset.
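The scalarization step can be pictured as collapsing the per-dataset criteria into one weighted objective; a minimal sketch, where the fixed-weight scheme is an assumption:

```python
def scalarized_loss(model, batches, loss_fn, weights):
    """Sketch: weighted-sum scalarization of multiple criteria.

    Each criterion is the loss on a batch from one dataset; fixed
    weights reduce the multi-criteria problem to a single objective
    that standard optimizers can minimize.
    """
    return sum(w * loss_fn(model(x), y) for w, (x, y) in zip(weights, batches))
```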
arXiv Detail & Related papers (2021-12-02T16:00:44Z) - GOALS: Gradient-Only Approximations for Line Searches Towards Robust and
Consistent Training of Deep Neural Networks [0.0]
Mini-batch sub-sampling (MBSS) is favored in deep neural network training to reduce the computational cost.
We propose a gradient-only approximation line search (GOALS) with strong convergence characteristics and a defined optimality criterion.
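The gradient-only idea can be sketched as bracketing the step size where the directional derivative changes sign, rather than comparing noisy function values. The routine below is a simplified bisection sketch, not the paper's algorithm:

```python
def gradient_only_line_search(dir_deriv, step=1.0, growth=2.0, max_iters=20):
    """Sketch of a gradient-only line search via sign changes.

    `dir_deriv(alpha)` returns the directional derivative of the loss
    along the search direction at step size alpha. The search grows an
    interval until the derivative turns non-negative, then bisects on
    its sign instead of on (mini-batch noisy) function values.
    """
    lo, hi = 0.0, step
    # Grow the interval until the derivative flips sign.
    for _ in range(max_iters):
        if dir_deriv(hi) >= 0.0:
            break
        lo, hi = hi, hi * growth
    # Bisect on the sign of the directional derivative.
    for _ in range(max_iters):
        mid = 0.5 * (lo + hi)
        if dir_deriv(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```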
arXiv Detail & Related papers (2021-05-23T11:21:01Z) - Metadata Normalization [54.43363251520749]
Batch Normalization (BN) normalizes feature distributions by standardizing with batch statistics.
BN does not correct the influence on features from extraneous variables or multiple distributions.
We introduce the Metadata Normalization layer, a new batch-level operation which can be used end-to-end within the training framework.
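For contrast with PMDN above, the batch-level closed-form step behind MDN can be sketched as an ordinary least-squares residualization on each mini-batch; this is a simplified reading, and the actual layer may construct the metadata design matrix differently:

```python
import torch

def mdn_batch_residual(f: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
    """Sketch: per-batch closed-form residualization behind MDN.

    Solves the least-squares fit f ~ M @ beta on the current mini-batch
    and returns the residual, i.e. the features with the linear
    metadata contribution removed. With small batches this estimate of
    beta is noisy, which is the instability PMDN addresses.
    """
    beta = torch.linalg.lstsq(M, f).solution
    return f - M @ beta
```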
arXiv Detail & Related papers (2021-04-19T05:10:26Z)