RegBN: Batch Normalization of Multimodal Data with Regularization
- URL: http://arxiv.org/abs/2310.00641v2
- Date: Sun, 19 Nov 2023 18:02:21 GMT
- Title: RegBN: Batch Normalization of Multimodal Data with Regularization
- Authors: Morteza Ghahremani and Christian Wachinger
- Abstract summary: This paper introduces a novel approach for the normalization of multimodal data, called RegBN.
RegBN uses the Frobenius norm as a regularizer term to address the side effects of confounders and underlying dependencies among different data sources.
We validate the effectiveness of RegBN on eight databases from five research areas.
- Score: 5.293979881130494
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent years have witnessed a surge of interest in integrating
high-dimensional data captured by multisource sensors, driven by the impressive
success of neural networks in the integration of multimodal data. However, the
integration of heterogeneous multimodal data poses a significant challenge, as
confounding effects and dependencies among such heterogeneous data sources
introduce unwanted variability and bias, leading to suboptimal performance of
multimodal models. Therefore, it becomes crucial to normalize the low- or
high-level features extracted from data modalities before their fusion takes
place. This paper introduces a novel approach for the normalization of
multimodal data, called RegBN, that incorporates regularization. RegBN uses the
Frobenius norm as a regularizer term to address the side effects of confounders
and underlying dependencies among different data sources. The proposed method
generalizes well across multiple modalities and eliminates the need for
learnable parameters, simplifying training and inference. We validate the
effectiveness of RegBN on eight databases from five research areas,
encompassing diverse modalities such as language, audio, image, video, depth,
tabular, and 3D MRI. The proposed method demonstrates broad applicability
across different architectures such as multilayer perceptrons, convolutional
neural networks, and vision transformers, enabling effective normalization of
both low- and high-level features in multimodal neural networks. RegBN is
available at https://github.com/mogvision/regbn.
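The abstract describes RegBN as a parameter-free normalization step that uses a Frobenius-norm regularizer to suppress confounding dependencies between modality features before fusion. As a rough illustration of that idea (a sketch only, not the authors' method; the repository above is the authoritative reference), the snippet below residualizes one modality's features against another via ridge-regularized (Frobenius-norm-penalized) least squares. The function name regbn_like_normalize and the regularization weight lam are illustrative assumptions.
```python
import torch

def regbn_like_normalize(f1: torch.Tensor, f2: torch.Tensor, lam: float = 1e-2) -> torch.Tensor:
    """Residualize f1 against f2 with a Frobenius-norm (ridge) penalty.

    f1:  (batch, d1) features of the modality being normalized
    f2:  (batch, d2) features of the other modality (potential confounder)
    lam: weight of the Frobenius-norm regularizer on the projection matrix W

    Illustrative sketch only; not the authors' implementation.
    """
    # Center both feature matrices, as in standard normalization.
    f1c = f1 - f1.mean(dim=0, keepdim=True)
    f2c = f2 - f2.mean(dim=0, keepdim=True)

    # Closed-form ridge regression:
    #   W = argmin_W ||f1c - f2c W||_F^2 + lam * ||W||_F^2
    d2 = f2c.shape[1]
    gram = f2c.T @ f2c + lam * torch.eye(d2, dtype=f2c.dtype, device=f2c.device)
    w = torch.linalg.solve(gram, f2c.T @ f1c)  # shape (d2, d1)

    # Keep only the part of f1 that is not linearly predictable from f2.
    return f1c - f2c @ w
```
In a two-stream network, such a step could be applied to, say, image-branch features with respect to text-branch features before fusion, which matches the abstract's point that normalization happens on low- or high-level features prior to their combination.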
Related papers
- Supervised Batch Normalization [0.08192907805418585]
Batch Normalization (BN) is a widely-used technique in neural networks.
We propose Supervised Batch Normalization (SBN), which defines contexts as modes that group data with similar characteristics.
arXiv Detail & Related papers (2024-05-27T10:30:21Z)
- Hybridization of Capsule and LSTM Networks for unsupervised anomaly detection on multivariate data [0.0]
This paper introduces a novel NN architecture that hybridizes Long Short-Term Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the difficulty of obtaining large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z)
- Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet Convolutional Network [21.06669693699965]
Multimodal data provide information about a natural phenomenon by integrating data from various domains with very different statistical properties.
Capturing the intra-modality and cross-modality information of multimodal data is the essential capability of multimodal learning methods.
Generalizing deep learning methods to the non-Euclidean domains is an emerging research field.
arXiv Detail & Related papers (2021-11-26T08:41:51Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose a new approach for the regularization of neural networks based on the local Rademacher complexity, called LocalDrop.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- Efficient Construction of Nonlinear Models over Normalized Data [21.531781003420573]
We show how the construction of mixture models can be decomposed in a systematic way, both for binary joins and for multi-way joins.
We present algorithms that conduct the training of the network in a factorized way and offer performance advantages.
arXiv Detail & Related papers (2020-11-23T19:20:03Z)
- Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such an exchange process is also guaranteed by sharing convolutional filters while keeping separate BN layers across modalities, which, as an added benefit, allows the multimodal architecture to be almost as compact as a unimodal network.
arXiv Detail & Related papers (2020-11-10T09:53:20Z)
- A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning [52.83948119677194]
We propose a multi-semantic metapath (MSM) model for large-scale heterogeneous network representation learning.
Specifically, we generate multi-semantic metapath-based random walks to construct the heterogeneous neighborhood and handle unbalanced distributions.
We conduct systematic evaluations of the proposed framework on two challenging datasets: Amazon and Alibaba.
arXiv Detail & Related papers (2020-07-19T22:50:20Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.