RegBN: Batch Normalization of Multimodal Data with Regularization
- URL: http://arxiv.org/abs/2310.00641v2
- Date: Sun, 19 Nov 2023 18:02:21 GMT
- Title: RegBN: Batch Normalization of Multimodal Data with Regularization
- Authors: Morteza Ghahremani and Christian Wachinger
- Abstract summary: This paper introduces a novel approach for the normalization of multimodal data, called RegBN.
RegBN uses the Frobenius norm as a regularizer term to address the side effects of confounders and underlying dependencies among different data sources.
We validate the effectiveness of RegBN on eight databases from five research areas.
- Score: 5.293979881130494
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent years have witnessed a surge of interest in integrating
high-dimensional data captured by multisource sensors, driven by the impressive
success of neural networks in the integration of multimodal data. However, the
integration of heterogeneous multimodal data poses a significant challenge, as
confounding effects and dependencies among such heterogeneous data sources
introduce unwanted variability and bias, leading to suboptimal performance of
multimodal models. Therefore, it becomes crucial to normalize the low- or
high-level features extracted from data modalities before their fusion takes
place. This paper introduces a novel approach for the normalization of
multimodal data, called RegBN, that incorporates regularization. RegBN uses the
Frobenius norm as a regularizer term to address the side effects of confounders
and underlying dependencies among different data sources. The proposed method
generalizes well across multiple modalities and eliminates the need for
learnable parameters, simplifying training and inference. We validate the
effectiveness of RegBN on eight databases from five research areas,
encompassing diverse modalities such as language, audio, image, video, depth,
tabular, and 3D MRI. The proposed method demonstrates broad applicability
across different architectures such as multilayer perceptrons, convolutional
neural networks, and vision transformers, enabling effective normalization of
both low- and high-level features in multimodal neural networks. RegBN is
available at https://github.com/mogvision/regbn.
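The abstract describes RegBN as a parameter-free normalization step that uses a Frobenius-norm regularizer to suppress confounding dependencies between modality features before fusion. As a rough illustration of that idea (a sketch only, not the authors' method; the repository above is the authoritative reference), the snippet below residualizes one modality's features against another via ridge-regularized (Frobenius-norm-penalized) least squares. The function name regbn_like_normalize and the regularization weight lam are illustrative assumptions.
```python
import torch

def regbn_like_normalize(f1: torch.Tensor, f2: torch.Tensor, lam: float = 1e-2) -> torch.Tensor:
    """Residualize f1 against f2 with a Frobenius-norm (ridge) penalty.

    f1:  (batch, d1) features of the modality being normalized
    f2:  (batch, d2) features of the other modality (potential confounder)
    lam: weight of the Frobenius-norm regularizer on the projection matrix W

    Illustrative sketch only; not the authors' implementation.
    """
    # Center both feature matrices, as in standard normalization.
    f1c = f1 - f1.mean(dim=0, keepdim=True)
    f2c = f2 - f2.mean(dim=0, keepdim=True)

    # Closed-form ridge regression:
    #   W = argmin_W ||f1c - f2c W||_F^2 + lam * ||W||_F^2
    d2 = f2c.shape[1]
    gram = f2c.T @ f2c + lam * torch.eye(d2, dtype=f2c.dtype, device=f2c.device)
    w = torch.linalg.solve(gram, f2c.T @ f1c)  # shape (d2, d1)

    # Keep only the part of f1 that is not linearly predictable from f2.
    return f1c - f2c @ w
```
In a two-stream network, such a step could be applied to, say, image-branch features with respect to text-branch features before fusion, which matches the abstract's point that normalization happens on low- or high-level features prior to their combination.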
Related papers
- Supervised Batch Normalization [0.08192907805418585]
Batch Normalization (BN) is a widely-used technique in neural networks.
We propose Supervised Batch Normalization (SBN), which defines contexts as modes that group data with similar characteristics.
arXiv Detail & Related papers (2024-05-27T10:30:21Z)
- Hybridization of Capsule and LSTM Networks for unsupervised anomaly detection on multivariate data [0.0]
This paper introduces a novel NN architecture that hybridizes Long Short-Term Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the difficulty of obtaining large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z)
- Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet Convolutional Network [21.06669693699965]
Multimodal data provide information about a natural phenomenon by integrating data from various domains with very different statistical properties.
Capturing the intra-modality and cross-modality information of multimodal data is the essential capability of multimodal learning methods.
Generalizing deep learning methods to the non-Euclidean domains is an emerging research field.
arXiv Detail & Related papers (2021-11-26T08:41:51Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose a new approach for the regularization of neural networks based on the local Rademacher complexity, called LocalDrop.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- Efficient Construction of Nonlinear Models over Normalized Data [21.531781003420573]
We show how the construction of mixture models can be decomposed in a systematic way, both for binary joins and for multi-way joins.
We present algorithms that conduct the training of the network in a factorized way and offer performance advantages.
arXiv Detail & Related papers (2020-11-23T19:20:03Z)
- Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such an exchange process is also guaranteed by sharing convolutional filters while keeping separate BN layers across modalities, which, as an added benefit, allows the multimodal architecture to be almost as compact as a unimodal network.
arXiv Detail & Related papers (2020-11-10T09:53:20Z)
- A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning [52.83948119677194]
We propose a multi-semantic metapath (MSM) model for large-scale heterogeneous network representation learning.
Specifically, we generate multi-semantic metapath-based random walks to construct the heterogeneous neighborhood and handle unbalanced distributions.
We conduct systematic evaluations of the proposed framework on two challenging datasets: Amazon and Alibaba.
arXiv Detail & Related papers (2020-07-19T22:50:20Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.