Cell Variational Information Bottleneck Network
- URL: http://arxiv.org/abs/2403.15082v3
- Date: Fri, 29 Mar 2024 07:20:42 GMT
- Title: Cell Variational Information Bottleneck Network
- Authors: Zhonghua Zhai, Chen Ju, Jinsong Lan, Shuai Xiao,
- Abstract summary: We propose a convolutional neural network that uses an information bottleneck mechanism and can be combined with the latest feedforward network architectures.
Cell Variational Information Bottleneck Network is constructed by stacking VIB cells, which generate feature maps with uncertainty.
In a more complex representation learning task, face recognition, our network structure has also achieved very competitive results.
- Score: 6.164295534465283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose Cell Variational Information Bottleneck Network (cellVIB), a convolutional neural network that uses an information bottleneck mechanism and can be combined with the latest feedforward network architectures in an end-to-end training manner. Our Cell Variational Information Bottleneck Network is constructed by stacking VIB cells, which generate feature maps with uncertainty. Rather than directly adding excessive regularization constraints to the output layer of the model as in Deep VIB, the regularization effect increases gradually as the layers go deeper. Within each VIB cell, the feedforward process learns an independent mean term and a standard deviation term, and predicts a Gaussian distribution from them. The feedback process is based on the reparameterization trick for effective training. This work performs an extensive analysis on the MNIST dataset to verify the effectiveness of each VIB cell, and provides an insightful analysis of how the VIB cells affect mutual information. Experiments conducted on CIFAR-10 also show that cellVIB is robust against noisy labels during training and against corrupted images during testing. We then validate our method on the PACS dataset, where the results show that the VIB cells can significantly improve the generalization performance of the base model. Finally, in a more complex representation learning task, face recognition, our network structure also achieves very competitive results.
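The mean/standard-deviation step and the reparameterization trick mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vib_cell_sample` and `kl_to_standard_normal` are hypothetical, the log-variance parameterization is an assumption, and the standard normal prior in the KL term is the common VIB choice rather than something this page states.

```python
import math
import random


def vib_cell_sample(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1).

    mu and log_var are per-feature lists; sigma = exp(0.5 * log_var).
    The noise eps is drawn independently of mu and log_var, which is what
    lets gradients flow through both terms in a real autodiff framework.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over feature dimensions.

    Assumed regularizer: the page does not specify the prior used by cellVIB.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical per-feature outputs of one cell (illustrative values only).
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.5]

z = vib_cell_sample(mu, log_var, rng)    # stochastic feature vector
kl = kl_to_standard_normal(mu, log_var)  # regularization term for this cell
```

In a stacked design, each cell would contribute its own KL term to the training loss. How those terms are weighted across depth is not described on this page.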
Related papers
- IB-AdCSCNet: Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck [4.523653503622693]
We introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory.
IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks.
Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data.
arXiv Detail & Related papers (2024-05-23T05:35:57Z) - Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-04-26T18:10:35Z) - PhagoStat a scalable and interpretable end to end framework for efficient quantification of cell phagocytosis in neurodegenerative disease studies [0.0]
We introduce an end-to-end, scalable, and versatile real-time framework for quantifying and analyzing phagocytic activity.
Our proposed pipeline is able to process large data-sets and includes a data quality verification module.
We apply our pipeline to analyze microglial cell phagocytosis in FTD and obtain statistically reliable results.
arXiv Detail & Related papers (2023-03-06T02:52:37Z) - Contrastive variational information bottleneck for aspect-based sentiment analysis [36.83876224466177]
We propose to reduce spurious correlations in aspect-based sentiment analysis (ABSA) via a novel Contrastive Variational Information Bottleneck framework (CVIB).
The proposed CVIB framework is composed of an original network and a self-pruned network, and these two networks are optimized simultaneously via contrastive learning.
Our approach achieves better performance than the strong competitors in terms of overall prediction performance, robustness, and generalization.
arXiv Detail & Related papers (2023-03-06T02:52:37Z) - Brain Network Transformer [13.239896897835191]
We study Transformer-based models for brain network analysis.
Driven by the unique properties of data, we model brain networks as graphs with nodes of fixed size and order.
We re-standardize the evaluation pipeline on ABIDE, the only publicly available large-scale brain network dataset.
arXiv Detail & Related papers (2022-10-13T02:30:06Z) - Gated Information Bottleneck for Generalization in Sequential Environments [13.795129636387623]
Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set.
We propose a new neural network-based IB approach, termed gated information bottleneck (GIB).
We empirically demonstrate the superiority of GIB over other popular neural network-based IB approaches in adversarial robustness and out-of-distribution detection.
arXiv Detail & Related papers (2021-10-12T14:58:38Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z) - On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves the distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z) - The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.