Related papers: IB-AdCSCNet:Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck

IB-AdCSCNet:Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck

URL: http://arxiv.org/abs/2405.14192v1
Date: Thu, 23 May 2024 05:35:57 GMT
Title: IB-AdCSCNet:Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck
Authors: He Zou, Meng'en Qin, Yu Song, Xiaohui Yang,
Abstract summary: We introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks. Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data.
Score: 4.523653503622693
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In the realm of neural network models, the perpetual challenge remains in retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks by dynamically adjusting the trade-off hyperparameter $\lambda$ through gradient descent, updating it within the FISTA(Fast Iterative Shrinkage-Thresholding Algorithm ) framework. By optimizing the compressive excitation loss function induced by the information bottleneck principle, IB-AdCSCNet achieves an optimal balance between compression and fitting at a global level, approximating the globally optimal representation feature. This information bottleneck trade-off strategy driven by downstream tasks not only helps to learn effective features of the data, but also improves the generalization of the model. This study's contribution lies in presenting a model with consistent performance and offering a fresh perspective on merging deep learning with sparse representation theory, grounded in the information bottleneck concept. Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data. Through the inference of the IB trade-off, the model's robustness is notably enhanced.

Related papers

Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks using the tensor gradient program (SGD) framework. We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values. This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv Detail & Related papers (2025-03-12T17:33:13Z)
BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks. BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB)
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation [6.596361762662328]
We introduce an innovative encoder-decoder network structure enhanced with residual connections. Our approach employs a multi-residual connection strategy designed to preserve the intricate details across various image scales more effectively. To enhance the convergence rate of network training and mitigate sample imbalance issues, we have devised a modified cross-entropy loss function.
arXiv Detail & Related papers (2024-05-26T05:15:53Z)
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity. Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
Analysis and Optimization of Wireless Federated Learning with Data Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation. We formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE) Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of the learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z)
CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE) At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales. We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients. Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures. We conduct our DIR-Net on real-world resource-limited devices which achieves 11.1 times storage saving and 5.4 times speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic [7.503338065129185]
We propose an Entropy-Based Convolutional Layer Estimation (EBCLE) which is robust and simple. We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE.
arXiv Detail & Related papers (2021-06-27T10:34:39Z)
Drill the Cork of Information Bottleneck by Inputting the Most Important Data [28.32769151293851]
How to efficiently train deep neural networks remains to be solved. The information bottleneck (IB) theory claims that the optimization process consists of an initial fitting phase and the following compression phase. We show that the fitting phase depicted in the IB theory will be boosted with a high signal-to-noise ratio if the typicality sampling is appropriately adopted.
arXiv Detail & Related papers (2021-05-15T09:20:36Z)
FG-Net: Fast Large-Scale LiDAR Point CloudsUnderstanding Network Leveraging CorrelatedFeature Mining and Geometric-Aware Modelling [15.059508985699575]
FG-Net is a general deep learning framework for large-scale point clouds understanding without voxelizations. We propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling. Our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2020-12-17T08:20:09Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.