IB-AdCSCNet:Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck
- URL: http://arxiv.org/abs/2405.14192v1
- Date: Thu, 23 May 2024 05:35:57 GMT
- Title: IB-AdCSCNet:Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck
- Authors: He Zou, Meng'en Qin, Yu Song, Xiaohui Yang,
- Abstract summary: We introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory.
IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks.
Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data.
- Score: 4.523653503622693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the realm of neural network models, the perpetual challenge remains in retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks by dynamically adjusting the trade-off hyperparameter $\lambda$ through gradient descent, updating it within the FISTA(Fast Iterative Shrinkage-Thresholding Algorithm ) framework. By optimizing the compressive excitation loss function induced by the information bottleneck principle, IB-AdCSCNet achieves an optimal balance between compression and fitting at a global level, approximating the globally optimal representation feature. This information bottleneck trade-off strategy driven by downstream tasks not only helps to learn effective features of the data, but also improves the generalization of the model. This study's contribution lies in presenting a model with consistent performance and offering a fresh perspective on merging deep learning with sparse representation theory, grounded in the information bottleneck concept. Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data. Through the inference of the IB trade-off, the model's robustness is notably enhanced.
Related papers
- ANCRe: Adaptive Neural Connection Reassignment for Efficient Depth Scaling [57.91760520589592]
Scaling network depth has been a central driver behind the success of modern foundation models.<n>This paper revisits the default mechanism for deepening neural networks, namely residual connections.<n>We introduce adaptive neural connection reassignment (ANCRe), a principled and lightweight framework that parameterizes and learns residual connectivities from the data.
arXiv Detail & Related papers (2026-02-09T18:54:18Z) - Glocal Information Bottleneck for Time Series Imputation [70.41814118117311]
Time Series Imputation aims to recover missing values in temporal data.<n>Existing models typically optimize the point-wise reconstruction loss, focusing on recovering numerical values (local information)<n>We propose a new training paradigm, Glocal Information Bottleneck (Glocal-IB)
arXiv Detail & Related papers (2025-10-06T15:24:44Z) - Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg [6.185573921868495]
Federated learning (FL) enables decentralized clients to train a model collaboratively without sharing local data.<n>We prove that the impact of data heterogeneity diminishes as the width of neural networks increases, ultimately vanishing when the width approaches infinity.<n>In the infinite-width regime, we further prove that both the global and local models in FedAvg behave as linear models, and that FedAvg achieves the same generalization performance as centralized learning with the same number of GD iterations.
arXiv Detail & Related papers (2025-08-18T02:22:55Z) - Auto-Compressing Networks [51.221103189527014]
We introduce Auto-compression Networks (ACNs), an architectural variant where long feedforward connections from each layer replace traditional short residual connections.<n>We show that ACNs exhibit enhanced noise compared to residual networks, superior performance in low-data settings, and mitigate catastrophic forgetting.<n>These findings establish ACNs as a practical approach to developing efficient neural architectures.
arXiv Detail & Related papers (2025-06-11T13:26:09Z) - Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks using the tensor gradient program (SGD) framework.
We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values.
This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv Detail & Related papers (2025-03-12T17:33:13Z) - BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB)
arXiv Detail & Related papers (2024-11-15T16:46:04Z) - An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation [6.596361762662328]
We introduce an innovative encoder-decoder network structure enhanced with residual connections.
Our approach employs a multi-residual connection strategy designed to preserve the intricate details across various image scales more effectively.
To enhance the convergence rate of network training and mitigate sample imbalance issues, we have devised a modified cross-entropy loss function.
arXiv Detail & Related papers (2024-05-26T05:15:53Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - Analysis and Optimization of Wireless Federated Learning with Data
Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation.
We formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE)
Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of the learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE)
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z) - Distribution-sensitive Information Retention for Accurate Binary Neural
Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We conduct our DIR-Net on real-world resource-limited devices which achieves 11.1 times storage saving and 5.4 times speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z) - Mitigating severe over-parameterization in deep convolutional neural
networks through forced feature abstraction and compression with an
entropy-based heuristic [7.503338065129185]
We propose an Entropy-Based Convolutional Layer Estimation (EBCLE) which is robust and simple.
We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE.
arXiv Detail & Related papers (2021-06-27T10:34:39Z) - Drill the Cork of Information Bottleneck by Inputting the Most Important
Data [28.32769151293851]
How to efficiently train deep neural networks remains to be solved.
The information bottleneck (IB) theory claims that the optimization process consists of an initial fitting phase and the following compression phase.
We show that the fitting phase depicted in the IB theory will be boosted with a high signal-to-noise ratio if the typicality sampling is appropriately adopted.
arXiv Detail & Related papers (2021-05-15T09:20:36Z) - FG-Net: Fast Large-Scale LiDAR Point CloudsUnderstanding Network
Leveraging CorrelatedFeature Mining and Geometric-Aware Modelling [15.059508985699575]
FG-Net is a general deep learning framework for large-scale point clouds understanding without voxelizations.
We propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling.
Our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2020-12-17T08:20:09Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.