Structured IB: Improving Information Bottleneck with Structured Feature Learning
- URL: http://arxiv.org/abs/2412.08222v1
- Date: Wed, 11 Dec 2024 09:17:45 GMT
- Title: Structured IB: Improving Information Bottleneck with Structured Feature Learning
- Authors: Hanzhe Yang, Youlong Wu, Dingzhu Wen, Yong Zhou, Yuanming Shi
- Abstract summary: We introduce Structured IB, a framework for investigating potential structured features.
Our experiments demonstrate superior prediction accuracy and task-relevant information preservation compared to the original IB Lagrangian method.
- Score: 32.774660308233635
- Abstract: The Information Bottleneck (IB) principle has emerged as a promising approach for enhancing the generalization, robustness, and interpretability of deep neural networks, demonstrating efficacy across image segmentation, document clustering, and semantic communication. Among IB implementations, the IB Lagrangian method, employing Lagrangian multipliers, is widely adopted. While numerous optimization methods for the IB Lagrangian based on variational bounds and neural estimators are feasible, their performance is highly dependent on the quality of their design, which is inherently prone to errors. To address this limitation, we introduce Structured IB, a framework for investigating potential structured features. By incorporating auxiliary encoders to extract missing informative features, we generate more informative representations. Our experiments demonstrate superior prediction accuracy and task-relevant information preservation compared to the original IB Lagrangian method, even with reduced network size.
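To make the objective concrete, here is a minimal VIB-style sketch of the idea described above: a main stochastic encoder plus auxiliary encoders whose representations are concatenated, trained with an IB Lagrangian (cross-entropy as a surrogate for I(Z;Y) plus a weighted KL term bounding I(X;Z)). The Gaussian encoders, the concatenation-based aggregation, and all layer sizes are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianEncoder(nn.Module):
    """Stochastic encoder q(z|x) = N(mu(x), diag(sigma(x)^2))."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, z_dim)
        self.log_var = nn.Linear(128, z_dim)

    def forward(self, x):
        h = self.net(x)
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterization
        return z, mu, log_var

def kl_to_standard_normal(mu, log_var):
    """Per-batch variational upper bound on I(X;Z): KL(q(z|x) || N(0, I))."""
    return 0.5 * (mu.pow(2) + log_var.exp() - 1.0 - log_var).sum(dim=1).mean()

class StructuredIB(nn.Module):
    """One main encoder plus n_aux auxiliary encoders; their representations
    are concatenated into a single, more informative feature."""
    def __init__(self, in_dim, z_dim, n_aux, n_classes):
        super().__init__()
        self.encoders = nn.ModuleList(
            [GaussianEncoder(in_dim, z_dim) for _ in range(1 + n_aux)])
        self.head = nn.Linear((1 + n_aux) * z_dim, n_classes)

    def forward(self, x):
        outs = [enc(x) for enc in self.encoders]
        z = torch.cat([o[0] for o in outs], dim=1)
        kl = sum(kl_to_standard_normal(mu, lv) for _, mu, lv in outs)
        return self.head(z), kl

def ib_lagrangian(logits, y, kl, beta=1e-3):
    """IB Lagrangian surrogate: cross-entropy (a bound related to I(Z;Y))
    plus beta times the KL bound on I(X;Z)."""
    return F.cross_entropy(logits, y) + beta * kl
```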
Related papers
- Rethinking Latent Representations in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation [34.46089300038851]
Behavior Cloning (BC) is a widely adopted visual imitation learning method in robot manipulation.
We introduce mutual information to quantify and mitigate redundancy in latent representations.
This work presents the first comprehensive study on redundancy in latent representations across various methods, backbones, and experimental settings.
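The summary does not say which estimator the authors use; a common way to make such a mutual-information redundancy measure optimizable is a MINE-style (Donsker-Varadhan) lower bound. The sketch below, including the choice of probing redundancy between two halves of the latent, is an illustrative assumption.

```python
import torch
import torch.nn as nn

class MINE(nn.Module):
    """Donsker-Varadhan lower bound on I(A;B), trained by maximizing its output."""
    def __init__(self, a_dim, b_dim, hidden=128):
        super().__init__()
        self.T = nn.Sequential(
            nn.Linear(a_dim + b_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, a, b):
        joint = self.T(torch.cat([a, b], dim=1)).mean()
        b_shuffled = b[torch.randperm(b.size(0))]   # break pairing -> product of marginals
        marginal = self.T(torch.cat([a, b_shuffled], dim=1)).exp().mean().log()
        return joint - marginal

# Hypothetical redundancy probe: MI between two halves of a BC latent.
# A high estimate suggests overlapping information that can be penalized.
z = torch.randn(256, 64)
probe = MINE(32, 32)
mi_estimate = probe(z[:, :32], z[:, 32:])
```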
arXiv Detail & Related papers (2025-02-05T03:13:04Z)
- Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task [0.0]
We introduce an approach that combines saliency-guided training with quantization techniques to create an interpretable and resource-efficient model.
Our results demonstrate that the combined use of saliency-guided training and PACT-based quantization not only maintains classification performance but also produces models that are significantly more efficient and interpretable.
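PACT itself is a published activation-quantization scheme: activations are clipped to a learnable bound alpha and uniformly quantized to k bits, with a straight-through estimator for gradients. A minimal sketch of that mechanism, independent of the saliency-guided training it is paired with here:

```python
import torch
import torch.nn as nn

class PACT(nn.Module):
    """PACT activation quantizer: learnable clipping bound alpha, k-bit levels."""
    def __init__(self, bits=4, init_alpha=6.0):
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(init_alpha))

    def forward(self, x):
        # Differentiable clip to [0, alpha]; this form keeps a gradient path to alpha.
        y = 0.5 * (x.abs() - (x - self.alpha).abs() + self.alpha)
        scale = (2 ** self.bits - 1) / self.alpha
        y_q = torch.round(y * scale) / scale
        # Straight-through estimator: forward uses y_q, backward sees y.
        return y + (y_q - y).detach()
```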
arXiv Detail & Related papers (2024-12-05T06:34:06Z)
- Preserving Information: How does Topological Data Analysis improve Neural Network performance? [0.0]
We introduce a method for integrating Topological Data Analysis (TDA) with Convolutional Neural Networks (CNN) in the context of image recognition.
Our approach, referred to as Vector Stitching, combines raw image data with additional topological information.
The results of our experiments highlight the potential of incorporating results of additional data analysis into the network's inference process.
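One plausible reading of Vector Stitching is late fusion: a topological feature vector is concatenated with learned CNN features before classification. In this sketch, `persistence_vector` is a hypothetical stand-in for a real persistent-homology featurizer (e.g., persistence images from a TDA library), and the architecture is illustrative.

```python
import torch
import torch.nn as nn

def persistence_vector(images):
    """Hypothetical stand-in for a TDA featurizer; a real implementation would
    compute persistence diagrams and vectorize them (e.g., persistence images)."""
    return torch.randn(images.size(0), 32)

class StitchedNet(nn.Module):
    """CNN features 'stitched' (concatenated) with topological features."""
    def __init__(self, n_classes=10, tda_dim=32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(16 + tda_dim, n_classes)

    def forward(self, x):
        f_img = self.cnn(x).flatten(1)
        f_tda = persistence_vector(x)      # extra topological information
        return self.head(torch.cat([f_img, f_tda], dim=1))
```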
arXiv Detail & Related papers (2024-11-27T14:56:05Z)
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB).
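The summary names DAB and CFB without detail; the sketch below shows only the general shape of a distribution-adaptive binarizer (a learned per-channel shift and scale around the sign function, with a straight-through estimator), not BiDense's exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveBinarizer(nn.Module):
    """Illustrative distribution-adaptive binarizer for (B, C, H, W) features."""
    def __init__(self, channels):
        super().__init__()
        self.shift = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.scale = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        x_c = x - self.shift                 # re-center to the feature distribution
        b = torch.sign(x_c)                  # binarize to +/-1
        b = x_c + (b - x_c).detach()         # straight-through estimator
        return self.scale * b                # restore dynamic range
```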
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
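A minimal dense-adjacency sketch of those two steps, under the assumptions that edges are scored by embedding similarity and that the contrastive positive for each node is its sparsified-neighborhood mean; the paper's actual construction may differ.

```python
import torch
import torch.nn.functional as F

def sparsify(adj, h, keep_ratio=0.5):
    """Keep only the highest-similarity edges per node (dense adjacency)."""
    sim = F.normalize(h, dim=1) @ F.normalize(h, dim=1).T
    score = adj * sim
    k = max(1, int(keep_ratio * adj.size(1)))
    thresh = score.topk(k, dim=1).values[:, -1:]
    return adj * (score >= thresh).float()

def self_contrast_loss(h, adj_sparse, tau=0.5):
    """InfoNCE-style loss pulling each node toward its sparsified-neighborhood mean."""
    deg = adj_sparse.sum(dim=1, keepdim=True).clamp(min=1)
    nbr = F.normalize(adj_sparse @ h / deg, dim=1)
    h = F.normalize(h, dim=1)
    logits = h @ nbr.T / tau                 # positives sit on the diagonal
    return F.cross_entropy(logits, torch.arange(h.size(0)))
```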
arXiv Detail & Related papers (2024-09-09T12:56:02Z)
- Enhancing Adversarial Transferability via Information Bottleneck Constraints [18.363276470822427]
We propose a framework for performing black-box transferable adversarial attacks named IBTA.
To overcome the challenge that mutual information is not directly optimizable, we propose a simple and efficient mutual information lower bound (MILB) to approximate its computation.
Our experiments on the ImageNet dataset demonstrate the efficiency and scalability of IBTA and the derived MILB.
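The MILB is the paper's own construction, not reproduced in this summary; for orientation, InfoNCE below is a standard, easily optimized lower bound on mutual information that plays the same role.

```python
import math
import torch
import torch.nn.functional as F

def infonce_lower_bound(z_x, z_y, tau=0.1):
    """InfoNCE bound for a batch of N paired embeddings:
    I(X;Y) >= log(N) - cross_entropy(similarities, identity pairing)."""
    z_x = F.normalize(z_x, dim=1)
    z_y = F.normalize(z_y, dim=1)
    logits = z_x @ z_y.T / tau               # matched pairs on the diagonal
    labels = torch.arange(z_x.size(0))
    return math.log(z_x.size(0)) - F.cross_entropy(logits, labels)
```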
arXiv Detail & Related papers (2024-06-08T17:25:31Z)
- Tighter Bounds on the Information Bottleneck with Application to Deep Learning [6.206127662604578]
Deep Neural Nets (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters.
The Information Bottleneck (IB) provides a hypothetically optimal framework for data modeling, yet it is often intractable.
Recent efforts combined DNNs with the IB by applying VAE-inspired variational methods to approximate bounds on mutual information, resulting in improved robustness to adversarial attacks.
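The variational bounds in question are typically of the following standard form, where r(z) is any tractable prior standing in for the intractable marginal p(z), and q(y|z) is a decoder:

```latex
I(X;Z) \,=\, \mathbb{E}_{p(x)}\!\left[\mathrm{KL}\!\left(p(z\mid x)\,\Vert\,p(z)\right)\right]
\;\le\; \mathbb{E}_{p(x)}\!\left[\mathrm{KL}\!\left(p(z\mid x)\,\Vert\,r(z)\right)\right],
\qquad
I(Z;Y) \;\ge\; H(Y) + \mathbb{E}_{p(y,z)}\!\left[\log q(y\mid z)\right].
```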
arXiv Detail & Related papers (2024-02-12T13:24:32Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-based odometry estimation methods.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality, and efficiency of the designed framework.
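A hedged sketch of the general recipe: edit the conditioning attribute of a pretrained conditional generator and search its latent space until a target classifier assigns the desired label. The generator `G`, classifier `clf`, binary attributes, and gradient search are all illustrative assumptions, not necessarily the paper's AIP procedure.

```python
import torch
import torch.nn.functional as F

def attribute_informed_counterfactual(G, clf, z, attrs, flip_idx, target,
                                      steps=100, lr=0.1):
    """G(z, attrs) is a hypothetical pretrained conditional generator; clf is
    the model being explained. Attributes are assumed binary in {0, 1}."""
    a = attrs.clone()
    a[:, flip_idx] = 1.0 - a[:, flip_idx]     # flip the chosen attribute
    z = z.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x_cf = G(z, a)                         # candidate counterfactual
        loss = F.cross_entropy(clf(x_cf), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(z, a).detach()
```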
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Graph Information Bottleneck [77.21967740646784]
Graph Neural Networks (GNNs) provide an expressive way to fuse information from network structure and node features.
Inheriting from the general Information Bottleneck (IB), the Graph Information Bottleneck (GIB) aims to learn the minimal sufficient representation for a given task.
We show that our proposed models are more robust than state-of-the-art graph defense models.
arXiv Detail & Related papers (2020-10-24T07:13:00Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
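A compact sketch of the mechanism: similarities between the input set and a trainable reference define a Gibbs kernel, Sinkhorn iterations turn it into an approximate transport plan, and the plan weights the aggregation, which is what gives the method its attention-like flavor. The dot-product cost, entropic scale, and slot count below are illustrative choices.

```python
import torch
import torch.nn as nn

def sinkhorn(K, n_iters=10):
    """Scale a positive kernel into an approximate transport plan with
    uniform marginals (1/n over rows, 1/p over columns)."""
    T = K / K.sum()
    for _ in range(n_iters):
        T = T / (T.sum(dim=1, keepdim=True) * T.size(0))
        T = T / (T.sum(dim=0, keepdim=True) * T.size(1))
    return T

class OTEmbedding(nn.Module):
    """Pool a variable-size set of features into p slots using the transport
    plan between the set and a trainable reference."""
    def __init__(self, dim, p=8, eps=1.0):
        super().__init__()
        self.ref = nn.Parameter(torch.randn(p, dim))
        self.eps = eps

    def forward(self, x):                         # x: (n, dim)
        K = torch.exp(x @ self.ref.T / self.eps)  # Gibbs kernel from similarities
        T = sinkhorn(K)                           # (n, p) transport plan
        return T.size(1) * (T.T @ x)              # (p, dim) fixed-size embedding
```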
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.