Understanding the Effect of the Long Tail on Neural Network Compression
- URL: http://arxiv.org/abs/2306.06238v3
- Date: Tue, 27 Jun 2023 23:14:16 GMT
- Title: Understanding the Effect of the Long Tail on Neural Network Compression
- Authors: Harvey Dam, Vinu Joseph, Aditya Bhaskara, Ganesh Gopalakrishnan,
Saurav Muralidharan, Michael Garland
- Abstract summary: We study the "long tail" phenomenon in computer vision datasets observed by Feldman et al.
As compression limits the capacity of a network (and hence also its ability to memorize), we study the question: are mismatches between the full and compressed models correlated with the memorized training data?
- Score: 9.819486253052528
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Network compression is now a mature sub-field of neural network research:
over the last decade, significant progress has been made towards reducing the
size of models and speeding up inference, while maintaining the classification
accuracy. However, many works have observed that focusing on overall accuracy
alone can be misguided. For example, it has been shown that mismatches between
the full and compressed models can be biased towards under-represented classes.
This raises an important research question: can we achieve network compression
while maintaining "semantic equivalence" with the original network? In this
work, we study this question in the context of the "long tail" phenomenon in
computer vision datasets observed by Feldman et al., who argue that
memorization of certain inputs (appropriately defined) is essential to
achieving good generalization. As compression limits the capacity of a network
(and hence also its ability to memorize), we study the question: are mismatches
between the full and compressed models correlated with the memorized training
data? We present positive evidence in this direction for image classification
tasks, by considering different base architectures and compression schemes.
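A minimal sketch of how the abstract's central question might be probed, assuming per-example memorization scores (in the spirit of Feldman et al.) and predictions from the full and compressed models are already available; the function names and toy data below are illustrative placeholders, not the authors' code:
```python
import numpy as np

def mismatch_mask(full_preds, compressed_preds):
    """Boolean mask of examples where the compressed model disagrees with the full model."""
    return full_preds != compressed_preds

def memorization_vs_mismatch(mem_scores, full_preds, compressed_preds):
    """Compare memorization scores of mismatched vs. matched examples and
    report a simple point-biserial correlation."""
    mism = mismatch_mask(full_preds, compressed_preds).astype(float)
    mean_mismatched = mem_scores[mism == 1].mean()
    mean_matched = mem_scores[mism == 0].mean()
    corr = np.corrcoef(mem_scores, mism)[0, 1]  # Pearson correlation with a binary indicator
    return mean_mismatched, mean_matched, corr

# Toy usage with random stand-ins for real predictions and memorization scores.
rng = np.random.default_rng(0)
n = 1000
mem_scores = rng.uniform(0.0, 1.0, size=n)      # e.g. precomputed Feldman-style estimates
full_preds = rng.integers(0, 10, size=n)        # class predictions of the full model
flip = rng.random(n) < 0.1 * (1 + mem_scores)   # mismatches made more likely at high memorization
compressed_preds = np.where(flip, (full_preds + 1) % 10, full_preds)
print(memorization_vs_mismatch(mem_scores, full_preds, compressed_preds))
```
In such a setup, a positive correlation (and a higher mean memorization score among mismatched examples) would be the kind of evidence the abstract describes.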
Related papers
- Network Degeneracy as an Indicator of Training Performance: Comparing
Finite and Infinite Width Angle Predictions [3.04585143845864]
We show that as networks get deeper, they become more susceptible to degeneracy.
We use a simple algorithm that can accurately predict the level of degeneracy for any given fully connected ReLU network architecture.
arXiv Detail & Related papers (2023-06-02T13:02:52Z)
- Isometric Representations in Neural Networks Improve Robustness [0.0]
We train neural networks to perform classification while simultaneously maintaining within-class metric structure.
We verify that isometric regularization improves the robustness to adversarial attacks on MNIST.
arXiv Detail & Related papers (2022-11-02T16:18:18Z)
- A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation [37.525277809849776]
The goal of model compression is to reduce the size of a large neural network while retaining a comparable performance.
We use sparsity-sensitive $\ell_q$-norm to characterize compressibility and provide a relationship between soft sparsity of the weights in the network and the degree of compression.
We also develop adaptive algorithms for pruning each neuron in the network informed by our theory.
arXiv Detail & Related papers (2022-06-11T20:10:35Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function (a small numerical check of this fact is sketched after this list).
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
- Attribution Preservation in Network Compression for Reliable Network Interpretation [81.84564694303397]
Neural networks embedded in safety-sensitive applications rely on input attribution for hindsight analysis and on network compression to reduce their size for edge computing.
We show that these seemingly unrelated techniques conflict with each other as network compression deforms the produced attributions.
This phenomenon arises because conventional network compression methods preserve only the network's predictions while ignoring the quality of its attributions.
arXiv Detail & Related papers (2020-10-28T16:02:31Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e., adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Adaptive Estimators Show Information Compression in Deep Neural Networks [2.578242050187029]
The information bottleneck theory proposes that neural networks achieve good generalization by compressing their representations to disregard information that is not relevant to the task.
In this paper we develop more robust mutual information estimation techniques that adapt to the hidden activity of neural networks (a toy binned estimator is sketched after this list).
We show that saturation of the activation function is not required for compression, and the amount of compression varies between different activation functions.
arXiv Detail & Related papers (2019-02-24T23:41:54Z)
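As a side note on the "Permute, Quantize, and Fine-tune" entry above: the observation that adjacent layers' weights can be permuted without changing the computed function is easy to verify numerically. A minimal NumPy check for a two-layer ReLU network (illustrative only, not the paper's implementation):
```python
import numpy as np

rng = np.random.default_rng(0)
d, h, k = 5, 16, 3
W1, b1 = rng.standard_normal((h, d)), rng.standard_normal(h)
W2, b2 = rng.standard_normal((k, h)), rng.standard_normal(k)

def mlp(x, W1, b1, W2, b2):
    # Two-layer MLP with a ReLU hidden layer.
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Permute the hidden units of layer 1 and the matching input columns of layer 2.
perm = rng.permutation(h)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.standard_normal(d)
assert np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))  # same function
```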
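For the last entry (adaptive mutual information estimators), here is a toy quantile-binned estimate of I(T; Y) between a one-dimensional hidden activation and discrete labels; this is a deliberately simple stand-in, not the adaptive technique the paper develops:
```python
import numpy as np

def quantile_binned_mi(activations, labels, n_bins=8):
    """Rough mutual information I(T; Y), in bits, between a 1-D activation T
    and discrete labels Y, using quantile (adaptive) bins for the activation."""
    edges = np.quantile(activations, np.linspace(0, 1, n_bins + 1)[1:-1])
    t = np.digitize(activations, edges)  # bin index per example
    mi = 0.0
    for tv in np.unique(t):
        p_t = np.mean(t == tv)
        for yv in np.unique(labels):
            p_y = np.mean(labels == yv)
            p_ty = np.mean((t == tv) & (labels == yv))
            if p_ty > 0:
                mi += p_ty * np.log2(p_ty / (p_t * p_y))
    return mi

# Toy usage: an activation that partially encodes a binary label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=5000)
t = y + 0.8 * rng.standard_normal(5000)
print(quantile_binned_mi(t, y))  # prints an estimate well above zero
```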