An Overview of Neural Network Compression
- URL: http://arxiv.org/abs/2006.03669v2
- Date: Sat, 1 Aug 2020 16:55:53 GMT
- Title: An Overview of Neural Network Compression
- Authors: James O'Neill
- Abstract summary: In recent years there has been a resurgence in model compression techniques, particularly for deep convolutional neural networks and self-attention based networks such as the Transformer.
This paper provides a timely overview of both old and current compression techniques for deep neural networks, including pruning, quantization, tensor decomposition, knowledge distillation and combinations thereof.
- Score: 2.550900579709111
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Overparameterized networks trained to convergence have shown impressive
performance in domains such as computer vision and natural language processing.
Pushing the state of the art on salient tasks within these domains corresponds
to these models becoming larger and more difficult for machine learning
practitioners to use, given the increasing memory and storage requirements, not
to mention the larger carbon footprint. Thus, in recent years there has been a
resurgence in model compression techniques, particularly for deep convolutional
neural networks and self-attention based networks such as the Transformer.
Hence, this paper provides a timely overview of both old and current
compression techniques for deep neural networks, including pruning,
quantization, tensor decomposition, knowledge distillation and combinations
thereof.
We assume a basic familiarity with deep learning architectures (for an
introduction to deep learning, see Goodfellow et al., 2016), namely Recurrent
Neural Networks (RNNs; Rumelhart et al., 1985; Hochreiter & Schmidhuber, 1997),
Convolutional Neural Networks (Fukushima, 1980; for an up-to-date overview, see
Khan et al., 2019) and self-attention based networks (Vaswani et al., 2017; for
a general overview of self-attention networks, see Chaudhari et al., 2019, and
for more detail on their use in natural language processing, see Hu, 2019).
Most of the papers discussed are proposed in the context of at least one of
these DNN architectures.
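As a rough illustration of two of the techniques the survey covers, the sketch below applies magnitude pruning and symmetric uniform quantization to a single weight matrix using NumPy. It is a minimal example under simplifying assumptions, not code from the paper; the function names magnitude_prune and uniform_quantize and the per-tensor scale are illustrative choices.

```python
# Minimal sketch (not from the paper): magnitude pruning and symmetric
# uniform post-training quantization applied to one weight matrix.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so roughly `sparsity` of them become zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def uniform_quantize(weights: np.ndarray, num_bits: int = 8):
    """Map floats to `num_bits`-bit signed integers plus a single per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    scale = np.abs(weights).max() / qmax    # one scale factor for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)  # int8 storage assumes num_bits <= 8
    return q, scale                         # dequantize with q * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(256, 256)).astype(np.float32)
    w_pruned = magnitude_prune(w, sparsity=0.9)        # ~90% of entries set to zero
    q, scale = uniform_quantize(w_pruned, num_bits=8)  # int8 weights + one float scale
    err = np.abs(w_pruned - q * scale).mean()
    print(f"sparsity: {(w_pruned == 0).mean():.2f}, mean dequantization error: {err:.5f}")
```

In practice the surveyed methods operate on whole networks and typically interleave pruning or quantization with fine-tuning; the sketch only shows the core tensor-level operations.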
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- A Spectral Condition for Feature Learning [20.440553685976194]
The key challenge is to scale training so that a network's internal representations evolve nontrivially at all widths.
We show that feature learning is achieved by scaling the spectral norm of weight matrices and their updates.
arXiv Detail & Related papers (2023-10-26T23:17:39Z)
- Hyperbolic Convolutional Neural Networks [14.35618845900589]
Using non-Euclidean space for embedding data might result in more robust and explainable models.
We hypothesize that the ability of hyperbolic space to capture hierarchy in the data leads to better performance.
arXiv Detail & Related papers (2023-08-29T21:20:16Z)
- A Note on the Implicit Bias Towards Minimal Depth of Deep Neural Networks [11.739219085726006]
A central aspect that enables the success of these systems is the ability to train deep models instead of wide shallow ones.
While training deep neural networks repeatedly achieves superior performance over their shallow counterparts, an understanding of the role of depth in representation learning is still lacking.
arXiv Detail & Related papers (2022-02-18T05:21:28Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Creating Powerful and Interpretable Models with Regression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis.
We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Hyperbolic Deep Neural Networks: A Survey [31.04110049167551]
We refer to such models as hyperbolic deep neural networks in this paper.
To stimulate future research, this paper presents a coherent and comprehensive review of the literature around the neural components used in the construction of hyperbolic deep neural networks.
arXiv Detail & Related papers (2021-01-12T15:55:16Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module, termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.