Induced Feature Selection by Structured Pruning
- URL: http://arxiv.org/abs/2303.10999v1
- Date: Mon, 20 Mar 2023 10:29:35 GMT
- Title: Induced Feature Selection by Structured Pruning
- Authors: Nathan Hubens, Victor Delvigne, Matei Mancas, Bernard Gosselin, Marius
Preda, Titus Zaharia
- Abstract summary: We go one step further by imposing sparsity jointly on the weights and on the input data.
It is possible to achieve additional gains in total parameters and FLOPs by also pruning the input data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparsity-inducing techniques for neural networks have been of great
help in recent years. These methods make it possible to find lighter and faster
networks that run more efficiently in resource-constrained environments such as
mobile devices or heavily requested servers. Such sparsity is generally imposed
on the weights of the network, reducing the footprint of the architecture. In
this work, we go one step further by imposing sparsity jointly on the weights
and on the input data. This is achieved with a three-step process: 1) impose a
certain structured sparsity on the weights of the network; 2) track back the
input features that correspond to zeroed blocks of weights; 3) remove the
useless weights and input features and retrain the network. Pruning both the
network and the input data not only allows for extreme reductions in parameters
and operations, it also serves as an interpretation tool: data pruning tells us
which input features the network actually needs to keep its performance.
Experiments on a variety of architectures and datasets (an MLP validated on
MNIST and CIFAR10/100, and ConvNets, VGG16 and ResNet18, validated on
CIFAR10/100 and CALTECH101, respectively) show that pruning the input data
yields additional gains in total parameters and FLOPs while also increasing
accuracy.
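As a concrete illustration of the three-step process described in the abstract, the PyTorch sketch below imposes column-wise structured sparsity on the first layer of an MLP, traces back the input features whose entire weight column has been zeroed, and rebuilds a smaller first layer before retraining. This is a minimal sketch under assumed conventions, not the authors' implementation: the helper name `prune_input_features`, the 50% pruning ratio, and the column-norm criterion are illustrative assumptions.

```python
import torch
import torch.nn as nn

def prune_input_features(mlp: nn.Sequential, sparsity: float = 0.5):
    """Minimal sketch of the three-step idea (illustrative only).

    1) Impose structured (column-wise) sparsity on the first layer's weights.
    2) Track back the input features whose whole weight column is zero.
    3) Remove those columns, shrink the first layer, and return the indices
       of the kept features so the caller can slice its inputs and retrain.
    """
    first = mlp[0]                       # assume the first module is nn.Linear
    assert isinstance(first, nn.Linear)
    W = first.weight.data                # shape: (out_features, in_features)

    # Step 1: zero out entire columns with the smallest L2 norm.
    col_norms = W.norm(dim=0)            # one score per input feature
    n_prune = int(sparsity * W.shape[1])
    pruned_cols = col_norms.argsort()[:n_prune]
    W[:, pruned_cols] = 0.0

    # Step 2: input features whose column is entirely zero are useless.
    keep = (W.abs().sum(dim=0) > 0).nonzero(as_tuple=True)[0]

    # Step 3: build a smaller first layer that only consumes the kept features.
    new_first = nn.Linear(len(keep), first.out_features,
                          bias=first.bias is not None)
    new_first.weight.data = W[:, keep].clone()
    if first.bias is not None:
        new_first.bias.data = first.bias.data.clone()
    mlp[0] = new_first
    return keep                          # caller retrains with x[:, keep]

# Usage: a toy MLP for flattened 28x28 MNIST images.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
kept = prune_input_features(model, sparsity=0.5)
x = torch.randn(32, 784)
logits = model(x[:, kept])               # only the kept pixels feed the network
```

In this toy setting, the indices in `kept` are exactly the interpretation signal the abstract mentions: they identify which input pixels the pruned network still relies on.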
Related papers
- Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724] (arXiv, 2023-12-03)
  We show that our approach can generate parameters for many networks using the same set of weights.
  This enables us to support tasks like efficient ensembling and anytime prediction.
- Neural Network Pruning by Gradient Descent [7.427858344638741] (arXiv, 2023-11-21)
  We introduce a novel and straightforward neural network pruning framework that incorporates the Gumbel-Softmax technique (a minimal sketch of such a learned mask follows this list).
  We demonstrate its exceptional compression capability, maintaining high accuracy on the MNIST dataset with only 0.15% of the original network parameters.
  We believe our method opens a promising new avenue for deep learning pruning and the creation of interpretable machine learning systems.
- WeightMom: Learning Sparse Networks using Iterative Momentum-based Pruning [0.0] (arXiv, 2022-08-11)
  We propose a weight-based pruning approach in which weights are pruned gradually based on the momentum accumulated over previous iterations.
  We evaluate our approach on networks such as AlexNet, VGG16 and ResNet50 with image classification datasets such as CIFAR-10 and CIFAR-100.
- Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [88.28293442298015] (arXiv, 2021-12-18)
  Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
  We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
  FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
- An Experimental Study of the Impact of Pre-training on the Pruning of a Convolutional Neural Network [0.0] (arXiv, 2021-12-15)
  In recent years, deep neural networks have seen wide success in various application domains.
  Deep neural networks usually involve a large number of parameters, which correspond to the weights of the network.
  Pruning methods attempt to reduce the size of this parameter set by identifying and removing irrelevant weights.
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834] (arXiv, 2021-09-21)
  We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs with diverse difficulty levels.
  We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++) by input-dependently adjusting the number of filters in CNNs and multiple dimensions in both CNNs and transformers.
- Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622] (arXiv, 2021-02-08)
  Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments.
  In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch (a small sketch of the N:M constraint also follows this list).
- Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming [97.40955121478716] (arXiv, 2020-10-22)
  We propose a first-order dual SDP algorithm that requires memory only linear in the total number of network activations.
  We significantly improve L-inf verified robust accuracy from 1% to 88% and from 6% to 40%, respectively.
  We also demonstrate tight verification of a quadratic stability specification for the decoder of a variational autoencoder.
- HALO: Learning to Prune Neural Networks with Shrinkage [5.283963846188862] (arXiv, 2020-08-24)
  Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data.
  Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity-inducing penalty, and (3) training a binary mask jointly with the weights of the network.
  We present a novel penalty called Hierarchical Adaptive Lasso, which learns to adaptively sparsify the weights of a given network via trainable parameters.
- Principal Component Networks: Parameter Reduction Early in Training [10.14522349959932] (arXiv, 2020-06-23)
  We show how to find small networks that exhibit the same performance as their over-parameterized counterparts.
  We use PCA to find a basis of high variance for layer inputs and represent layer weights using these directions.
  We also show that ResNet-20 PCNs outperform deep ResNet-110 networks while training faster.
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811] (arXiv, 2020-04-17)
  We train a graph convolutional network to fit the performance of sampled sub-networks.
  With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
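The Gumbel-Softmax pruning entry above can be made concrete with a short, hedged sketch: each weight gets a pair of logits over {keep, drop}, and a hard binary mask is sampled with the straight-through Gumbel-Softmax estimator so the keep/drop decision stays differentiable. This is a generic illustration of the technique named in that summary, not that paper's code; the class name `GumbelMaskedLinear`, the initialization, and the temperature are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelMaskedLinear(nn.Module):
    """Linear layer with a learned binary mask sampled via Gumbel-Softmax.

    Illustrative sketch only: every weight has logits over {keep, drop};
    during training a hard 0/1 mask is sampled with the straight-through
    Gumbel-Softmax estimator, so gradients still reach the mask logits.
    """

    def __init__(self, in_features: int, out_features: int, tau: float = 1.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Logits over {keep, drop} for every weight, initialized to favor "keep".
        self.mask_logits = nn.Parameter(
            torch.stack([torch.ones_like(self.linear.weight),
                         torch.zeros_like(self.linear.weight)], dim=-1)
        )
        self.tau = tau

    def mask(self) -> torch.Tensor:
        if self.training:
            # Straight-through Gumbel-Softmax: hard 0/1 forward, soft gradients.
            sample = F.gumbel_softmax(self.mask_logits, tau=self.tau, hard=True)
            return sample[..., 0]        # mass assigned to the "keep" option
        # Deterministic mask at evaluation time.
        return (self.mask_logits[..., 0] > self.mask_logits[..., 1]).float()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.linear.weight * self.mask(), self.linear.bias)

# Usage: the expected "keep" probability can serve as a sparsity penalty.
layer = GumbelMaskedLinear(784, 256)
x = torch.randn(8, 784)
y = layer(x)
sparsity_loss = torch.softmax(layer.mask_logits, dim=-1)[..., 0].mean()
```

Scaling `sparsity_loss` and adding it to the task loss pushes the mask logits toward "drop", which is one common way such learned masks are driven toward sparsity.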
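Similarly, the N:M fine-grained structured sparsity entry can be illustrated with a small projection step: in every group of M consecutive weights, only the N largest-magnitude entries are kept (2:4 being the pattern supported by recent hardware). This is a hedged sketch of the constraint itself, not of that paper's from-scratch training procedure; the function name `project_n_m` is an assumption.

```python
import torch

def project_n_m(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Project a 2-D weight matrix onto the N:M sparsity pattern.

    In every group of `m` consecutive weights along the input dimension,
    only the `n` entries with the largest magnitude are kept.
    """
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dimension must be divisible by m"
    groups = weight.reshape(out_features, in_features // m, m)
    # Rank entries inside each group by magnitude and keep the top-n.
    idx = groups.abs().topk(n, dim=-1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, idx, 1.0)
    return (groups * mask).reshape(out_features, in_features)

# Usage: enforce 2:4 sparsity on a random weight matrix.
w = torch.randn(256, 784)
w_sparse = project_n_m(w, n=2, m=4)
assert torch.all((w_sparse.reshape(256, -1, 4) != 0).sum(dim=-1) <= 2)
```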