Convolutional Neural Network Simplification with Progressive Retraining
- URL: http://arxiv.org/abs/2101.04699v1
- Date: Tue, 12 Jan 2021 19:05:42 GMT
- Title: Convolutional Neural Network Simplification with Progressive Retraining
- Authors: D. Osaku, J.F. Gomes, A.X. Falcão
- Abstract summary: Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models.
We present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models. However, the effectiveness of a simplified model often falls below that of the original one. In this letter, we present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion. During the process, a CNN model is retrained only when the current layer is entirely simplified, by adjusting the weights from the next layer back to the first one while preserving the weights of subsequent layers not involved in the process. We call this strategy \emph{progressive retraining}, in contrast to kernel pruning methods that usually retrain the entire model after each simplification action -- e.g., the elimination of one or a few kernels. Our subjective relevance criterion exploits the human ability to recognize visual patterns and improves the designer's understanding of the simplification process. The combination of suitable relevance criteria and progressive retraining shows that our methods can increase effectiveness with considerable model simplification. We also demonstrate that our methods provide better results than two popular methods and one state-of-the-art method on four challenging image datasets.
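The layer-by-layer procedure lends itself to a short sketch. The code below is a minimal PyTorch illustration assuming a flat nn.Sequential CNN: pruned kernels are only zeroed (the paper removes them structurally, which also shrinks the next layer's input channels), and the L1 relevance score, keep ratio, and retraining schedule are placeholders rather than the paper's objective and subjective criteria.

```python
# Minimal sketch of layer-by-layer kernel elimination with progressive
# retraining, assuming a flat nn.Sequential CNN.  The L1 relevance score,
# keep ratio, and schedule are illustrative placeholders only.
import torch
import torch.nn as nn

def kernel_relevance(conv: nn.Conv2d) -> torch.Tensor:
    # Placeholder objective criterion: one L1 score per output kernel.
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def set_trainable_prefix(model: nn.Sequential, boundary: nn.Module) -> None:
    # Layers up to and including `boundary` are adjusted; later layers
    # keep their weights, as in progressive retraining.
    passed = False
    for child in model:
        for p in child.parameters():
            p.requires_grad = not passed
        if child is boundary:
            passed = True

def progressive_retraining(model, loader, keep_ratio=0.5,
                           epochs_per_layer=2, lr=1e-3, device="cpu"):
    model.to(device)
    convs = [m for m in model if isinstance(m, nn.Conv2d)]
    criterion = nn.CrossEntropyLoss()
    for i, conv in enumerate(convs):
        # 1) Simplify the current layer entirely before any retraining.
        scores = kernel_relevance(conv)
        keep = max(1, int(keep_ratio * scores.numel()))
        drop = scores.argsort()[:-keep]                # least relevant kernels
        with torch.no_grad():
            conv.weight[drop] = 0.0                    # zeroed here; removed
            if conv.bias is not None:                  # structurally in the paper
                conv.bias[drop] = 0.0
        # Keep the eliminated kernels at zero during later fine-tuning.
        conv.weight.register_hook(lambda g, d=drop: g.index_fill(0, d, 0.0))

        # 2) Retrain from the next layer back to the first; freeze the rest.
        boundary = convs[i + 1] if i + 1 < len(convs) else list(model)[-1]
        set_trainable_prefix(model, boundary)
        opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                              lr=lr, momentum=0.9)
        model.train()
        for _ in range(epochs_per_layer):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                criterion(model(x), y).backward()
                opt.step()
    return model
```

The key point of the sketch is the schedule: each layer is simplified entirely, then only the prefix up to the next layer is fine-tuned while deeper layers keep their weights.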
Related papers
- Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme [0.0]
We introduce a novel yet straightforward neural network initialization scheme.
Inspired by the concept of emergence and leveraging the emergence measures proposed by Li (2023), our method adjusts layer-wise weight scaling factors to achieve higher emergence values.
We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization.
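As a purely hypothetical illustration of layer-wise weight scaling at initialization (the emergence measure of Li (2023) is not reproduced here, and scale_fn is an invented placeholder schedule, not the paper's tuned factors):

```python
# Hypothetical sketch: standard init followed by a layer-wise scale factor.
import torch
import torch.nn as nn

def scaled_init(model: nn.Module, scale_fn=lambda depth: 1.0 + 0.1 * depth):
    depth = 0
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
            with torch.no_grad():
                m.weight.mul_(scale_fn(depth))   # hypothetical scale schedule
            if m.bias is not None:
                nn.init.zeros_(m.bias)
            depth += 1
    return model
```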
arXiv Detail & Related papers (2024-07-26T18:56:47Z)
- Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks [15.691263438655842]
Spiking Neural Network (SNN) is a biologically inspired neural network infrastructure that has recently garnered significant attention.
Training an SNN directly poses a challenge due to the undefined gradient of the firing spike process.
We propose a shortcut back-propagation method in our paper, which advocates for transmitting the gradient directly from the loss to the shallow layers.
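One possible reading of this idea, sketched below under assumptions not taken from the paper: auxiliary heads attached to shallow blocks let the loss send gradients straight to them, instead of only through the full depth. The spiking dynamics and the paper's actual architecture are abstracted away; block and head names are hypothetical.

```python
# Sketch of a gradient "shortcut" via an auxiliary head on a shallow block.
import torch.nn as nn

class ShortcutNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(32, num_classes)      # main classifier
        self.aux_head = nn.Linear(16, num_classes)  # shortcut from block1

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        main = self.head(self.pool(h2).flatten(1))
        aux = self.aux_head(self.pool(h1).flatten(1))   # gradient shortcut
        return main, aux

def shortcut_loss(outputs, target, aux_weight=0.3):
    main, aux = outputs
    ce = nn.functional.cross_entropy
    # The auxiliary term backpropagates directly into the shallow block.
    return ce(main, target) + aux_weight * ce(aux, target)
```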
arXiv Detail & Related papers (2024-01-09T10:54:41Z)
- Enhancing Surface Neural Implicits with Curvature-Guided Sampling and Uncertainty-Augmented Representations [37.42624848693373]
We introduce a method that directly digests depth images for the task of high-fidelity 3D reconstruction.
A simple sampling strategy is proposed to generate highly effective training data.
Despite its simplicity, our method outperforms a range of both classical and learning-based baselines.
arXiv Detail & Related papers (2023-06-03T12:23:17Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
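For reference, the listed setting corresponds to a generic two-layer ReLU regression model of the following form (notation assumed, not taken from the paper):

```latex
% Generic two-layer ReLU regression setup (notation assumed, not the paper's):
\[
  f(x; W, a) = \sum_{r=1}^{m} a_r\, \sigma\!\left(w_r^{\top} x\right), \qquad
  \sigma(t) = \max(t, 0), \qquad x_i \sim \mathcal{N}(0, I_d),
\]
\[
  L(W) = \tfrac{1}{2} \sum_{i=1}^{n} \bigl( f(x_i; W, a) - y_i \bigr)^{2}, \qquad
  w_r^{(t+1)} = w_r^{(t)} - \eta\, \nabla_{w_r} L\bigl(W^{(t)}\bigr).
\]
```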
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
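For context, the singular values being controlled can be computed exactly for a stride-1, circularly padded convolution via the FFT-based construction of earlier work (Sedghi et al., 2019); the sketch below illustrates that quantity, not the method proposed in this paper.

```python
# Exact singular values of a circularly padded, stride-1 convolution.
import torch

def conv_singular_values(kernel: torch.Tensor, input_size: int) -> torch.Tensor:
    """kernel: (c_out, c_in, k, k); returns all singular values of the linear
    map realized by the convolution on input_size x input_size feature maps."""
    # Per-frequency transfer matrices: 2-D FFT of the zero-padded kernel.
    transfer = torch.fft.fft2(kernel, s=(input_size, input_size))  # (c_out, c_in, n, n)
    transfer = transfer.permute(2, 3, 0, 1)                        # (n, n, c_out, c_in)
    # The layer's spectrum is the union of per-frequency singular values.
    return torch.linalg.svdvals(transfer).flatten()

# Example: spectral norm (largest singular value) of a random 3x3 conv.
if __name__ == "__main__":
    K = torch.randn(16, 8, 3, 3)
    print(conv_singular_values(K, input_size=32).max())
```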
arXiv Detail & Related papers (2022-11-24T19:09:44Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Meta Adversarial Perturbations [66.43754467275967]
We show the existence of a meta adversarial perturbation (MAP).
A MAP causes natural images to be misclassified with high probability after only a one-step gradient ascent update.
We show that these perturbations are not only image-agnostic, but also model-agnostic, as a single perturbation generalizes well across unseen data points and different neural network architectures.
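A rough sketch of the one-step update implied by the summary is given below; training of the meta perturbation itself is omitted, and the step size and perturbation budget are arbitrary placeholders, not values from the paper.

```python
# One gradient-ascent step starting from a universal ("meta") perturbation.
import torch
import torch.nn.functional as F

def one_step_update(model, x, y, meta_delta, step=1.0 / 255, eps=8.0 / 255):
    delta = meta_delta.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    with torch.no_grad():
        delta = (delta + step * grad.sign()).clamp(-eps, eps)  # one ascent step
    return (x + delta).clamp(0, 1)                             # adversarial batch
```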
arXiv Detail & Related papers (2021-11-19T16:01:45Z)
- Initialization and Regularization of Factorized Neural Layers [23.875225732697142]
We show how to initialize and regularize factorized layers in deep nets.
We show how these schemes lead to improved performance on both translation and unsupervised pre-training.
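One common recipe consistent with this summary is spectral initialization of a factorized layer W ≈ UV, sketched below; whether this matches the paper's exact scheme should be checked against the source.

```python
# Spectral initialization of a rank-r factorized linear layer W ≈ U V.
import torch
import torch.nn as nn

def spectral_init_factors(out_features: int, in_features: int, rank: int):
    full = torch.empty(out_features, in_features)
    nn.init.kaiming_uniform_(full, a=5 ** 0.5)       # usual dense init
    U, S, Vh = torch.linalg.svd(full, full_matrices=False)
    sqrt_s = S[:rank].sqrt()
    U_factor = U[:, :rank] * sqrt_s                  # (out, r)
    V_factor = sqrt_s[:, None] * Vh[:rank]           # (r, in)
    return nn.Parameter(U_factor), nn.Parameter(V_factor)

# Usage: y = x @ V_factor.T @ U_factor.T reproduces a rank-r approximation
# of the dense layer at initialization.
```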
arXiv Detail & Related papers (2021-05-03T17:28:07Z)
- Adaptive Signal Variances: CNN Initialization Through Modern Architectures [0.7646713951724012]
Deep convolutional neural networks (CNNs) have earned unwavering confidence in their performance on image processing tasks.
CNN practitioners widely understand that the stability of learning depends on how the model parameters in each layer are initialized.
arXiv Detail & Related papers (2020-08-16T11:26:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.