Convolutional Neural Network Simplification with Progressive Retraining
- URL: http://arxiv.org/abs/2101.04699v1
- Date: Tue, 12 Jan 2021 19:05:42 GMT
- Title: Convolutional Neural Network Simplification with Progressive Retraining
- Authors: D. Osaku, J.F. Gomes, A.X. Falcão
- Abstract summary: Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models.
We present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models. However, the effectiveness of a simplified model often falls below that of the original one. In this letter, we present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion. During the process, a CNN model is retrained only when the current layer is entirely simplified, by adjusting the weights from the next layer back to the first one while preserving the weights of subsequent layers not involved in the process. We call this strategy \emph{progressive retraining}, in contrast to kernel pruning methods that usually retrain the entire model after each simplification action -- e.g., the elimination of one or a few kernels. Our subjective relevance criterion exploits the human ability to recognize visual patterns and improves the designer's understanding of the simplification process. The combination of suitable relevance criteria and progressive retraining shows that our methods can increase effectiveness with considerable model simplification. We also demonstrate that our methods provide better results than two popular methods and one state-of-the-art method on four challenging image datasets.
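The layer-by-layer procedure lends itself to a short sketch. The code below is a minimal PyTorch illustration assuming a flat nn.Sequential CNN: pruned kernels are only zeroed (the paper removes them structurally, which also shrinks the next layer's input channels), and the L1 relevance score, keep ratio, and retraining schedule are placeholders rather than the paper's objective and subjective criteria.

```python
# Minimal sketch of layer-by-layer kernel elimination with progressive
# retraining, assuming a flat nn.Sequential CNN.  The L1 relevance score,
# keep ratio, and schedule are illustrative placeholders only.
import torch
import torch.nn as nn

def kernel_relevance(conv: nn.Conv2d) -> torch.Tensor:
    # Placeholder objective criterion: one L1 score per output kernel.
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def set_trainable_prefix(model: nn.Sequential, boundary: nn.Module) -> None:
    # Layers up to and including `boundary` are adjusted; later layers
    # keep their weights, as in progressive retraining.
    passed = False
    for child in model:
        for p in child.parameters():
            p.requires_grad = not passed
        if child is boundary:
            passed = True

def progressive_retraining(model, loader, keep_ratio=0.5,
                           epochs_per_layer=2, lr=1e-3, device="cpu"):
    model.to(device)
    convs = [m for m in model if isinstance(m, nn.Conv2d)]
    criterion = nn.CrossEntropyLoss()
    for i, conv in enumerate(convs):
        # 1) Simplify the current layer entirely before any retraining.
        scores = kernel_relevance(conv)
        keep = max(1, int(keep_ratio * scores.numel()))
        drop = scores.argsort()[:-keep]                # least relevant kernels
        with torch.no_grad():
            conv.weight[drop] = 0.0                    # zeroed here; removed
            if conv.bias is not None:                  # structurally in the paper
                conv.bias[drop] = 0.0
        # Keep the eliminated kernels at zero during later fine-tuning.
        conv.weight.register_hook(lambda g, d=drop: g.index_fill(0, d, 0.0))

        # 2) Retrain from the next layer back to the first; freeze the rest.
        boundary = convs[i + 1] if i + 1 < len(convs) else list(model)[-1]
        set_trainable_prefix(model, boundary)
        opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                              lr=lr, momentum=0.9)
        model.train()
        for _ in range(epochs_per_layer):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                criterion(model(x), y).backward()
                opt.step()
    return model
```

The key point of the sketch is the schedule: each layer is simplified entirely, then only the prefix up to the next layer is fine-tuned while deeper layers keep their weights.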
Related papers
- Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme [0.0]
We introduce a novel yet straightforward neural network initialization scheme.
Inspired by the concept of emergence and leveraging the emergence measures proposed by Li (2023), our method adjusts layer-wise weight scaling factors to achieve higher emergence values.
We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization.
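As a purely hypothetical illustration of layer-wise weight scaling at initialization (the emergence measure of Li (2023) is not reproduced here, and scale_fn is an invented placeholder schedule, not the paper's tuned factors):

```python
# Hypothetical sketch: standard init followed by a layer-wise scale factor.
import torch
import torch.nn as nn

def scaled_init(model: nn.Module, scale_fn=lambda depth: 1.0 + 0.1 * depth):
    depth = 0
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
            with torch.no_grad():
                m.weight.mul_(scale_fn(depth))   # hypothetical scale schedule
            if m.bias is not None:
                nn.init.zeros_(m.bias)
            depth += 1
    return model
```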
arXiv Detail & Related papers (2024-07-26T18:56:47Z)
- Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks [15.691263438655842]
Spiking Neural Network (SNN) is a biologically inspired neural network infrastructure that has recently garnered significant attention.
Training an SNN directly poses a challenge due to the undefined gradient of the firing spike process.
We propose a shortcut back-propagation method in our paper, which advocates for transmitting the gradient directly from the loss to the shallow layers.
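One possible reading of this idea, sketched below under assumptions not taken from the paper: auxiliary heads attached to shallow blocks let the loss send gradients straight to them, instead of only through the full depth. The spiking dynamics and the paper's actual architecture are abstracted away; block and head names are hypothetical.

```python
# Sketch of a gradient "shortcut" via an auxiliary head on a shallow block.
import torch.nn as nn

class ShortcutNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(32, num_classes)      # main classifier
        self.aux_head = nn.Linear(16, num_classes)  # shortcut from block1

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        main = self.head(self.pool(h2).flatten(1))
        aux = self.aux_head(self.pool(h1).flatten(1))   # gradient shortcut
        return main, aux

def shortcut_loss(outputs, target, aux_weight=0.3):
    main, aux = outputs
    ce = nn.functional.cross_entropy
    # The auxiliary term backpropagates directly into the shallow block.
    return ce(main, target) + aux_weight * ce(aux, target)
```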
arXiv Detail & Related papers (2024-01-09T10:54:41Z)
- Enhancing Surface Neural Implicits with Curvature-Guided Sampling and Uncertainty-Augmented Representations [37.42624848693373]
We introduce a method that directly digests depth images for the task of high-fidelity 3D reconstruction.
A simple sampling strategy is proposed to generate highly effective training data.
Despite its simplicity, our method outperforms a range of both classical and learning-based baselines.
arXiv Detail & Related papers (2023-06-03T12:23:17Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
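For reference, the listed setting corresponds to a generic two-layer ReLU regression model of the following form (notation assumed, not taken from the paper):

```latex
% Generic two-layer ReLU regression setup (notation assumed, not the paper's):
\[
  f(x; W, a) = \sum_{r=1}^{m} a_r\, \sigma\!\left(w_r^{\top} x\right), \qquad
  \sigma(t) = \max(t, 0), \qquad x_i \sim \mathcal{N}(0, I_d),
\]
\[
  L(W) = \tfrac{1}{2} \sum_{i=1}^{n} \bigl( f(x_i; W, a) - y_i \bigr)^{2}, \qquad
  w_r^{(t+1)} = w_r^{(t)} - \eta\, \nabla_{w_r} L\bigl(W^{(t)}\bigr).
\]
```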
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
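For context, the singular values being controlled can be computed exactly for a stride-1, circularly padded convolution via the FFT-based construction of earlier work (Sedghi et al., 2019); the sketch below illustrates that quantity, not the method proposed in this paper.

```python
# Exact singular values of a circularly padded, stride-1 convolution.
import torch

def conv_singular_values(kernel: torch.Tensor, input_size: int) -> torch.Tensor:
    """kernel: (c_out, c_in, k, k); returns all singular values of the linear
    map realized by the convolution on input_size x input_size feature maps."""
    # Per-frequency transfer matrices: 2-D FFT of the zero-padded kernel.
    transfer = torch.fft.fft2(kernel, s=(input_size, input_size))  # (c_out, c_in, n, n)
    transfer = transfer.permute(2, 3, 0, 1)                        # (n, n, c_out, c_in)
    # The layer's spectrum is the union of per-frequency singular values.
    return torch.linalg.svdvals(transfer).flatten()

# Example: spectral norm (largest singular value) of a random 3x3 conv.
if __name__ == "__main__":
    K = torch.randn(16, 8, 3, 3)
    print(conv_singular_values(K, input_size=32).max())
```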
arXiv Detail & Related papers (2022-11-24T19:09:44Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Meta Adversarial Perturbations [66.43754467275967]
We show the existence of a meta adversarial perturbation (MAP).
A MAP causes natural images to be misclassified with high probability after only a one-step gradient ascent update.
We show that these perturbations are not only image-agnostic, but also model-agnostic, as a single perturbation generalizes well across unseen data points and different neural network architectures.
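A rough sketch of the one-step update implied by the summary is given below; training of the meta perturbation itself is omitted, and the step size and perturbation budget are arbitrary placeholders, not values from the paper.

```python
# One gradient-ascent step starting from a universal ("meta") perturbation.
import torch
import torch.nn.functional as F

def one_step_update(model, x, y, meta_delta, step=1.0 / 255, eps=8.0 / 255):
    delta = meta_delta.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    with torch.no_grad():
        delta = (delta + step * grad.sign()).clamp(-eps, eps)  # one ascent step
    return (x + delta).clamp(0, 1)                             # adversarial batch
```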
arXiv Detail & Related papers (2021-11-19T16:01:45Z)
- Initialization and Regularization of Factorized Neural Layers [23.875225732697142]
We show how to initialize and regularize factorized layers in deep nets.
We show how these schemes lead to improved performance on both translation and unsupervised pre-training.
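One common recipe consistent with this summary is spectral initialization of a factorized layer W ≈ UV, sketched below; whether this matches the paper's exact scheme should be checked against the source.

```python
# Spectral initialization of a rank-r factorized linear layer W ≈ U V.
import torch
import torch.nn as nn

def spectral_init_factors(out_features: int, in_features: int, rank: int):
    full = torch.empty(out_features, in_features)
    nn.init.kaiming_uniform_(full, a=5 ** 0.5)       # usual dense init
    U, S, Vh = torch.linalg.svd(full, full_matrices=False)
    sqrt_s = S[:rank].sqrt()
    U_factor = U[:, :rank] * sqrt_s                  # (out, r)
    V_factor = sqrt_s[:, None] * Vh[:rank]           # (r, in)
    return nn.Parameter(U_factor), nn.Parameter(V_factor)

# Usage: y = x @ V_factor.T @ U_factor.T reproduces a rank-r approximation
# of the dense layer at initialization.
```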
arXiv Detail & Related papers (2021-05-03T17:28:07Z)
- Adaptive Signal Variances: CNN Initialization Through Modern Architectures [0.7646713951724012]
Deep convolutional neural networks (CNNs) have earned unwavering confidence in their performance on image processing tasks.
CNN practitioners widely understand that the stability of learning depends on how the model parameters in each layer are initialized.
arXiv Detail & Related papers (2020-08-16T11:26:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.