Related papers: Variational Depth Search in ResNets

URL: http://arxiv.org/abs/2002.02797v4
Date: Wed, 1 Apr 2020 17:59:13 GMT
Title: Variational Depth Search in ResNets
Authors: Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato
Abstract summary: One-shot neural architecture search allows joint learning of weights and network architecture, reducing computational cost. We limit our search space to the depth of residual networks and formulate an analytically tractable variational objective that allows for an unbiased approximate posterior over depths in one-shot. We compare our proposed method against manual search over network depths on the MNIST, Fashion-MNIST, SVHN datasets.
Score: 2.6763498831034043
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: One-shot neural architecture search allows joint learning of weights and network architecture, reducing computational cost. We limit our search space to the depth of residual networks and formulate an analytically tractable variational objective that allows for obtaining an unbiased approximate posterior over depths in one-shot. We propose a heuristic to prune our networks based on this distribution. We compare our proposed method against manual search over network depths on the MNIST, Fashion-MNIST, SVHN datasets. We find that pruned networks do not incur a loss in predictive performance, obtaining accuracies competitive with unpruned networks. Marginalising over depth allows us to obtain better-calibrated test-time uncertainty estimates than regular networks, in a single forward pass.
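Illustrative sketch (not from the paper): the abstract describes a variational objective over network depth and a depth-marginalised prediction. Assuming a categorical approximate posterior q(d) over depths and a per-depth log-likelihood log p(y | x, d), one common form of such an objective is the expected log-likelihood under q minus KL(q || prior). The function names, the softmax parameterisation of q, and the toy numbers below are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Turn unnormalised logits into a categorical distribution q(d)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def depth_elbo(log_liks, q_logits, prior):
    """Assumed objective: sum_d q(d) * log p(y | x, d) - KL(q || prior).

    log_liks[d] is the log-likelihood of the data when the network is
    truncated at depth d (hypothetical input; any model could supply it).
    """
    q = softmax(q_logits)
    expected_ll = sum(qd * ll for qd, ll in zip(q, log_liks))
    kl = sum(qd * math.log(qd / pd) for qd, pd in zip(q, prior) if qd > 0)
    return expected_ll - kl


def marginal_prediction(per_depth_probs, q):
    """Average class probabilities over depths, weighted by q(d)."""
    n_classes = len(per_depth_probs[0])
    return [
        sum(qd * probs[c] for qd, probs in zip(q, per_depth_probs))
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Toy values: three candidate depths, two classes.
    prior = [1 / 3, 1 / 3, 1 / 3]
    q_logits = [0.0, 1.0, 2.0]
    log_liks = [-1.2, -0.7, -0.9]
    print("ELBO:", depth_elbo(log_liks, q_logits, prior))
    per_depth = [[0.6, 0.4], [0.8, 0.2], [0.7, 0.3]]
    print("Marginal:", marginal_prediction(per_depth, softmax(q_logits)))
```

Because the sum runs over a finite set of depths, both quantities are computed exactly rather than estimated by sampling, which is one reading of "analytically tractable" in the abstract.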

Related papers

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures. Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve higher parameter efficiency for network expressivity.
arXiv Detail & Related papers (2025-06-20T17:54:24Z)
Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for sparse recovery [87.28082715343896]
We consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications. We design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem. The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems.
arXiv Detail & Related papers (2021-10-20T06:15:45Z)
Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo [105.05232615226602]
We leverage a differentiable neural architecture search (NAS) strategy to find an uncalibrated PS architecture automatically. Experiments on the DiLiGenT dataset show that the performance of the automatically searched neural architectures compares favorably with state-of-the-art uncalibrated PS methods.
arXiv Detail & Related papers (2021-10-11T21:22:17Z)
Semi-supervised Network Embedding with Differentiable Deep Quantisation [81.49184987430333]
We develop d-SNEQ, a differentiable quantisation method for network embedding. d-SNEQ incorporates a rank loss to equip the learned quantisation codes with rich high-order information. It is able to substantially compress the size of trained embeddings, thus reducing storage footprint and accelerating retrieval speed.
arXiv Detail & Related papers (2021-08-20T11:53:05Z)
Combined Depth Space based Architecture Search For Person Re-identification [70.86236888223569]
We aim to design a lightweight and suitable network for person re-identification (ReID). We propose a novel search space called Combined Depth Space (CDS), based on which we search for an efficient network architecture, which we call CDNet. We then propose a low-cost search strategy, named the Top-k Sample Search strategy, to make full use of the search space and avoid being trapped in a local optimum.
arXiv Detail & Related papers (2021-04-09T02:40:01Z)
Uncertainty Quantification in Deep Residual Neural Networks [0.0]
Uncertainty quantification is an important and challenging problem in deep learning. Previous methods rely on dropout layers, which are not present in modern deep architectures, or on batch normalization, which is sensitive to batch sizes. We show that training residual networks using stochastic depth can be interpreted as a variational approximation to the posterior over the weights in neural networks.
arXiv Detail & Related papers (2020-07-09T16:05:37Z)
ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks. Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
Depth Uncertainty in Neural Networks [2.6763498831034043]
Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes. By exploiting the sequential structure of feed-forward networks, we are able to both evaluate our training objective and make predictions with a single forward pass. We validate our approach on real-world regression and image classification tasks.
arXiv Detail & Related papers (2020-06-15T14:33:40Z)
Compact Neural Representation Using Attentive Network Pruning [1.0152838128195465]
We describe a Top-Down attention mechanism that is added to a Bottom-Up feedforward network to select important connections and subsequently prune redundant ones at all parametric layers. Our method not only introduces a novel hierarchical selection mechanism as the basis of pruning but also remains competitive with previous baseline methods in the experimental evaluation.
arXiv Detail & Related papers (2020-05-10T03:20:01Z)
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth [19.866928507243617]
Training deep neural networks with stochastic gradient descent (SGD) can often achieve zero training loss on real-world landscapes. We propose a new limit of infinitely deep residual networks, which enjoys a good training landscape in the sense that every local minimizer is global.
arXiv Detail & Related papers (2020-03-11T20:14:47Z)
On Random Kernels of Residual Architectures [93.94469470368988]
We derive finite width and depth corrections for the Neural Tangent Kernel (NTK) of ResNets and DenseNets. Our findings show that in ResNets, convergence to the NTK may occur when depth and width simultaneously tend to infinity. In DenseNets, however, convergence of the NTK to its limit as the width tends to infinity is guaranteed.
arXiv Detail & Related papers (2020-01-28T16:47:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.