Scale-invariant Gaussian derivative residual networks
- URL: http://arxiv.org/abs/2603.02843v1
- Date: Tue, 03 Mar 2026 10:39:41 GMT
- Title: Scale-invariant Gaussian derivative residual networks
- Authors: Andrzej Perzanowski, Tony Lindeberg
- Abstract summary: Generalisation across image scales remains a fundamental challenge for deep networks. We present provably scale-invariant Gaussian derivative residual networks (GaussDerResNets). We show that GaussDerResNets have strong scale generalisation and scale selection properties on rescaled datasets.
- Score: 4.554894288663752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalisation across image scales remains a fundamental challenge for deep networks, which often fail to handle images at scales not seen during training (the out-of-distribution problem). In this paper, we present provably scale-invariant Gaussian derivative residual networks (GaussDerResNets), constructed out of scale-covariant Gaussian derivative residual blocks coupled in cascade, aimed at addressing this problem. By adding residual skip connections to the previous notion of Gaussian derivative layers, deeper networks with substantially increased accuracy can be constructed, while preserving very good scale generalisation properties at this higher level of accuracy. Explicit proofs are provided regarding the underlying scale-covariant and scale-invariant properties in arbitrary dimensions. To analyse the ability of GaussDerResNets to generalise to new scales, we apply them to the new rescaled version of the STL-10 dataset, where training is done at a single fixed scale and evaluation is performed on multiple copies of the test set, each rescaled to a single distinct spatial scale, with scale factors extending over a range of 4. We also conduct similar systematic experiments on the rescaled versions of the Fashion-MNIST and CIFAR-10 datasets. Experimentally, we demonstrate that GaussDerResNets have strong scale generalisation and scale selection properties on all three rescaled datasets. In our ablation studies, we investigate different architectural variants of GaussDerResNets, demonstrating that basing the architecture on depthwise-separable convolutions allows for decreasing both the number of parameters and the amount of computation, with reasonably maintained accuracy and scale generalisation.
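To make the construction more concrete, the following is a minimal, hedged sketch of what one scale-covariant Gaussian derivative residual block could look like in PyTorch. This is not the authors' implementation: the module name GaussDerivResBlock, the choice of derivative orders, the depthwise-plus-1x1 (depthwise-separable) structure, and the placement of the skip connection are all assumptions made for illustration, loosely following the abstract's description of Gaussian derivative layers with residual skip connections.

```python
# Illustrative sketch only -- not the GaussDerResNets reference code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def gaussian_derivative_bank(sigma: float) -> torch.Tensor:
    """Stack of 2-D Gaussian derivative filters at scale sigma:
    smoothing, d/dx, d/dy, d2/dx2, d2/dxdy, d2/dy2 (separable outer products)."""
    radius = int(math.ceil(4.0 * sigma))
    x = torch.arange(-radius, radius + 1, dtype=torch.float32)
    g = torch.exp(-x ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
    gx = -(x / sigma ** 2) * g                          # 1-D first derivative
    gxx = (x ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * g  # 1-D second derivative
    bank = [torch.outer(g, g),    # smoothing
            torch.outer(g, gx),   # d/dx   (rows = y, cols = x)
            torch.outer(gx, g),   # d/dy
            torch.outer(g, gxx),  # d2/dx2
            torch.outer(gx, gx),  # d2/dxdy
            torch.outer(gxx, g)]  # d2/dy2
    return torch.stack(bank)      # shape (6, k, k)


class GaussDerivResBlock(nn.Module):
    """Fixed depthwise Gaussian-derivative filtering at one scale, followed by
    a learned 1x1 channel mixing, with a residual skip connection."""

    def __init__(self, channels: int, sigma: float):
        super().__init__()
        bank = gaussian_derivative_bank(sigma)                  # (6, k, k)
        k = bank.shape[-1]
        # Repeat the filter bank per input channel -> depthwise-separable layout.
        self.register_buffer("filters",
                             bank.repeat(channels, 1, 1).unsqueeze(1))  # (6C,1,k,k)
        self.channels, self.padding = channels, k // 2
        self.mix = nn.Conv2d(6 * channels, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Depthwise convolution: each input channel sees all six derivative filters.
        responses = F.conv2d(x, self.filters, padding=self.padding,
                             groups=self.channels)
        return self.act(x + self.mix(responses))  # residual skip connection


# Toy usage: one block at scale sigma = 1.5 on a random 32x32 image batch.
block = GaussDerivResBlock(channels=16, sigma=1.5)
out = block(torch.randn(2, 16, 32, 32))
print(out.shape)  # torch.Size([2, 16, 32, 32])
```

Scale covariance would presumably come from instantiating such blocks over a range of sigma values with shared 1x1 weights, and the scale selection behaviour from pooling over the resulting scale channels; the 2024-09-17 related paper below describes the scale-channel formulation for the earlier GaussDerNets.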
Related papers
- Non-Singularity of the Gradient Descent map for Neural Networks with Piecewise Analytic Activations [53.348574336527854]
We investigate the neural network map as a function on the space of weights and biases. We prove, for the first time, the non-singularity of the gradient descent (GD) map on the loss landscape of realistic neural network architectures.
arXiv Detail & Related papers (2025-10-28T14:34:33Z) - Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations [0.46040036610482665]
GaussDerNets are evaluated on new rescaled versions of the Fashion-MNIST and the CIFAR-10 datasets. We first experimentally demonstrate that the GaussDerNets have quite good scale generalisation properties on the new datasets. We also show that regularisation during training, by applying dropout across the scale channels, improves both the performance and the scale generalisation (a toy sketch of scale-channel dropout is given after this list).
arXiv Detail & Related papers (2024-09-17T12:51:04Z) - WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference.
Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner.
We introduce a model-driven WiNet that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across various scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z) - Asymptotics of Learning with Deep Structured (Random) Features [9.366617422860543]
For a large class of feature maps we provide a tight characterisation of the test error associated with learning the readout layer.
In some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
arXiv Detail & Related papers (2024-02-21T18:35:27Z) - Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z) - Differentially private training of residual networks with scale normalisation [64.60453677988517]
We investigate the optimal choice of replacement layer for Batch Normalisation (BN) in residual networks (ResNets).
We study the phenomenon of scale mixing in residual blocks, whereby the activations on the two branches are scaled differently.
arXiv Detail & Related papers (2022-03-01T09:56:55Z) - Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Scale-covariant and scale-invariant Gaussian derivative networks [0.0]
This paper presents a hybrid approach between scale-space theory and deep learning, where a deep learning architecture is constructed by coupling parameterized scale-space operations in cascade.
It is demonstrated that the resulting approach allows for scale generalization, enabling good performance for classifying patterns at scales not present in the training data.
arXiv Detail & Related papers (2020-11-30T13:15:10Z) - On the Predictability of Pruning Across Scales [29.94870276983399]
We show that the error of magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task (a toy curve-fitting illustration of such a law is given after this list).
As neural networks become ever larger and costlier to train, our findings suggest a framework for reasoning conceptually and analytically about a standard method for unstructured pruning.
arXiv Detail & Related papers (2020-06-18T15:41:46Z) - The Heavy-Tail Phenomenon in SGD [7.366405857677226]
We show that, depending on the structure of the Hessian of the loss at the minimum, the SGD iterates will converge to a heavy-tailed stationary distribution.
We translate our results into insights about the behavior of SGD in deep learning.
arXiv Detail & Related papers (2020-06-08T16:43:56Z)
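As promised above, here is a toy sketch of dropout across scale channels, in the spirit of the regularisation mentioned in the 2024-09-17 entry. The (batch, scales, channels, height, width) tensor layout, the function name, and the dropout rate are assumptions for illustration, not the paper's code.

```python
# Toy sketch: dropout over whole scale channels (illustrative assumptions only).
import torch


def scale_channel_dropout(x: torch.Tensor, p: float = 0.25,
                          training: bool = True) -> torch.Tensor:
    """x has shape (batch, num_scales, channels, height, width).
    During training, zero out entire scale channels with probability p,
    rescaling the survivors as in inverted dropout."""
    if not training or p <= 0.0:
        return x
    keep = (torch.rand(x.shape[0], x.shape[1], 1, 1, 1,
                       device=x.device) >= p).to(x.dtype)
    return x * keep / (1.0 - p)


# Toy usage: 4 scale channels; on average one of them is dropped per sample.
y = scale_channel_dropout(torch.randn(8, 4, 16, 32, 32), p=0.25)
```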
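Similarly, for the 2020-06-18 pruning entry, a toy illustration of fitting a power-law-plus-floor scaling law to error-versus-density measurements. The functional form and all numbers below are invented for illustration and are not taken from the paper.

```python
# Toy illustration: fit err(d) ~ c * d**(-gamma) + floor to made-up data.
import numpy as np
from scipy.optimize import curve_fit


def scaling_law(density, c, gamma, floor):
    # density = fraction of weights kept after magnitude pruning
    return c * density ** (-gamma) + floor


density = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])    # kept-weight fractions
error = np.array([0.080, 0.082, 0.088, 0.101, 0.135])  # hypothetical test errors

(c, gamma, floor), _ = curve_fit(scaling_law, density, error,
                                 p0=(0.005, 1.0, 0.075))
print(f"c={c:.4f}, gamma={gamma:.2f}, floor={floor:.4f}")
```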