From CNNs to Shift-Invariant Twin Models Based on Complex Wavelets
- URL: http://arxiv.org/abs/2212.00394v3
- Date: Fri, 31 May 2024 15:08:21 GMT
- Title: From CNNs to Shift-Invariant Twin Models Based on Complex Wavelets
- Authors: Hubert Leterme, Kévin Polisano, Valérie Perrier, Karteek Alahari
- Abstract summary: We replace the first-layer combination "real-valued convolutions + max pooling" (RMax) with "complex-valued convolutions + modulus" (CMod).
We claim that CMod and RMax produce comparable outputs when the convolution kernel is band-pass and oriented.
Our approach achieves superior accuracy on ImageNet and CIFAR-10 classification tasks.
- Score: 7.812210699650151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel method to increase shift invariance and prediction accuracy in convolutional neural networks. Specifically, we replace the first-layer combination "real-valued convolutions + max pooling" (RMax) by "complex-valued convolutions + modulus" (CMod), which is stable to translations, or shifts. To justify our approach, we claim that CMod and RMax produce comparable outputs when the convolution kernel is band-pass and oriented (Gabor-like filter). In this context, CMod can therefore be considered as a stable alternative to RMax. To enforce this property, we constrain the convolution kernels to adopt such a Gabor-like structure. The corresponding architecture is called mathematical twin, because it employs a well-defined mathematical operator to mimic the behavior of the original, freely-trained model. Our approach achieves superior accuracy on ImageNet and CIFAR-10 classification tasks, compared to prior methods based on low-pass filtering. Arguably, our approach's emphasis on retaining high-frequency details contributes to a better balance between shift invariance and information preservation, resulting in improved performance. Furthermore, it has a lower computational cost and memory footprint than concurrent work, making it a promising solution for practical implementation.
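To make the substitution concrete, here is a minimal PyTorch sketch of the two first-layer operators. This is a sketch under our own assumptions, not the authors' released code: `rmax`, `cmod`, `w_real`, and `w_imag` are illustrative names, and the max-pooling stride is folded into the complex convolution so both branches subsample at the same overall rate.

```python
import torch
import torch.nn.functional as F

def rmax(x, w_real, stride=2, pool=2):
    # RMax: real-valued convolution followed by max pooling.
    y = F.conv2d(x, w_real, stride=stride, padding=w_real.shape[-1] // 2)
    return F.max_pool2d(y, kernel_size=pool)

def cmod(x, w_real, w_imag, stride=2, pool=2):
    # CMod: complex-valued convolution followed by a pointwise modulus.
    # The pooling stride is folded into the convolution stride, so the
    # output resolution matches RMax. In the paper's setting, w_imag is
    # the quadrature (Hilbert-like) counterpart of a Gabor-like w_real.
    p = w_real.shape[-1] // 2
    re = F.conv2d(x, w_real, stride=stride * pool, padding=p)
    im = F.conv2d(x, w_imag, stride=stride * pool, padding=p)
    return torch.sqrt(re ** 2 + im ** 2)

x = torch.randn(1, 3, 64, 64)
w_r, w_i = torch.randn(32, 3, 7, 7), torch.randn(32, 3, 7, 7)
print(rmax(x, w_r).shape, cmod(x, w_r, w_i).shape)  # both (1, 32, 16, 16)
```

The abstract's claim is that, for band-pass and oriented (Gabor-like) w_real, the two outputs are close, while the CMod output varies far less under small input shifts.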
Related papers
- GMConv: Modulating Effective Receptive Fields for Convolutional Kernels [52.50351140755224]
In convolutional neural networks, convolutions are performed using a square kernel with a fixed N × N receptive field (RF).
Inspired by the property that effective receptive fields (ERFs) typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
Our GMConv can directly replace the standard convolutions in existing CNNs and can be easily trained end-to-end by standard back-propagation.
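Reading the summary above, the mechanism appears to be a standard kernel modulated elementwise by a learnable Gaussian mask. The following PyTorch sketch is our own reconstruction of that idea (class and parameter names are illustrative), not the GMConv reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianMaskConv2d(nn.Module):
    """Our reconstruction of the GMConv idea: a standard kernel modulated
    elementwise by a learnable Gaussian mask, which concentrates the
    effective receptive field around the kernel center."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.log_sigma = nn.Parameter(torch.zeros(out_ch))  # one width per filter
        coords = torch.arange(k, dtype=torch.float) - (k - 1) / 2
        yy, xx = torch.meshgrid(coords, coords, indexing="ij")
        self.register_buffer("r2", xx ** 2 + yy ** 2)       # squared radius grid

    def forward(self, x):
        sigma = self.log_sigma.exp().view(-1, 1, 1, 1)
        mask = torch.exp(-self.r2 / (2 * sigma ** 2))       # (out_ch, 1, k, k)
        return F.conv2d(x, self.weight * mask, padding=self.weight.shape[-1] // 2)

conv = GaussianMaskConv2d(3, 16, 7)
print(conv(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```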
arXiv Detail & Related papers (2023-02-09T10:17:17Z)
- A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
We show that simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
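The scheduling change, as we read this summary, amounts to updating the weights at every inference step rather than once after the latents settle. The toy single-layer sketch below is our own illustration, with made-up names and dimensions, contrasting the two schedules:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.tanh

def train_step(x0, W, T=50, lr_x=0.1, lr_w=0.01, incremental=True):
    # One generative layer: the input x0 is predicted from the latent x1
    # as W f(x1); the energy E = 0.5 * ||x0 - W f(x1)||^2 is minimized
    # over x1 (inference) and over W (learning).
    x1 = np.zeros(W.shape[1])
    for _ in range(T):
        e0 = x0 - W @ f(x1)                          # prediction error
        x1 += lr_x * (1 - f(x1) ** 2) * (W.T @ e0)   # latent inference step
        if incremental:                              # new scheduling: weight
            W = W + lr_w * np.outer(e0, f(x1))       # update at every step
    if not incremental:                              # classic scheduling:
        e0 = x0 - W @ f(x1)                          # one update after the
        W = W + lr_w * np.outer(e0, f(x1))           # latents have settled
    return W

W = rng.standard_normal((16, 8)) * 0.1
W = train_step(rng.standard_normal(16), W, incremental=True)
```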
arXiv Detail & Related papers (2022-11-16T00:11:04Z)
- Federated Learning Using Variance Reduced Stochastic Gradient for Probabilistically Activated Agents [0.0]
This paper proposes a two-layer Federated Learning (FL) algorithm that achieves both variance reduction and a faster convergence rate to an optimal solution, in the setting where each agent has an arbitrary probability of being selected in each iteration.
arXiv Detail & Related papers (2022-10-25T22:04:49Z)
- On the Shift Invariance of Max Pooling Feature Maps in Convolutional Neural Networks [0.0]
Subsampled convolutions with Gabor-like filters are prone to aliasing, causing sensitivity to small input shifts.
We highlight the crucial role played by the filter's frequency and orientation in achieving stability.
We experimentally validate our theory by considering a deterministic feature extractor based on the dual-tree complex wavelet packet transform.
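As a quick sanity check of this claim, here is a toy experiment of our own (not the paper's protocol): measure how much the max-pooled response of a Gabor quadrature pair changes under a one-pixel shift, versus the modulus of the corresponding complex response.

```python
import torch
import torch.nn.functional as F

# A Gabor-like quadrature pair: cosine/sine carriers under a Gaussian window.
k, freq, sigma = 15, 0.8, 3.0
coords = torch.arange(k, dtype=torch.float) - k // 2
yy, xx = torch.meshgrid(coords, coords, indexing="ij")
env = torch.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
w_re = (env * torch.cos(freq * xx)).view(1, 1, k, k)
w_im = (env * torch.sin(freq * xx)).view(1, 1, k, k)

x = torch.randn(1, 1, 128, 128)
x_shift = torch.roll(x, shifts=1, dims=-1)  # one-pixel shift (wraps at border)

def rmax(z):  # subsampled real convolution + max pooling
    return F.max_pool2d(F.conv2d(z, w_re, stride=2), 2)

def cmod(z):  # subsampled complex convolution + modulus
    return (F.conv2d(z, w_re, stride=4) ** 2
            + F.conv2d(z, w_im, stride=4) ** 2).sqrt()

for name, g in [("RMax", rmax), ("CMod", cmod)]:
    a, b = g(x), g(x_shift)
    print(name, ((a - b).norm() / a.norm()).item())  # CMod is typically far smaller
```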
arXiv Detail & Related papers (2022-09-19T08:15:30Z)
- Can we achieve robustness from data alone? [0.7366405857677227]
Adversarial training and its variants have come to be the prevailing methods to achieve adversarially robust classification using neural networks.
We devise a meta-learning method for robust classification that optimizes the dataset prior to its deployment in a principled way.
Experiments on MNIST and CIFAR-10 demonstrate that the datasets we produce enjoy very high robustness against PGD attacks.
arXiv Detail & Related papers (2022-07-24T12:14:48Z)
- Mutual-Information Based Few-Shot Classification [34.95314059362982]
We introduce Transductive Information Maximization (TIM) for few-shot learning.
Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task.
We propose a new alternating-direction solver, which speeds up transductive inference over gradient-based optimization.
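For intuition, the mutual information between query inputs and predicted labels decomposes as I(X; Y) = H(Y) - H(Y|X). The sketch below is our paraphrase of such an objective in PyTorch; the weighting `alpha` and the names are illustrative, and the paper's alternating-direction solver is not reproduced here.

```python
import torch

def tim_style_loss(query_logits, support_ce, alpha=1.0):
    # Cross-entropy on the labeled support set, minus the mutual information
    # I(X; Y) = H(Y) - H(Y|X) estimated from unlabeled query predictions.
    p = query_logits.softmax(dim=-1)
    cond_ent = -(p * p.clamp_min(1e-12).log()).sum(-1).mean()  # H(Y|X)
    marg = p.mean(dim=0)
    marg_ent = -(marg * marg.clamp_min(1e-12).log()).sum()     # H(Y)
    return support_ce - alpha * (marg_ent - cond_ent)
```

Maximizing H(Y) spreads the query predictions across classes, while minimizing H(Y|X) makes each individual prediction confident.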
arXiv Detail & Related papers (2021-06-23T09:17:23Z)
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? [59.820507600960745]
We propose a new GCP meta-layer that uses SVD in the forward pass and Padé approximants in the backward propagation to compute the gradients.
The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performances on both large-scale and fine-grained datasets.
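For reference, the forward pass in question is the matrix square root of a covariance matrix. A minimal sketch of that forward computation follows (our own illustration: the Padé-approximant backward pass is not reproduced, so autograd would simply differentiate through the SVD here).

```python
import torch

def gcp_sqrt(cov):
    # Matrix square root of a symmetric PSD covariance via SVD.
    U, S, _ = torch.linalg.svd(cov)
    return U @ torch.diag(S.sqrt()) @ U.T

feats = torch.randn(196, 64)            # e.g. spatial positions x channels
cov = feats.T @ feats / feats.shape[0]  # global covariance pooling
print(gcp_sqrt(cov).shape)              # torch.Size([64, 64])
```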
arXiv Detail & Related papers (2021-05-06T08:03:45Z)
- Orthogonalizing Convolutional Layers with the Cayley Transform [83.73855414030646]
We propose and evaluate an alternative approach to parameterize convolutional layers that are constrained to be orthogonal.
We show that our method indeed preserves orthogonality to a high degree even for large convolutions.
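The underlying algebraic fact is that the Cayley transform maps skew-symmetric matrices to orthogonal ones. A minimal matrix-level sketch follows (ours, for illustration; the paper's convolutional, FFT-based version is not reproduced):

```python
import torch

def cayley(weight):
    # Cayley transform: A = W - W^T is skew-symmetric, and
    # Q = (I - A)(I + A)^{-1} is then exactly orthogonal.
    A = weight - weight.T
    I = torch.eye(weight.shape[0], dtype=weight.dtype)
    return torch.linalg.solve(I + A, I - A)  # (I - A) and (I + A)^{-1} commute

Q = cayley(torch.randn(8, 8))
print(torch.allclose(Q @ Q.T, torch.eye(8), atol=1e-5))  # True
```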
arXiv Detail & Related papers (2021-04-14T23:54:55Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks [86.88061841975482]
We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth order oracle.
We use this setting to find fast one-step adversarial attacks, akin to a black-box version of the Fast Gradient Sign Method (FGSM).
We show that the method uses fewer queries and achieves higher attack success rates than the current state of the art.
arXiv Detail & Related papers (2020-10-08T18:36:51Z)
- Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients [18.52747917850984]
Kernel approximation is widely used to scale up kernel SVM training and prediction.
The memory and computation costs of kernel approximation models, however, remain too high for deployment on memory-limited devices.
We propose a novel memory- and computation-efficient kernel SVM model that uses binary embedding together with ternary model coefficients.
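To see where the savings come from, here is a toy NumPy illustration of ours (dimensions and names made up): binary features and ternary weights both fit in int8, so scoring reduces to integer arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_embed(X, R):
    # Binary embedding: the sign of a random projection. Each feature is
    # stored in one byte instead of a float.
    return np.sign(X @ R).astype(np.int8)

d, D = 64, 512                       # input dim, embedding dim (made up)
R = rng.standard_normal((d, D))      # random projection matrix
X = rng.standard_normal((100, d))    # 100 input points
Z = binary_embed(X, R)               # int8 codes in {-1, 0, +1}

# Ternary model coefficients in {-1, 0, +1}; scoring is pure integer
# arithmetic (accumulate in int32 to avoid int8 overflow).
w = rng.integers(-1, 2, size=D).astype(np.int8)
scores = Z.astype(np.int32) @ w.astype(np.int32)
```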
arXiv Detail & Related papers (2020-10-06T09:41:54Z)