An Effective Anti-Aliasing Approach for Residual Networks
- URL: http://arxiv.org/abs/2011.10675v1
- Date: Fri, 20 Nov 2020 22:55:57 GMT
- Title: An Effective Anti-Aliasing Approach for Residual Networks
- Authors: Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Nicolas Le Roux, Ross Goroshin
- Abstract summary: Frequency aliasing is a phenomenon that may occur when sub-sampling any signal, such as an image or feature map, causing distortion in the sub-sampled output.
We show that we can mitigate this effect by placing non-trainable blur filters and using smooth activation functions at key locations.
These simple architectural changes lead to substantial improvements in out-of-distribution generalization on both image classification under natural corruptions on ImageNet-C and few-shot learning on Meta-Dataset.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image pre-processing in the frequency domain has traditionally played a vital
role in computer vision and was even part of the standard pipeline in the early
days of deep learning. However, with the advent of large datasets, many
practitioners concluded that this was unnecessary due to the belief that these
priors can be learned from the data itself. Frequency aliasing is a phenomenon
that may occur when sub-sampling any signal, such as an image or feature map,
causing distortion in the sub-sampled output. We show that we can mitigate this
effect by placing non-trainable blur filters and using smooth activation
functions at key locations, particularly where networks lack the capacity to
learn them. These simple architectural changes lead to substantial improvements
in out-of-distribution generalization on both image classification under
natural corruptions on ImageNet-C [10] and few-shot learning on Meta-Dataset
[17], without introducing additional trainable parameters and using the default
hyper-parameters of open source codebases.
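
The aliasing effect described in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the `blur` and `subsample` helpers are hypothetical names, and the fixed [1, 2, 1] / 4 binomial kernel and the edge-clamped borders are assumptions chosen only for illustration. A signal that alternates every sample turns into a constant when sub-sampled naively with stride 2, whereas applying the fixed low-pass filter first removes that frequency before it can alias.

```python
# Illustrative sketch only: a 1-D stand-in for "blur, then sub-sample".
# The [1, 2, 1] / 4 kernel is an assumed choice, not taken from the paper.

def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Apply a fixed (non-trainable) low-pass filter, clamping at the borders."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp index to a valid range
            acc += w * signal[j]
        out.append(acc)
    return out


def subsample(signal, stride=2):
    """Keep every `stride`-th sample, starting from index 0."""
    return signal[::stride]


# Highest representable frequency: the sign flips at every sample.
high_freq = [1.0 if i % 2 == 0 else -1.0 for i in range(16)]

# Naive stride-2 sub-sampling keeps only the +1 samples, so the
# oscillation is misread as a constant signal (aliasing).
naive = subsample(high_freq)

# Blurring first cancels the oscillation everywhere except at the
# clamped left border, so nothing is left to alias.
anti_aliased = subsample(blur(high_freq))

print(naive)
print(anti_aliased)
```

In a convolutional network the same idea would apply per channel in two dimensions, with the fixed filter placed before each strided operation; where exactly to place it is the design question the paper addresses.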
Related papers
- Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning [49.275450836604726]
We present a novel frequency-based Self-Supervised Learning (SSL) approach that significantly enhances its efficacy for pre-training.
We employ a two-branch framework empowered by knowledge distillation, enabling the model to take both the filtered and original images as input.
arXiv Detail & Related papers (2024-09-16T15:10:07Z)
- Deep Learning Based Speckle Filtering for Polarimetric SAR Images. Application to Sentinel-1 [51.404644401997736]
We propose a complete framework to remove speckle in polarimetric SAR images using a convolutional neural network.
Experiments show that the proposed approach offers exceptional results in both speckle reduction and resolution preservation.
arXiv Detail & Related papers (2024-08-28T10:07:17Z)
- Inpainting Normal Maps for Lightstage data [3.1002416427168304]
This study introduces a novel method for inpainting normal maps using a generative adversarial network (GAN).
Our approach extends previous general image inpainting techniques, employing a bow tie-like generator network and a discriminator network, with alternating training phases.
Our findings suggest that the proposed model effectively generates high-quality, realistic inpainted normal maps, suitable for performance capture applications.
arXiv Detail & Related papers (2024-01-16T03:59:07Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- PRISTA-Net: Deep Iterative Shrinkage Thresholding Network for Coded Diffraction Patterns Phase Retrieval [6.982256124089]
Phase retrieval is a challenging nonlinear inverse problem in computational imaging and image processing.
We have developed PRISTA-Net, a deep unfolding network based on the first-order iterative shrinkage thresholding algorithm (ISTA).
All parameters in the proposed PRISTA-Net framework, including the nonlinear transformation, threshold, and step size, are learned end-to-end instead of being set by hand.
arXiv Detail & Related papers (2023-09-08T07:37:15Z)
- Frequency Dropout: Feature-Level Regularization via Randomized Filtering [24.53978165468098]
Deep convolutional neural networks are susceptible to picking up spurious correlations from the training signal.
We propose a training strategy, Frequency Dropout, to prevent convolutional neural networks from learning frequency-specific imaging features.
Our results suggest that the proposed approach not only improves predictive accuracy but also improves robustness against domain shift.
arXiv Detail & Related papers (2022-09-20T16:42:21Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- Impact of Aliasing on Generalization in Deep Convolutional Networks [29.41652467340308]
We investigate the impact of aliasing on generalization in Deep Convolutional Networks.
We show how to mitigate aliasing by inserting non-trainable low-pass filters at key locations.
arXiv Detail & Related papers (2021-08-07T17:12:03Z)
- TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping receptive fields (RF) for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
arXiv Detail & Related papers (2021-04-02T01:42:01Z)
- Generic Perceptual Loss for Modeling Structured Output Dependencies [78.59700528239141]
We show that what matters is the network structure rather than the trained weights.
We demonstrate that a randomly-weighted deep CNN can be used to model the structured dependencies of outputs.
arXiv Detail & Related papers (2021-03-18T23:56:07Z)
- $\P$ILCRO: Making Importance Landscapes Flat Again [7.047473967702792]
This paper shows that most of the existing convolutional architectures define, at initialisation, a specific feature importance landscape.
We derive the P-objective, or PILCRO for Pixel-wise Landscape Curvature Regularised Objective.
We show that P-regularised versions of popular computer vision networks have a flat importance landscape, train faster, achieve better accuracy, and are more robust to noise at test time.
arXiv Detail & Related papers (2020-01-27T11:20:56Z)