An Effective Anti-Aliasing Approach for Residual Networks
- URL: http://arxiv.org/abs/2011.10675v1
- Date: Fri, 20 Nov 2020 22:55:57 GMT
- Title: An Effective Anti-Aliasing Approach for Residual Networks
- Authors: Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Nicolas Le
Roux, Ross Goroshin
- Abstract summary: Frequency aliasing is a phenomenon that may occur when sub-sampling any signal, such as an image or feature map, causing distortion in the sub-sampled output.
We show that we can mitigate this effect by placing non-trainable blur filters and using smooth activation functions at key locations.
These simple architectural changes lead to substantial improvements in out-of-distribution generalization on both image classification under natural corruptions on ImageNet-C and few-shot learning on Meta-Dataset.
- Score: 27.962502376542588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image pre-processing in the frequency domain has traditionally played a vital
role in computer vision and was even part of the standard pipeline in the early
days of deep learning. However, with the advent of large datasets, many
practitioners concluded that this was unnecessary due to the belief that these
priors can be learned from the data itself. Frequency aliasing is a phenomenon
that may occur when sub-sampling any signal, such as an image or feature map,
causing distortion in the sub-sampled output. We show that we can mitigate this
effect by placing non-trainable blur filters and using smooth activation
functions at key locations, particularly where networks lack the capacity to
learn them. These simple architectural changes lead to substantial improvements
in out-of-distribution generalization on both image classification under
natural corruptions on ImageNet-C [10] and few-shot learning on Meta-Dataset
[17], without introducing additional trainable parameters and using the default
hyper-parameters of open source codebases.
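The paper's core recipe can be sketched as a fixed (non-trainable) low-pass blur applied before each sub-sampling step. Below is a minimal NumPy illustration of the idea, not the authors' ResNet implementation; the 3x3 binomial kernel is one common choice of blur filter, and the checkerboard input is a hypothetical worst-case signal for aliasing.

```python
import numpy as np

def blur_pool(x, stride=2):
    """Anti-aliased downsampling: fixed (non-trainable) binomial blur,
    then subsampling. A minimal sketch of the paper's idea."""
    k1 = np.array([1.0, 2.0, 1.0]) / 4.0   # 1D binomial low-pass kernel
    k = np.outer(k1, k1)                    # separable 3x3 blur, sums to 1
    h, w = x.shape
    pad = np.pad(x, 1, mode="edge")
    blurred = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            blurred[i, j] = np.sum(pad[i:i + 3, j:j + 3] * k)
    return blurred[::stride, ::stride]      # stride-2 subsampling

# A high-frequency checkerboard aliases badly under naive subsampling:
# striding keeps only one phase of the pattern. Blurring first
# attenuates the frequencies that would otherwise fold over.
x = np.indices((8, 8)).sum(axis=0) % 2.0    # 0/1 checkerboard
naive = x[::2, ::2]                          # all zeros: pure aliasing
smooth = blur_pool(x)                        # interior values near 0.5
```

Here the naive stride-2 output collapses the checkerboard to a constant image of zeros, while the blurred path returns the signal's local average, which is the behavior a correct low-pass-then-subsample pipeline should exhibit.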
Related papers
- Misalignment-Robust Frequency Distribution Loss for Image Transformation [51.0462138717502]
This paper aims to address a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution.
We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain.
Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
arXiv Detail & Related papers (2024-02-28T09:27:41Z) - Inpainting Normal Maps for Lightstage data [3.1002416427168304]
This study introduces a novel method for inpainting normal maps using a generative adversarial network (GAN).
Our approach extends previous general image inpainting techniques, employing a bow tie-like generator network and a discriminator network, with alternating training phases.
Our findings suggest that the proposed model effectively generates high-quality, realistic inpainted normal maps, suitable for performance capture applications.
arXiv Detail & Related papers (2024-01-16T03:59:07Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - PRISTA-Net: Deep Iterative Shrinkage Thresholding Network for Coded
Diffraction Patterns Phase Retrieval [6.982256124089]
Phase retrieval is a challenging nonlinear inverse problem in computational imaging and image processing.
We have developed PRISTA-Net, a deep unfolding network based on the first-order iterative shrinkage-thresholding algorithm (ISTA).
All parameters in the proposed PRISTA-Net framework, including the nonlinear transformation, threshold, and step size, are learned end-to-end instead of being set by hand.
arXiv Detail & Related papers (2023-09-08T07:37:15Z) - Frequency Dropout: Feature-Level Regularization via Randomized Filtering [24.53978165468098]
Deep convolutional neural networks are susceptible to picking up spurious correlations from the training signal.
We propose a training strategy, Frequency Dropout, to prevent convolutional neural networks from learning frequency-specific imaging features.
Our results suggest that the proposed approach does not only improve predictive accuracy but also improves robustness against domain shift.
arXiv Detail & Related papers (2022-09-20T16:42:21Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Prefix Conditioning Unifies Language and Label Supervision [84.11127588805138]
We show that dataset biases negatively affect pre-training by reducing the generalizability of learned representations.
In experiments, we show that this simple technique improves the performance in zero-shot image recognition accuracy and robustness to the image-level distribution shift.
arXiv Detail & Related papers (2022-06-02T16:12:26Z) - Impact of Aliasing on Generalization in Deep Convolutional Networks [29.41652467340308]
We investigate the impact of aliasing on generalization in Deep Convolutional Networks.
We show how to mitigate aliasing by inserting non-trainable low-pass filters at key locations.
arXiv Detail & Related papers (2021-08-07T17:12:03Z) - TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping receptive fields for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
arXiv Detail & Related papers (2021-04-02T01:42:01Z) - Generic Perceptual Loss for Modeling Structured Output Dependencies [78.59700528239141]
We show that what matters is the network structure, not the trained weights.
We demonstrate that a randomly-weighted deep CNN can be used to model the structured dependencies of outputs.
arXiv Detail & Related papers (2021-03-18T23:56:07Z) - $\P$ILCRO: Making Importance Landscapes Flat Again [7.047473967702792]
This paper shows that most of the existing convolutional architectures define, at initialisation, a specific feature importance landscape.
We derive the P-objective, or PILCRO for Pixel-wise Landscape Curvature Regularised Objective.
We show that P-regularised versions of popular computer vision networks have a flat importance landscape, train faster, result in a better accuracy and are more robust to noise at test time.
arXiv Detail & Related papers (2020-01-27T11:20:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.