Learnable Polyphase Sampling for Shift Invariant and Equivariant
Convolutional Networks
- URL: http://arxiv.org/abs/2210.08001v1
- Date: Fri, 14 Oct 2022 17:59:55 GMT
- Title: Learnable Polyphase Sampling for Shift Invariant and Equivariant
Convolutional Networks
- Authors: Renan A. Rojas-Gomez, Teck-Yian Lim, Alexander G. Schwing, Minh N. Do,
Raymond A. Yeh
- Abstract summary: LPS can be trained end-to-end from data and generalizes existing handcrafted downsampling layers.
We evaluate LPS on image classification and semantic segmentation.
- Score: 120.78155051439076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose learnable polyphase sampling (LPS), a pair of learnable
down/upsampling layers that enable truly shift-invariant and equivariant
convolutional networks. LPS can be trained end-to-end from data and generalizes
existing handcrafted downsampling layers. It is widely applicable as it can be
integrated into any convolutional network by replacing down/upsampling layers.
We evaluate LPS on image classification and semantic segmentation. Experiments
show that LPS is on-par with or outperforms existing methods in both
performance and shift consistency. For the first time, we achieve true
shift-equivariance on semantic segmentation (PASCAL VOC), i.e., 100% shift
consistency, outperforming baselines by an absolute 3.3%.
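The polyphase-selection idea behind the method can be sketched in a few lines of NumPy. The snippet below is a simplified illustration, not the authors' LPS layer: it splits a feature map into its stride² polyphase components and picks one with a stand-in "learned" linear score (`score_weights` is a hypothetical parameter vector, assumed here for illustration). Because a one-pixel circular shift merely permutes the polyphase components, the selection, and hence the output, follows the shift.

```python
import numpy as np

def polyphase_components(x, stride=2):
    """Split a (C, H, W) feature map into its stride**2 polyphase components."""
    return [x[:, i::stride, j::stride]
            for i in range(stride) for j in range(stride)]

def lps_downsample(x, score_weights, stride=2):
    """Downsample by keeping the polyphase component with the highest score.

    `score_weights` is a hypothetical stand-in for learned parameters:
    each phase is scored by a weighted sum of its per-channel energy.
    """
    comps = polyphase_components(x, stride)
    scores = [float(score_weights @ (c ** 2).mean(axis=(1, 2))) for c in comps]
    return comps[int(np.argmax(scores))]

# A circular one-pixel shift permutes the polyphase components, so the
# selected output follows the shift (equal up to the same circular shift).
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
w = np.ones(3)  # hypothetical "learned" weights
out = lps_downsample(x, w)
out_shifted = lps_downsample(np.roll(x, 1, axis=2), w)
```

In the paper the scoring is a trainable network and a matching upsampling layer restores equivariance for dense prediction; the sketch above only shows why phase selection commutes with circular shifts.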
Related papers
- Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling [14.731788603429774]
Downsampling operators break the shift invariance of convolutional neural networks (CNNs).
We propose a learnable pooling operator called Translation Invariant Polyphase Sampling (TIPS).
TIPS results in consistent performance gains in terms of accuracy, shift consistency, and shift fidelity.
arXiv Detail & Related papers (2024-04-11T00:49:38Z)
- LRP-QViT: Mixed-Precision Vision Transformer Quantization via Layer-wise Relevance Propagation [0.0]
We introduce LRP-QViT, an explainability-based method for assigning mixed-precision bit allocations to different layers based on their importance during classification.
Our experimental findings demonstrate that both our fixed-bit and mixed-bit post-training quantization methods surpass existing models in the context of 4-bit and 6-bit quantization.
arXiv Detail & Related papers (2024-01-20T14:53:19Z)
- Surgical Fine-Tuning Improves Adaptation to Distribution Shifts [114.17184775397067]
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model.
This paper shows that in such settings, selectively fine-tuning a subset of layers matches or outperforms commonly used fine-tuning approaches.
arXiv Detail & Related papers (2022-10-20T17:59:15Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency [31.572652956170252]
Transformer-based self-supervised models are trained as feature extractors and have empowered many downstream speech tasks to achieve state-of-the-art performance.
We experimentally achieve 7.8X parameter reduction, 41.9% training speedup and 37.7% inference speedup while maintaining comparable performance with conventional BERT-like self-supervised methods.
arXiv Detail & Related papers (2021-04-08T08:21:59Z)
- Truly shift-invariant convolutional neural networks [0.0]
Recent works have shown that the output of a CNN can change significantly with small shifts in input.
We propose adaptive polyphase sampling (APS), a simple sub-sampling scheme that allows convolutional neural networks to achieve 100% consistency in classification performance under shifts.
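The APS rule is simple enough to sketch in a few lines of NumPy. This is a 1D illustration under circular shifts, not the paper's code: among the polyphase components of the input, keep the one with the largest l2 norm, so a shifted input selects the shifted copy of the same values.

```python
import numpy as np

def aps_downsample(x, stride=2):
    """Adaptive polyphase sampling (APS) in 1D: among the stride polyphase
    components, keep the one with the largest l2 norm."""
    comps = [x[i::stride] for i in range(stride)]
    return comps[int(np.argmax([np.linalg.norm(c) for c in comps]))]

# Under a circular shift, the selected component holds the same values
# (up to an internal circular shift), so any pooled feature is unchanged.
rng = np.random.default_rng(1)
x = rng.standard_normal(16)
out = aps_downsample(x)
out_shifted = aps_downsample(np.roll(x, 1))
```

Any permutation-invariant readout after this operator (e.g., global average pooling) is therefore exactly shift invariant for circular shifts.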
arXiv Detail & Related papers (2020-11-28T20:57:35Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
- Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference [82.96877371742532]
We propose a Dynamic Fractional Skipping (DFS) framework for deep networks.
DFS hypothesizes layer-wise quantization (to different bitwidths) as intermediate "soft" choices to be made between fully utilizing and skipping a layer.
It exploits a layer's expressive power during input-adaptive inference, enabling finer-grained accuracy-computational cost trade-offs.
arXiv Detail & Related papers (2020-01-03T03:12:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides (including all listed content) and is not responsible for any consequences of its use.