Learning to Learn Parameterized Classification Networks for Scalable Input Images
- URL: http://arxiv.org/abs/2007.06181v1
- Date: Mon, 13 Jul 2020 04:27:25 GMT
- Title: Learning to Learn Parameterized Classification Networks for Scalable Input Images
- Authors: Duo Li, Anbang Yao and Qifeng Chen
- Abstract summary: Convolutional Neural Networks (CNNs) do not exhibit predictable recognition behavior with respect to changes in input resolution.
We employ meta learners to generate convolutional weights of main networks for various input scales.
We further utilize knowledge distillation on the fly over model predictions based on different input resolutions.
- Score: 76.44375136492827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) do not have a predictable recognition
behavior with respect to changes in input resolution. This makes it impractical to
deploy a single model across different input image resolutions. To achieve
efficient and flexible image classification at runtime, we
employ meta learners to generate convolutional weights of main networks for
various input scales and maintain privatized Batch Normalization layers per
scale. For improved training performance, we further utilize knowledge
distillation on the fly over model predictions based on different input
resolutions. The learned meta network could dynamically parameterize main
networks to act on input images of arbitrary size with consistently better
accuracy compared to individually trained models. Extensive experiments on
ImageNet demonstrate that our method achieves an improved accuracy-efficiency
trade-off during the adaptive inference process. By switching executable input
resolutions, our method can satisfy the requirement of fast adaptation in
different resource-constrained environments. Code and models are available at
https://github.com/d-li14/SAN.
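The abstract describes three ingredients: a meta learner that generates convolutional weights conditioned on the input scale, privatized BatchNorm layers per scale, and on-the-fly knowledge distillation across resolutions. Below is a minimal PyTorch sketch of those ideas, assuming illustrative names (MetaConv2d, ScalableBlock, distill_step), layer sizes, a normalized scale code, and a largest-resolution teacher; the authors' actual implementation is in the linked SAN repository and differs in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaConv2d(nn.Module):
    """Conv layer whose weights come from a small meta learner conditioned on
    the input scale (an illustrative design; SAN's actual generator differs)."""
    def __init__(self, in_ch, out_ch, k=3, hidden=64):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        n = out_ch * in_ch * k * k
        self.meta = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, n))

    def forward(self, x, scale):
        s = torch.tensor([[scale]], dtype=x.dtype, device=x.device)
        w = self.meta(s).view(self.shape)        # generated conv weights
        return F.conv2d(x, w, padding=self.shape[-1] // 2)

class ScalableBlock(nn.Module):
    """Meta-generated conv followed by a private BatchNorm per input scale."""
    def __init__(self, in_ch, out_ch, scales=(128, 160, 192, 224)):
        super().__init__()
        self.conv = MetaConv2d(in_ch, out_ch)
        self.bns = nn.ModuleDict({str(s): nn.BatchNorm2d(out_ch) for s in scales})

    def forward(self, x, scale):
        x = self.conv(x, scale / 224.0)          # normalized scale code (assumption)
        return F.relu(self.bns[str(scale)](x))

def distill_step(model, images, labels, scales=(224, 192, 160, 128), T=1.0):
    """On-the-fly distillation across resolutions (simplified): the largest
    resolution acts as the teacher for the smaller ones."""
    logits = {s: model(F.interpolate(images, size=s, mode='bilinear',
                                     align_corners=False), s)
              for s in scales}
    loss = sum(F.cross_entropy(l, labels) for l in logits.values())
    teacher = F.softmax(logits[max(scales)].detach() / T, dim=-1)
    for s in scales:
        if s != max(scales):
            kd = F.kl_div(F.log_softmax(logits[s] / T, dim=-1), teacher,
                          reduction='batchmean')
            loss = loss + (T * T) * kd
    return loss
```

In practice the generator would be amortized rather than producing a full weight tensor from a scalar on every forward pass; the sketch only shows the data flow among the three components.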
Related papers
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reformulate the convolution layer by drawing on scale-space theory.
We build a novel architecture named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, inference is more efficient than multi-shot fusion. (A toy scale-attention sketch follows this entry.)
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
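As a hedged illustration of scale attention inside a convolution layer: the toy module below applies shared weights at several dilation rates as a crude stand-in for scale-space responses, then fuses them with a learned per-pixel softmax over scales. The class name, dilation choices, and fusion scheme are assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAttentionConv(nn.Module):
    """Toy scale-attention conv: one shared 3x3 kernel evaluated at several
    dilation rates, combined by a per-pixel attention over scales."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 3)):
        super().__init__()
        self.dilations = dilations
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.attn = nn.Conv2d(in_ch, len(dilations), 1)  # one score per scale

    def forward(self, x):
        # padding=d keeps spatial size constant for a 3x3 kernel at dilation d
        resp = [F.conv2d(x, self.conv.weight, padding=d, dilation=d)
                for d in self.dilations]
        a = torch.softmax(self.attn(x), dim=1)           # (B, n_scales, H, W)
        return sum(a[:, i:i + 1] * r for i, r in enumerate(resp))
```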
- Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z)
- Correlation between image quality metrics of magnetic resonance images and the neural network segmentation accuracy [0.0]
In this study, we investigated the correlation between the image quality metrics of MR images and neural network segmentation accuracy.
The difference in segmentation accuracy between models trained on randomly selected inputs and models trained on IQM-selected inputs sheds light on the role of image quality metrics in segmentation accuracy.
arXiv Detail & Related papers (2021-11-01T17:02:34Z)
- Differentiable Patch Selection for Image Recognition [37.11810982945019]
We propose a differentiable Top-K operator to select the most relevant parts of the input when processing high-resolution images.
We show results for traffic sign recognition, inter-patch relationship reasoning, and fine-grained recognition without using object/part bounding box annotations. (A toy differentiable Top-K sketch follows this entry.)
arXiv Detail & Related papers (2021-04-07T11:15:51Z)
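The paper constructs its differentiable Top-K from perturbed optimizers; as a simpler stand-in, the sketch below uses a straight-through estimator (a hard Top-K mask in the forward pass, softmax gradients in the backward pass). The function name and temperature are illustrative, not the paper's operator.

```python
import torch

def straight_through_topk(scores, k, temperature=0.1):
    """Hard Top-K selection in the forward pass while gradients flow through
    a softmax relaxation (straight-through trick)."""
    soft = torch.softmax(scores / max(temperature, 1e-6), dim=-1)
    idx = scores.topk(k, dim=-1).indices
    hard = torch.zeros_like(scores).scatter_(-1, idx, 1.0)  # 0/1 mask
    return hard + (soft - soft.detach())  # value: hard; gradient: soft

# Usage: weight candidate patch features by the mask and aggregate.
scores = torch.randn(2, 16, requires_grad=True)   # 16 candidate patches
mask = straight_through_topk(scores, k=4)
patches = torch.randn(2, 16, 128)                 # hypothetical patch features
selected = (mask.unsqueeze(-1) * patches).sum(1)  # differentiable aggregation
selected.mean().backward()                        # gradients reach the scores
```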
- Retrieval Augmentation to Improve Robustness and Interpretability of Deep Neural Networks [3.0410237490041805]
In this work, we actively exploit the training data to improve the robustness and interpretability of deep neural networks.
Specifically, the proposed approach uses the target of the nearest input example to initialize the memory state of an LSTM model or to guide attention mechanisms.
Results show the effectiveness of the proposed models for the two tasks on the widely used Flickr8k and IMDB datasets. (A minimal retrieval-initialized LSTM sketch follows this entry.)
arXiv Detail & Related papers (2021-02-25T17:38:31Z)
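A minimal sketch of the retrieval idea, assuming a precomputed memory bank of training-example embeddings and labels; the class name, mean pooling, and nearest-neighbor lookup are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetrievalInitLSTM(nn.Module):
    """Look up the nearest training example by embedding similarity and use
    its label embedding to initialize the LSTM hidden state."""
    def __init__(self, vocab, n_classes, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.label_embed = nn.Embedding(n_classes, dim)
        self.out = nn.Linear(dim, n_classes)

    def forward(self, tokens, bank_keys, bank_labels):
        # bank_keys: (N, dim) precomputed embeddings; bank_labels: (N,) targets
        q = self.embed(tokens).mean(1)                              # (B, dim)
        sim = F.normalize(q, dim=-1) @ F.normalize(bank_keys, dim=-1).T
        nearest = bank_labels[sim.argmax(-1)]                       # (B,)
        h0 = self.label_embed(nearest).unsqueeze(0)                 # (1, B, dim)
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(self.embed(tokens), (h0, c0))
        return self.out(out[:, -1])
```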
- Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training. (A hedged sketch of adversarial feature-statistics perturbation follows this entry.)
arXiv Detail & Related papers (2020-09-18T17:52:34Z)
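As a hedged sketch of perturbing feature statistics rather than pixels: the function below runs a few steps of gradient ascent on per-channel scale/shift factors applied to the mean and standard deviation of intermediate features, clamped to an epsilon ball. The hyperparameters and exact parameterization are assumptions; the real AdvBN layer differs in detail.

```python
import torch
import torch.nn.functional as F

def adv_feature_stats(feats, model_tail, labels, eps=0.1, steps=3, lr=0.05):
    """Adversarially perturb per-channel feature statistics so as to maximize
    the loss of the remaining network (model_tail)."""
    B, C = feats.shape[:2]
    mu = feats.mean(dim=(2, 3), keepdim=True)
    sigma = feats.std(dim=(2, 3), keepdim=True) + 1e-5
    normed = (feats - mu) / sigma
    scale = torch.ones(B, C, 1, 1, device=feats.device, requires_grad=True)
    shift = torch.zeros(B, C, 1, 1, device=feats.device, requires_grad=True)
    for _ in range(steps):
        perturbed = normed * (sigma * scale) + mu * (1 + shift)
        loss = F.cross_entropy(model_tail(perturbed), labels)
        g_scale, g_shift = torch.autograd.grad(loss, (scale, shift))
        with torch.no_grad():  # signed gradient ascent inside the eps-ball
            scale += lr * g_scale.sign()
            shift += lr * g_shift.sign()
            scale.clamp_(1 - eps, 1 + eps)
            shift.clamp_(-eps, eps)
    return (normed * (sigma * scale) + mu * (1 + shift)).detach()
```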
- Resolution Switchable Networks for Runtime Efficient Image Recognition [46.09537029831355]
We propose a general method to train a single convolutional neural network that can switch image resolutions at inference.
Networks trained with the proposed method are named Resolution Switchable Networks (RS-Nets).
arXiv Detail & Related papers (2020-07-19T02:12:59Z)
- ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture which applies channel-wise attention across different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in the accuracy-latency trade-off on image classification. (A simplified split-attention block follows this entry.)
arXiv Detail & Related papers (2020-04-19T20:40:31Z)
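A simplified split-attention block in the spirit of ResNeSt: `radix` parallel branches are pooled, scored by a squeeze-and-excite style gate, softmaxed across branches, and recombined. Cardinality groups and other details of the real block are omitted, and all names here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAttention(nn.Module):
    """Attention-weighted fusion of `radix` convolutional branches."""
    def __init__(self, channels, radix=2, reduce=4):
        super().__init__()
        self.radix = radix
        inner = max(channels // reduce, 8)
        self.convs = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1, bias=False)
             for _ in range(radix)])
        self.fc1 = nn.Conv2d(channels, inner, 1)
        self.fc2 = nn.Conv2d(inner, channels * radix, 1)

    def forward(self, x):
        branches = [conv(x) for conv in self.convs]
        gap = F.adaptive_avg_pool2d(sum(branches), 1)   # (B, C, 1, 1)
        attn = self.fc2(F.relu(self.fc1(gap)))          # (B, C*radix, 1, 1)
        attn = attn.view(x.size(0), self.radix, -1, 1, 1).softmax(dim=1)
        # per-channel softmax weights across branches, then weighted sum
        return sum(attn[:, i] * branches[i] for i in range(self.radix))
```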
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.