Localizing Semantic Patches for Accelerating Image Classification
- URL: http://arxiv.org/abs/2206.03367v1
- Date: Tue, 7 Jun 2022 15:01:54 GMT
- Title: Localizing Semantic Patches for Accelerating Image Classification
- Authors: Chuanguang Yang, Zhulin An, Yongjun Xu
- Abstract summary: We first pinpoint task-aware regions over the input image by a lightweight patch proposal network called AnchorNet.
We then feed these localized semantic patches with much smaller spatial redundancy into a general classification network.
Our method outperforms SOTA dynamic inference methods at a lower inference cost.
- Score: 12.250230630124758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing works often focus on reducing the architecture redundancy for
accelerating image classification but ignore the spatial redundancy of the
input image. This paper proposes an efficient image classification pipeline to
solve this problem. We first pinpoint task-aware regions over the input image
by a lightweight patch proposal network called AnchorNet. We then feed these
localized semantic patches with much smaller spatial redundancy into a general
classification network. Unlike popular deep CNN designs, we carefully design
the receptive field of AnchorNet without intermediate convolutional
padding. This ensures the exact mapping from a high-level
spatial location to the specific input image patch. The contribution of each
patch is interpretable. Moreover, AnchorNet is compatible with any downstream
architecture. Experimental results on ImageNet show that our method outperforms
SOTA dynamic inference methods at a lower inference cost. Our code is
available at https://github.com/winycg/AnchorNet.
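The abstract's claim that a padding-free network gives an exact mapping from a high-level spatial location to an input patch can be illustrated with standard receptive-field arithmetic. The sketch below is not AnchorNet's code: the layer list `LAYERS` is a hypothetical (kernel, stride) stack, and ranking score-map cells with a simple top-k is an assumed selection rule, since the abstract gives neither the architecture nor the selection procedure.

```python
from typing import List, Tuple

# Each layer is (kernel_size, stride). No padding is assumed anywhere,
# which is the property the abstract relies on.
Layer = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive

# Hypothetical stack for illustration only; not the AnchorNet configuration.
LAYERS: List[Layer] = [(3, 2), (3, 2), (3, 1)]


def receptive_field(layers: List[Layer]) -> Tuple[int, int]:
    """Return (receptive field size, jump) of one output unit, in input pixels."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride             # distance in input pixels between adjacent units
    return rf, jump


def output_size(input_size: int, layers: List[Layer]) -> int:
    """Spatial size of the final feature map for an unpadded stack."""
    size = input_size
    for kernel, stride in layers:
        size = (size - kernel) // stride + 1
    return size


def patch_box(i: int, j: int, layers: List[Layer]) -> Box:
    """Input-image box seen by the output unit at row i, column j."""
    rf, jump = receptive_field(layers)
    top, left = i * jump, j * jump  # without padding, unit 0 starts at pixel 0
    return top, left, top + rf, left + rf


def top_k_patches(score_map: List[List[float]], layers: List[Layer], k: int) -> List[Box]:
    """Boxes for the k highest-scoring cells (assumed selection rule)."""
    cells = [(score, i, j)
             for i, row in enumerate(score_map)
             for j, score in enumerate(row)]
    cells.sort(key=lambda cell: -cell[0])
    return [patch_box(i, j, layers) for _, i, j in cells[:k]]


if __name__ == "__main__":
    print(receptive_field(LAYERS))   # receptive field size and jump
    print(output_size(31, LAYERS))   # feature-map side length for a 31x31 input
    print(top_k_patches([[0.1, 0.9], [0.4, 0.2]], LAYERS, k=2))
```

Because no layer pads its input, the box returned by `patch_box` never extends past the image border and contains only real pixels, so each cropped patch can be passed unchanged to any downstream classifier.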
Related papers
- Learning to Rank Patches for Unbiased Image Redundancy Reduction [80.93989115541966]
Images suffer from heavy spatial redundancy because pixels in neighboring regions are spatially correlated.
Existing approaches strive to overcome this limitation by reducing less meaningful image regions.
We propose a self-supervised framework for image redundancy reduction called Learning to Rank Patches.
arXiv Detail & Related papers (2024-03-31T13:12:41Z)
- PadChannel: Improving CNN Performance through Explicit Padding Encoding [40.39759037668144]
In convolutional neural networks (CNNs), padding plays a pivotal role in preserving spatial dimensions throughout the layers.
Traditional padding techniques do not explicitly distinguish between the actual image content and the padded regions.
We propose PadChannel, a novel padding method that encodes padding statuses as an additional input channel.
arXiv Detail & Related papers (2023-11-13T07:44:56Z)
- PATS: Patch Area Transportation with Subdivision for Local Feature Matching [78.67559513308787]
Local feature matching aims at establishing sparse correspondences between a pair of images.
We propose Patch Area Transportation with Subdivision (PATS) to tackle this issue.
PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks.
arXiv Detail & Related papers (2023-03-14T08:28:36Z)
- Accurate Image Restoration with Attention Retractable Transformer [50.05204240159985]
We propose Attention Retractable Transformer (ART) for image restoration.
ART presents both dense and sparse attention modules in the network.
We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks.
arXiv Detail & Related papers (2022-10-04T07:35:01Z)
- Patch-Based Stochastic Attention for Image Editing [4.8201607588546]
We propose an efficient attention layer based on the algorithm PatchMatch, which is used for determining approximate nearest neighbors.
We demonstrate the usefulness of PSAL on several image editing tasks, such as image inpainting, guided image colorization, and single-image super-resolution.
arXiv Detail & Related papers (2022-02-07T13:42:00Z)
- Global and Local Alignment Networks for Unpaired Image-to-Image Translation [170.08142745705575]
The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Due to the lack of attention to the content change in existing methods, semantic information from source images suffers from degradation during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method effectively generates sharper and more realistic images than existing approaches.
arXiv Detail & Related papers (2021-11-19T18:01:54Z)
- Context-aware Padding for Semantic Segmentation [82.37483350347559]
We propose a context-aware (CA) padding approach to extend the image.
Using context-aware padding, the ResNet-based segmentation model achieves higher mean Intersection-Over-Union than the traditional zero padding.
arXiv Detail & Related papers (2021-09-16T10:33:21Z)
- An Empirical Method to Quantify the Peripheral Performance Degradation in Deep Networks [18.808132632482103]
Convolutional neural network (CNN) kernels compound with each convolutional layer.
Deeper and deeper networks combined with stride-based down-sampling mean that the propagation of this region can end up covering a non-negligible portion of the image.
Our dataset is constructed by inserting objects into high resolution backgrounds, thereby allowing us to crop sub-images which place target objects at specific locations relative to the image border.
By probing the behaviour of Mask R-CNN across a selection of target locations, we see clear patterns of performance degradation near the image boundary, and in particular in the image corners.
arXiv Detail & Related papers (2020-12-04T18:00:47Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
- Localizing Interpretable Multi-scale informative Patches Derived from Media Classification Task [12.447143226347922]
We construct an interpretable AnchorNet equipped with our carefully designed RFs and linear spatial aggregation.
We show that localized patches can indeed retain most of the semantics and evidence of the original inputs.
arXiv Detail & Related papers (2020-01-31T10:04:18Z)