Progressive Meta-Pooling Learning for Lightweight Image Classification
Model
- URL: http://arxiv.org/abs/2301.10038v1
- Date: Tue, 24 Jan 2023 14:28:05 GMT
- Title: Progressive Meta-Pooling Learning for Lightweight Image Classification
Model
- Authors: Peijie Dong, Xin Niu, Zhiliang Tian, Lujun Li, Xiaodong Wang, Zimian
Wei, Hengyue Pan, Dongsheng Li
- Abstract summary: We propose the Meta-Pooling framework to make the receptive field learnable for a lightweight network.
We present a Progressive Meta-Pooling Learning (PMPL) strategy for the parameterized spatial enhancer to acquire a suitable receptive field size.
The results on the ImageNet dataset demonstrate that MobileNetV2 using Meta-Pooling achieves top1 accuracy of 74.6%, which outperforms MobileNetV2 by 2.3%.
- Score: 20.076610051602618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Practical networks for edge devices adopt shallow depth and small
convolutional kernels to save memory and computational cost, which leads to a
restricted receptive field. Conventional efficient learning methods focus on
lightweight convolution designs, ignoring the role of the receptive field in
neural network design. In this paper, we propose the Meta-Pooling framework to
make the receptive field learnable for a lightweight network, which consists of
parameterized pooling-based operations. Specifically, we introduce a
parameterized spatial enhancer, which is composed of pooling operations to
provide versatile receptive fields for each layer of a lightweight model. Then,
we present a Progressive Meta-Pooling Learning (PMPL) strategy for the
parameterized spatial enhancer to acquire a suitable receptive field size. The
results on the ImageNet dataset demonstrate that MobileNetV2 using Meta-Pooling
achieves top1 accuracy of 74.6\%, which outperforms MobileNetV2 by 2.3\%.
Related papers
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimize for the L2 metric without the need of generating pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z) - Efficient Context Integration through Factorized Pyramidal Learning for
Ultra-Lightweight Semantic Segmentation [1.0499611180329804]
We propose a novel Factorized Pyramidal Learning (FPL) module to aggregate rich contextual information in an efficient manner.
We decompose the spatial pyramid into two stages which enables a simple and efficient feature fusion within the module to solve the notorious checkerboard effect.
Based on the FPL module and FIR unit, we propose an ultra-lightweight real-time network, called FPLNet, which achieves state-of-the-art accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-02-23T05:34:51Z) - Lightweight Salient Object Detection in Optical Remote-Sensing Images
via Semantic Matching and Edge Alignment [61.45639694373033]
We propose a novel lightweight network for optical remote sensing images (ORSI-SOD) based on semantic matching and edge alignment, termed SeaNet.
Specifically, SeaNet includes a lightweight MobileNet-V2 for feature extraction, a dynamic semantic matching module (DSMM) for high-level features, and a portable decoder for inference.
arXiv Detail & Related papers (2023-01-07T04:33:51Z) - Pooling Revisited: Your Receptive Field is Suboptimal [35.11562214480459]
The size and shape of the receptive field determine how the network aggregates local information.
We propose a simple yet effective Dynamically Optimized Pooling operation, referred to as DynOPool.
Our experiments show that the models equipped with the proposed learnable resizing module outperform the baseline networks on multiple datasets in image classification and semantic segmentation.
arXiv Detail & Related papers (2022-05-30T17:03:40Z) - Content-aware Warping for View Synthesis [110.54435867693203]
We propose content-aware warping, which adaptively learns the weights for pixels of a relatively large neighborhood from their contextual information via a lightweight neural network.
Based on this learnable warping module, we propose a new end-to-end learning-based framework for novel view synthesis from two source views.
Experimental results on structured light field datasets with wide baselines and unstructured multi-view datasets show that the proposed method significantly outperforms state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2022-01-22T11:35:05Z) - Lightweight Salient Object Detection in Optical Remote Sensing Images
via Feature Correlation [93.80710126516405]
We propose a novel lightweight ORSI-SOD solution, named CorrNet, to address these issues.
By reducing the parameters and computations of each component, CorrNet ends up having only 4.09M parameters and running with 21.09G FLOPs.
Experimental results on two public datasets demonstrate that our lightweight CorrNet achieves competitive or even better performance compared with 26 state-of-the-art methods.
arXiv Detail & Related papers (2022-01-20T08:28:01Z) - Efficient deep learning models for land cover image classification [0.29748898344267777]
This work experiments with the BigEarthNet dataset for land use land cover (LULC) image classification.
We benchmark different state-of-the-art models, including Convolution Neural Networks, Multi-Layer Perceptrons, Visual Transformers, EfficientNets and Wide Residual Networks (WRN)
Our proposed lightweight model has an order of magnitude less trainable parameters, achieves 4.5% higher averaged f-score classification accuracy for all 19 LULC classes and is trained two times faster with respect to a ResNet50 state-of-the-art model that we use as a baseline.
arXiv Detail & Related papers (2021-11-18T00:03:14Z) - Meta-learning of Pooling Layers for Character Recognition [3.708656266586146]
We propose a meta-learning framework for pooling layers in convolutional neural network-based character recognition.
A parameterized pooling layer is proposed in which the kernel shape and pooling operation are trainable using two parameters.
We also propose a meta-learning algorithm for the parameterized pooling layer, which allows us to acquire a suitable pooling layer across multiple tasks.
arXiv Detail & Related papers (2021-03-17T09:25:47Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - Gated Path Selection Network for Semantic Segmentation [72.44994579325822]
We develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields.
In GPSNet, we first design a two-dimensional multi-scale network - SuperNet, which densely incorporates features from growing receptive fields.
To dynamically select desirable semantic context, a gate prediction module is further introduced.
arXiv Detail & Related papers (2020-01-19T12:32:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.