Related papers: Meta-learning of Pooling Layers for Character Recognition

URL: http://arxiv.org/abs/2103.09528v1
Date: Wed, 17 Mar 2021 09:25:47 GMT
Title: Meta-learning of Pooling Layers for Character Recognition
Authors: Takato Otsuzuki, Heon Song, Seiichi Uchida, Hideaki Hayashi
Abstract summary: We propose a meta-learning framework for pooling layers in convolutional neural network-based character recognition. A parameterized pooling layer is proposed in which the kernel shape and pooling operation are trainable using two parameters. We also propose a meta-learning algorithm for the parameterized pooling layer, which allows us to acquire a suitable pooling layer across multiple tasks.
Score: 3.708656266586146
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: In convolutional neural network-based character recognition, pooling layers play an important role in dimensionality reduction and deformation compensation. However, their kernel shapes and pooling operations are empirically predetermined; typically, a fixed-size square kernel shape and max pooling operation are used. In this paper, we propose a meta-learning framework for pooling layers. As part of our framework, a parameterized pooling layer is proposed in which the kernel shape and pooling operation are trainable using two parameters, thereby allowing flexible pooling of the input data. We also propose a meta-learning algorithm for the parameterized pooling layer, which allows us to acquire a suitable pooling layer across multiple tasks. In the experiment, we applied the proposed meta-learning framework to character recognition tasks. The results demonstrate that a pooling layer that is suitable across character recognition tasks was obtained via meta-learning, and the obtained pooling layer improved the performance of the model in both few-shot character recognition and noisy image recognition tasks.
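The abstract does not say what the two trainable parameters are, so the following is only an illustrative sketch, not the authors' layer. It assumes one plausible reading: a non-negative weight mask that softly selects the kernel shape, and an exponent `p` that moves the operation between averaging (`p = 1`) and something close to max pooling (large `p`). The function names, the mask, and the exponent are all hypothetical.

```python
# Illustrative sketch only: weighted generalized-mean (Lp-style) pooling.
# ASSUMPTION: "weights" (kernel shape) and "p" (pooling operation) are
# hypothetical stand-ins for the two trainable parameters in the abstract.

def weighted_lp_pool(window, weights, p):
    """Pool one window of non-negative activations.

    weights: non-negative mask with the same shape as the window; zero
             entries drop positions, which changes the effective kernel shape.
    p:       exponent >= 1; p = 1 is a weighted average, large p approaches
             the maximum over positions with non-zero weight.
    """
    total_weight = sum(sum(row) for row in weights)
    acc = sum(
        w * (x ** p)
        for w_row, x_row in zip(weights, window)
        for w, x in zip(w_row, x_row)
    )
    return (acc / total_weight) ** (1.0 / p)


def pool2d(fmap, weights, p, stride):
    """Slide the weighted pooling window over a 2D feature map (no padding)."""
    k_h, k_w = len(weights), len(weights[0])
    out = []
    for i in range(0, len(fmap) - k_h + 1, stride):
        out_row = []
        for j in range(0, len(fmap[0]) - k_w + 1, stride):
            window = [row[j:j + k_w] for row in fmap[i:i + k_h]]
            out_row.append(weighted_lp_pool(window, weights, p))
        out.append(out_row)
    return out


if __name__ == "__main__":
    fmap = [
        [1, 2, 0, 1],
        [3, 4, 1, 0],
        [0, 0, 2, 2],
        [1, 1, 2, 2],
    ]
    square = [[1, 1], [1, 1]]   # ordinary 2x2 kernel
    column = [[1, 0], [1, 0]]   # mask that keeps only the left column

    print(pool2d(fmap, square, p=1, stride=2))   # average pooling
    print(pool2d(fmap, square, p=50, stride=2))  # close to max pooling
    print(pool2d(fmap, column, p=1, stride=2))   # effectively a 2x1 kernel
```

In a trainable version, both the mask and `p` would be optimized by gradient descent, and a meta-learning loop would tune them across several recognition tasks. That part is omitted here because the abstract gives no algorithmic detail.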

Related papers

Retinal IPA: Iterative KeyPoints Alignment for Multimodal Retinal Imaging [11.70130626541926]
We propose a novel framework for learning cross-modality features to enhance matching and registration across multi-modality retinal images. Our model draws on the success of previous learning-based feature detection and description methods. It is trained in a self-supervised manner by enforcing segmentation consistency between different augmentations of the same image.
arXiv Detail & Related papers (2024-07-25T19:51:27Z)
Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis [0.38366697175402226]
Pooling layers overlook important information encoded in the spatial arrangement of pixel intensity and/or feature values. We propose a novel lacunarity pooling layer that aims to capture the spatial heterogeneity of the feature maps by evaluating the variability within local windows. The lacunarity pooling layer can be seamlessly integrated into any artificial neural network architecture.
arXiv Detail & Related papers (2024-04-25T00:34:52Z)
Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes for the L2 metric without needing to generate pairs. We evaluate the proposed objective in few-shot and full-set training on the CBIR task, using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
Progressive Meta-Pooling Learning for Lightweight Image Classification Model [20.076610051602618]
We propose the Meta-Pooling framework to make the receptive field learnable for a lightweight network. We present a Progressive Meta-Pooling Learning (PMPL) strategy for the parameterized spatial enhancer to acquire a suitable receptive field size. Results on the ImageNet dataset show that MobileNetV2 with Meta-Pooling achieves a top-1 accuracy of 74.6%, outperforming the baseline MobileNetV2 by 2.3%.
arXiv Detail & Related papers (2023-01-24T14:28:05Z)
Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling [101.72318949104627]
We propose a novel framework of hierarchical convolutional neural networks (HS-CNNs) with a lifting structure to learn adaptive spherical wavelets for pooling and unpooling. LiftHS-CNN ensures a more efficient hierarchical feature learning for both image- and pixel-level tasks.
arXiv Detail & Related papers (2022-05-31T07:23:42Z)
Revisiting Pooling through the Lens of Optimal Transport [25.309212446782684]
We develop a novel and solid algorithmic pooling framework through the lens of optimal transport. We make the parameters of the unbalanced optimal transport (UOT) problem learnable and, accordingly, propose a generalized pooling layer called UOT-Pooling for neural networks. We test our UOT-Pooling layers in two application scenarios: multi-instance learning (MIL) and graph embedding.
arXiv Detail & Related papers (2022-01-23T06:20:39Z)
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling [82.08631594071656]
Pooling layers are essential building blocks of Convolutional Neural Networks (CNNs). We propose an adaptive and exponentially weighted pooling method named adaPool. We demonstrate how adaPool improves the preservation of detail through a range of tasks including image and video classification and object detection.
arXiv Detail & Related papers (2021-11-01T08:50:37Z)
MetaGater: Fast Learning of Conditional Channel Gated Networks via Federated Meta-Learning [46.79356071007187]
We propose a holistic approach to jointly train the backbone network and the channel gating. We develop a federated meta-learning approach to jointly learn good meta-initializations for both backbone networks and gating modules.
arXiv Detail & Related papers (2020-11-25T04:26:23Z)
Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net. To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network. Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match. The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z)
Modality Compensation Network: Cross-Modal Adaptation for Action Recognition [77.24983234113957]
We propose a Modality Compensation Network (MCN) to explore the relationships of different modalities. Our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning. Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
arXiv Detail & Related papers (2020-01-31T04:51:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.