Affine-Transformation-Invariant Image Classification by Differentiable
Arithmetic Distribution Module
- URL: http://arxiv.org/abs/2309.00752v2
- Date: Tue, 12 Dec 2023 20:10:04 GMT
- Authors: Zijie Tan, Guanfang Dong, Chenqiu Zhao, Anup Basu
- Abstract summary: Convolutional Neural Networks (CNNs) have achieved promising results in image classification.
CNNs are vulnerable to affine transformations including rotation, translation, flip and shuffle.
In this work, we introduce a more robust substitute by incorporating distribution learning techniques.
- Score: 8.125023712173686
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Although Convolutional Neural Networks (CNNs) have achieved promising results
in image classification, they are still vulnerable to affine transformations
including rotation, translation, flip and shuffle. This drawback motivates us to
design a module that can alleviate the impact of different affine
transformations. Thus, in this work, we introduce a more robust substitute by
incorporating distribution learning techniques, focusing particularly on
learning the spatial distribution information of pixels in images. To rectify
the non-differentiability of prior distribution learning methods that
rely on traditional histograms, we adopt Kernel Density Estimation (KDE) to
formulate differentiable histograms. On this foundation, we present a novel
Differentiable Arithmetic Distribution Module (DADM), which is designed to
extract the intrinsic probability distributions from images. The proposed
approach is able to enhance the model's robustness to affine transformations
without sacrificing its feature extraction capabilities, thus bridging the gap
between traditional CNNs and distribution-based learning. We validate the
effectiveness of the proposed approach through an ablation study and comparative
experiments with LeNet.
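The idea of a KDE-based differentiable histogram can be illustrated with a small sketch. This is not the authors' DADM implementation: the function name `soft_histogram`, the Gaussian kernel, the bin count and the bandwidth are all assumptions chosen for illustration. Each pixel contributes a smooth weight to every bin centre, so the output varies smoothly with the pixel values, and because the histogram ignores pixel positions it is unchanged by any transformation that only reorders pixels (for example a flip or a shuffle).

```python
import math
import random


def soft_histogram(pixels, num_bins=8, bandwidth=0.05):
    """Gaussian-kernel (KDE-style) histogram of intensities in [0, 1].

    Hypothetical helper for illustration only. Every pixel adds a smooth
    weight to every bin centre, so the result is a differentiable
    function of the pixel values, unlike a hard-binned histogram.
    """
    centres = [(k + 0.5) / num_bins for k in range(num_bins)]
    hist = [
        sum(math.exp(-0.5 * ((p - c) / bandwidth) ** 2) for p in pixels)
        for c in centres
    ]
    total = sum(hist)
    return [h / total for h in hist]  # normalise so the bins sum to 1


def max_abs_diff(a, b):
    """Largest per-bin difference between two histograms."""
    return max(abs(x - y) for x, y in zip(a, b))


random.seed(0)
image = [random.random() for _ in range(64)]  # flattened 8x8 toy image

flipped = image[::-1]        # a flip only reorders the pixels
shuffled = image[:]
random.shuffle(shuffled)     # so does a random shuffle

nudged = image[:]
nudged[0] += 1e-4            # tiny intensity change on one pixel

h = soft_histogram(image)
print("flip difference:   ", max_abs_diff(h, soft_histogram(flipped)))
print("shuffle difference:", max_abs_diff(h, soft_histogram(shuffled)))
print("nudge difference:  ", max_abs_diff(h, soft_histogram(nudged)))
```

The flip and shuffle differences should be zero up to floating-point rounding, while the nudge produces a small non-zero change, which is the smooth response that gradient-based training needs. The invariance is exact only for transformations that merely reorder pixels; rotations by arbitrary angles or translations that resample the image or move content out of frame would change the pixel values themselves.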
Related papers
- Invariant Shape Representation Learning For Image Classification [41.610264291150706]
In this paper, we introduce a novel framework that, for the first time, develops invariant shape representation learning (ISRL).
Our model ISRL is designed to jointly capture invariant features in latent shape spaces parameterized by deformable transformations.
By embedding the features that are invariant with regard to target variables in different environments, our model consistently offers more accurate predictions.
arXiv Detail & Related papers (2024-11-19T03:39:43Z)
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances global feature representation of point cloud mask autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z)
- SO(2) and O(2) Equivariance in Image Recognition with Bessel-Convolutional Neural Networks [63.24965775030674]
This work presents the development of Bessel-convolutional neural networks (B-CNNs).
B-CNNs exploit a particular decomposition based on Bessel functions to modify the key operation between images and filters.
A study is carried out to assess the performance of B-CNNs compared to other methods.
arXiv Detail & Related papers (2023-04-18T18:06:35Z)
- Federated Variational Inference Methods for Structured Latent Variable Models [1.0312968200748118]
Federated learning methods enable model training across distributed data sources without data leaving their original locations.
We present a general and elegant solution based on structured variational inference, widely used in Bayesian machine learning.
We also provide a communication-efficient variant analogous to the canonical FedAvg algorithm.
arXiv Detail & Related papers (2023-02-07T08:35:04Z)
- Modelling nonlinear dependencies in the latent space of inverse scattering [1.5990720051907859]
In the inverse scattering problem proposed by Angles and Mallat, a deep neural network is trained to invert the scattering transform applied to an image.
After such a network is trained, it can be used as a generative model given that we can sample from the distribution of principal components of scattering coefficients.
Within this paper, two such models are explored, namely a Variational AutoEncoder and a Generative Adversarial Network.
arXiv Detail & Related papers (2022-03-19T12:07:43Z)
- Designing Rotationally Invariant Neural Networks from PDEs and Variational Methods [8.660429288575367]
We investigate how diffusion and variational models achieve rotation invariance and transfer these ideas to neural networks.
We propose activation functions which couple network channels by combining information from several oriented filters.
Our findings help to translate diffusion and variational models into mathematically well-founded network architectures, and provide novel concepts for model-based CNN design.
arXiv Detail & Related papers (2021-08-31T17:34:40Z)
- Weakly supervised segmentation with cross-modality equivariant constraints [7.757293476741071]
Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation.
We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs.
Our approach outperforms relevant recent literature under the same learning conditions.
arXiv Detail & Related papers (2021-04-06T13:14:20Z)
- Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
- Semantic Change Detection with Asymmetric Siamese Networks [71.28665116793138]
Given two aerial images, semantic change detection aims to locate the land-cover variations and identify their change types with pixel-wise boundaries.
This problem is vital in many earth vision related tasks, such as precise urban planning and natural resource management.
We present an asymmetric siamese network (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures.
arXiv Detail & Related papers (2020-10-12T13:26:30Z)
- Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training.
arXiv Detail & Related papers (2020-09-18T17:52:34Z)
- Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)