HoughToRadon Transform: New Neural Network Layer for Features
Improvement in Projection Space
- URL: http://arxiv.org/abs/2402.02946v1
- Date: Mon, 5 Feb 2024 12:19:16 GMT
- Title: HoughToRadon Transform: New Neural Network Layer for Features
Improvement in Projection Space
- Authors: Alexandra Zhabitskaya, Alexander Sheshkus, and Vladimir L. Arlazarov
- Abstract summary: The HoughToRadon Transform layer is a novel layer designed to speed up neural networks that incorporate the Hough Transform.
Our experiments on the open MIDV-500 dataset show that this new approach saves time and achieves state-of-the-art 97.7% accuracy.
- Score: 83.88591755871734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce the HoughToRadon Transform layer, a novel layer
designed to speed up neural networks that incorporate the Hough Transform to solve
semantic image segmentation problems. Placed after a Hough Transform layer, it supplies
the "inner" convolutions with modified feature maps that have beneficial new properties,
such as a smaller processed image area and parameter-space linearity in angle and shift,
properties that the Hough Transform alone does not provide. Furthermore, the HoughToRadon
Transform layer lets us adjust the size of the intermediate feature maps through two new
parameters, so the speed and quality of the resulting neural network can be balanced. Our
experiments on the open MIDV-500 dataset show that this new approach saves time in
document segmentation tasks and achieves state-of-the-art 97.7% accuracy, outperforming
the more computationally complex HoughEncoder.
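The paper itself does not ship code; the following is a minimal, non-authoritative PyTorch sketch of the kind of resampling such a layer could perform, assuming the incoming Hough map is indexed by (intercept x0, slope t) and the target Radon-style map by (shift rho, angle theta). The class name, the coordinate mapping theta = atan(t / H), rho = x0 * cos(theta), and the parameters n_angles / n_shifts are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch only: remap a Fast-Hough-Transform feature map, assumed to be
# indexed by (intercept x0, slope t), onto a Radon-style (shift rho, angle theta) grid.
# The mapping theta = atan(t / H), rho = x0 * cos(theta) and all names are assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class HoughToRadonResample(nn.Module):
    """Resamples a (B, C, n_x0, n_t) Hough map onto a (B, C, n_shifts, n_angles) grid."""

    def __init__(self, image_height: int, n_angles: int = 64, n_shifts: int = 64,
                 max_slope: float = 1.0):
        super().__init__()
        self.h = image_height        # height of the image fed to the Hough layer
        self.n_angles = n_angles     # output resolution along the angle axis
        self.n_shifts = n_shifts     # output resolution along the shift axis
        self.max_slope = max_slope   # slope range assumed to be covered by the Hough map

    def forward(self, hough: torch.Tensor) -> torch.Tensor:
        b, _, n_x0, _ = hough.shape
        theta_max = math.atan(self.max_slope)
        theta = torch.linspace(-theta_max, theta_max, self.n_angles, device=hough.device)
        rho = torch.linspace(0.0, n_x0 - 1.0, self.n_shifts, device=hough.device)
        rho_g, theta_g = torch.meshgrid(rho, theta, indexing="ij")
        # Invert the assumed mapping: slope t = H * tan(theta), intercept x0 = rho / cos(theta).
        t = self.h * torch.tan(theta_g)
        x0 = rho_g / torch.cos(theta_g)
        # Normalise source coordinates to [-1, 1] for grid_sample; the last tensor axis
        # (slope) is assumed to span [-max_slope * H, max_slope * H].
        grid_x = t / (self.max_slope * self.h)
        grid_y = 2.0 * x0 / (n_x0 - 1.0) - 1.0
        grid = torch.stack([grid_x, grid_y], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        # Bilinear resampling; (angle, shift) pairs outside the Hough map become zeros.
        return F.grid_sample(hough, grid, mode="bilinear",
                             padding_mode="zeros", align_corners=True)
```

In a network, such a layer would sit between the Hough Transform layer and the "inner" convolutions; n_shifts and n_angles play the role of the two size-controlling parameters mentioned in the abstract, trading intermediate feature-map size (and hence speed) against quality.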
Related papers
- Sub-token ViT Embedding via Stochastic Resonance Transformers [51.12001699637727]
Vision Transformer (ViT) architectures represent images as collections of high-dimensional vectorized tokens, each corresponding to a rectangular non-overlapping patch.
We propose a training-free method inspired by "stochastic resonance".
The resulting "Stochastic Resonance Transformer" (SRT) retains the rich semantic information of the original representation, but grounds it on a finer-scale spatial domain, partly mitigating the coarse effect of spatial tokenization.
arXiv Detail & Related papers (2023-10-06T01:53:27Z)
- Multichannel Orthogonal Transform-Based Perceptron Layers for Efficient ResNets
We propose a set of transform-based neural network layers as an alternative to the $3\times3$ Conv2D layers in CNNs.
The proposed layers can be implemented based on transforms such as the Discrete Cosine Transform (DCT), the Hadamard transform (HT), and the biorthogonal Block Wavelet Transform (BWT); a toy sketch of this transform-domain recipe follows this entry.
arXiv Detail & Related papers (2023-03-13T01:07:32Z)
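As a rough, non-authoritative illustration of the shared recipe behind such layers (fixed orthogonal transform, cheap learnable elementwise operation in the transform domain, inverse transform), here is a minimal PyTorch sketch using a 2-D Walsh-Hadamard transform over the spatial dimensions; the layer name and the plain per-coefficient scaling are assumptions, not the paper's construction.

```python
# Hypothetical sketch only: the general transform-domain recipe (fixed orthogonal
# transform -> learnable elementwise operation -> inverse transform), realised here
# with a 2-D Walsh-Hadamard transform over the spatial dimensions of a feature map.
import torch
import torch.nn as nn


def hadamard_matrix(n: int) -> torch.Tensor:
    """Sylvester-construction Hadamard matrix; n must be a power of two."""
    h = torch.ones(1, 1)
    while h.shape[0] < n:
        h = torch.cat([torch.cat([h, h], dim=1),
                       torch.cat([h, -h], dim=1)], dim=0)
    return h


class HadamardBlockLayer(nn.Module):
    """HT over (H, W), learnable per-coefficient scaling, inverse HT."""

    def __init__(self, height: int, width: int):
        super().__init__()
        assert height & (height - 1) == 0 and width & (width - 1) == 0, \
            "spatial dims must be powers of two for the Hadamard transform"
        self.register_buffer("hy", hadamard_matrix(height) / height ** 0.5)  # orthonormal
        self.register_buffer("hx", hadamard_matrix(width) / width ** 0.5)
        self.scale = nn.Parameter(torch.ones(height, width))  # learnable spectral weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) with a fixed, power-of-two spatial size
        y = torch.einsum("ih,bchw,wj->bcij", self.hy, x, self.hx)      # forward 2-D HT
        y = y * self.scale                                             # cheap op in transform domain
        return torch.einsum("hi,bcij,jw->bchw", self.hy, y, self.hx)  # inverse HT
```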
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among the weights of convolutional networks; a toy gradient-scaling sketch follows this entry.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
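A minimal sketch of the gradient-scaling idea (not the paper's method for choosing the scalings): identity in the forward pass, per-kernel-position rescaling of the weight gradient in the backward pass, with a purely hypothetical Gaussian mask.

```python
# Hypothetical sketch only: spatial gradient scaling as a custom autograd function
# that leaves the forward pass untouched and rescales the conv-weight gradient per
# kernel position in the backward pass. The Gaussian mask below is purely illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class _ScaleGrad(torch.autograd.Function):
    @staticmethod
    def forward(ctx, weight, mask):
        ctx.save_for_backward(mask)
        return weight.clone()                      # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        (mask,) = ctx.saved_tensors
        return grad_output * mask, None            # rescaled gradient; no grad for the mask


class GradScaledConv2d(nn.Conv2d):
    """Conv2d whose weight gradients are rescaled per spatial kernel position."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        kh, kw = self.kernel_size
        yy, xx = torch.meshgrid(torch.linspace(-1, 1, kh),
                                torch.linspace(-1, 1, kw), indexing="ij")
        # Illustrative mask: emphasise the kernel centre, damp the borders.
        self.register_buffer("grad_mask", torch.exp(-(yy ** 2 + xx ** 2)).view(1, 1, kh, kw))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = _ScaleGrad.apply(self.weight, self.grad_mask)
        return F.conv2d(x, w, self.bias, self.stride, self.padding,
                        self.dilation, self.groups)
```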
- Feature-level augmentation to improve robustness of deep neural networks to affine transformations [22.323625542814284]
Recent studies revealed that convolutional neural networks do not generalize well to small image transformations.
We propose to introduce data augmentation at intermediate layers of the neural architecture.
This develops the capacity of the neural network to cope with such transformations; a minimal feature-map augmentation sketch follows this entry.
arXiv Detail & Related papers (2022-02-10T17:14:58Z)
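A minimal sketch of feature-level augmentation, assuming small random affine warps applied to an intermediate feature map during training; the parameter ranges and the module name are illustrative, not the paper's recipe.

```python
# Hypothetical sketch only: feature-level augmentation that applies a small random
# affine warp (rotation + translation) to an intermediate feature map during training.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAffineAugment(nn.Module):
    def __init__(self, max_deg: float = 5.0, max_shift: float = 0.05):
        super().__init__()
        self.max_deg = max_deg       # rotation range in degrees (illustrative default)
        self.max_shift = max_shift   # translation range as a fraction of the map size

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        if not self.training:        # augmentation is a no-op at inference time
            return feats
        b = feats.shape[0]
        ang = (torch.rand(b, device=feats.device) * 2 - 1) * math.radians(self.max_deg)
        tx = (torch.rand(b, device=feats.device) * 2 - 1) * 2 * self.max_shift
        ty = (torch.rand(b, device=feats.device) * 2 - 1) * 2 * self.max_shift
        cos, sin = torch.cos(ang), torch.sin(ang)
        theta = torch.stack([torch.stack([cos, -sin, tx], dim=1),
                             torch.stack([sin, cos, ty], dim=1)], dim=1)  # (B, 2, 3)
        grid = F.affine_grid(theta, list(feats.shape), align_corners=False)
        return F.grid_sample(feats, grid, padding_mode="border", align_corners=False)
```

Dropped between two convolutional blocks, e.g. nn.Sequential(block1, FeatureAffineAugment(), block2), it perturbs the representation only during training.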
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, the Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
- Progressive Encoding for Neural Optimization [92.55503085245304]
We show the competence of the PPE layer for mesh transfer and its advantages compared to contemporary surface mapping techniques.
Most importantly, our technique is a parameterization-free method, and thus applicable to a variety of target shape representations.
arXiv Detail & Related papers (2021-04-19T08:22:55Z)
- Convolutional Hough Matching Networks [39.524998833064956]
We introduce a Hough transform perspective on convolutional matching and propose an effective geometric matching algorithm, dubbed Convolutional Hough Matching (CHM).
We cast it into a trainable neural layer with a semi-isotropic high-dimensional kernel, which learns non-rigid matching with a small number of interpretable parameters.
Our method sets a new state of the art on standard benchmarks for semantic visual correspondence, proving its strong robustness to challenging intra-class variations.
arXiv Detail & Related papers (2021-03-31T06:17:03Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network [0.0]
In this paper, we suggest a new neural network architecture for vanishing point detection in images.
The key element is the use of the direct and transposed Fast Hough Transforms separated by convolutional layer blocks with standard activation functions; a toy projection/back-projection sketch follows this entry.
arXiv Detail & Related papers (2020-02-04T09:10:45Z)
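A minimal sketch of the projection, convolutions, back-projection pattern this entry describes, with a plain differentiable Radon-style projection (rotate, then sum columns) standing in for the Fast Hough Transform; the angle sampling and helper names are assumptions.

```python
# Hypothetical sketch only: the projection -> convolutions -> back-projection pattern,
# with a plain differentiable Radon-style projection (rotate, then sum along columns)
# standing in for the Fast Hough Transform. Angle sampling and names are assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def rotate(x: torch.Tensor, angle: float) -> torch.Tensor:
    """Rotate a (B, C, H, W) tensor by `angle` radians about its centre."""
    cos, sin = math.cos(angle), math.sin(angle)
    theta = x.new_tensor([[cos, -sin, 0.0], [sin, cos, 0.0]]).expand(x.shape[0], 2, 3)
    grid = F.affine_grid(theta, list(x.shape), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)


class RadonProjection(nn.Module):
    """Direct projection to (angle, position) space and an approximate transposed op."""

    def __init__(self, n_angles: int = 32):
        super().__init__()
        self.angles = [i * math.pi / n_angles for i in range(n_angles)]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, C, n_angles, W): one column-sum profile per angle
        return torch.stack([rotate(x, a).sum(dim=2) for a in self.angles], dim=2)

    def transposed(self, sino: torch.Tensor, height: int) -> torch.Tensor:
        # Back-projection: smear each profile along the summed axis and rotate it back.
        out = torch.zeros(sino.shape[0], sino.shape[1], height, sino.shape[3],
                          device=sino.device, dtype=sino.dtype)
        for i, a in enumerate(self.angles):
            smeared = sino[:, :, i, :].unsqueeze(2).expand(-1, -1, height, -1).contiguous()
            out = out + rotate(smeared, -a)
        return out / len(self.angles)
```

Convolutional blocks with standard activations can then operate on the (B, C, n_angles, W) projection-space map before transposed() returns to image space, mirroring the direct/transposed arrangement this entry describes.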
- The problems with using STNs to align CNN feature maps [0.0]
We argue that spatial transformer networks (STNs) cannot align the feature maps of a transformed image with those of its original.
We advocate taking advantage of more complex features in deeper layers by instead sharing parameters between the classification and the localisation network.
arXiv Detail & Related papers (2020-01-14T12:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.