HoughToRadon Transform: New Neural Network Layer for Features
Improvement in Projection Space
- URL: http://arxiv.org/abs/2402.02946v1
- Date: Mon, 5 Feb 2024 12:19:16 GMT
- Title: HoughToRadon Transform: New Neural Network Layer for Features
Improvement in Projection Space
- Authors: Alexandra Zhabitskaya, Alexander Sheshkus, and Vladimir L. Arlazarov
- Abstract summary: The HoughToRadon Transform layer is a novel layer designed to speed up neural networks that incorporate the Hough Transform.
Our experiments on the open MIDV-500 dataset show that this new approach saves time and achieves state-of-the-art 97.7% accuracy.
- Score: 83.88591755871734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce the HoughToRadon Transform layer, a novel layer
designed to speed up neural networks that incorporate the Hough Transform to solve
semantic image segmentation problems. Placed after a Hough Transform layer, it supplies
the "inner" convolutions with modified feature maps that have beneficial new properties,
such as a smaller processed image area and parameter-space linearity in angle and shift,
properties that the Hough Transform alone does not provide. Furthermore, the HoughToRadon
Transform layer lets us adjust the size of the intermediate feature maps through two new
parameters, so the speed and quality of the resulting neural network can be balanced. Our
experiments on the open MIDV-500 dataset show that this new approach saves time in
document segmentation tasks and achieves state-of-the-art 97.7% accuracy, outperforming
the more computationally complex HoughEncoder.
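The paper itself does not ship code; the following is a minimal, non-authoritative PyTorch sketch of the kind of resampling such a layer could perform, assuming the incoming Hough map is indexed by (intercept x0, slope t) and the target Radon-style map by (shift rho, angle theta). The class name, the coordinate mapping theta = atan(t / H), rho = x0 * cos(theta), and the parameters n_angles / n_shifts are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch only: remap a Fast-Hough-Transform feature map, assumed to be
# indexed by (intercept x0, slope t), onto a Radon-style (shift rho, angle theta) grid.
# The mapping theta = atan(t / H), rho = x0 * cos(theta) and all names are assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class HoughToRadonResample(nn.Module):
    """Resamples a (B, C, n_x0, n_t) Hough map onto a (B, C, n_shifts, n_angles) grid."""

    def __init__(self, image_height: int, n_angles: int = 64, n_shifts: int = 64,
                 max_slope: float = 1.0):
        super().__init__()
        self.h = image_height        # height of the image fed to the Hough layer
        self.n_angles = n_angles     # output resolution along the angle axis
        self.n_shifts = n_shifts     # output resolution along the shift axis
        self.max_slope = max_slope   # slope range assumed to be covered by the Hough map

    def forward(self, hough: torch.Tensor) -> torch.Tensor:
        b, _, n_x0, _ = hough.shape
        theta_max = math.atan(self.max_slope)
        theta = torch.linspace(-theta_max, theta_max, self.n_angles, device=hough.device)
        rho = torch.linspace(0.0, n_x0 - 1.0, self.n_shifts, device=hough.device)
        rho_g, theta_g = torch.meshgrid(rho, theta, indexing="ij")
        # Invert the assumed mapping: slope t = H * tan(theta), intercept x0 = rho / cos(theta).
        t = self.h * torch.tan(theta_g)
        x0 = rho_g / torch.cos(theta_g)
        # Normalise source coordinates to [-1, 1] for grid_sample; the last tensor axis
        # (slope) is assumed to span [-max_slope * H, max_slope * H].
        grid_x = t / (self.max_slope * self.h)
        grid_y = 2.0 * x0 / (n_x0 - 1.0) - 1.0
        grid = torch.stack([grid_x, grid_y], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        # Bilinear resampling; (angle, shift) pairs outside the Hough map become zeros.
        return F.grid_sample(hough, grid, mode="bilinear",
                             padding_mode="zeros", align_corners=True)
```

In a network, such a layer would sit between the Hough Transform layer and the "inner" convolutions; n_shifts and n_angles play the role of the two size-controlling parameters mentioned in the abstract, trading intermediate feature-map size (and hence speed) against quality.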
Related papers
- Sub-token ViT Embedding via Stochastic Resonance Transformers [51.12001699637727]
Vision Transformer (ViT) architectures represent images as collections of high-dimensional vectorized tokens, each corresponding to a rectangular non-overlapping patch.
We propose a training-free method inspired by "stochastic resonance".
The resulting "Stochastic Resonance Transformer" (SRT) retains the rich semantic information of the original representation, but grounds it on a finer-scale spatial domain, partly mitigating the coarse effect of spatial tokenization.
arXiv Detail & Related papers (2023-10-06T01:53:27Z)
- Multichannel Orthogonal Transform-Based Perceptron Layers for Efficient ResNets
We propose a set of transform-based neural network layers as an alternative to the $3\times3$ Conv2D layers in CNNs.
The proposed layers can be implemented based on transforms such as the Discrete Cosine Transform (DCT), the Hadamard transform (HT), and the biorthogonal Block Wavelet Transform (BWT); a toy sketch of this transform-domain recipe follows this entry.
arXiv Detail & Related papers (2023-03-13T01:07:32Z)
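As a rough, non-authoritative illustration of the shared recipe behind such layers (fixed orthogonal transform, cheap learnable elementwise operation in the transform domain, inverse transform), here is a minimal PyTorch sketch using a 2-D Walsh-Hadamard transform over the spatial dimensions; the layer name and the plain per-coefficient scaling are assumptions, not the paper's construction.

```python
# Hypothetical sketch only: the general transform-domain recipe (fixed orthogonal
# transform -> learnable elementwise operation -> inverse transform), realised here
# with a 2-D Walsh-Hadamard transform over the spatial dimensions of a feature map.
import torch
import torch.nn as nn


def hadamard_matrix(n: int) -> torch.Tensor:
    """Sylvester-construction Hadamard matrix; n must be a power of two."""
    h = torch.ones(1, 1)
    while h.shape[0] < n:
        h = torch.cat([torch.cat([h, h], dim=1),
                       torch.cat([h, -h], dim=1)], dim=0)
    return h


class HadamardBlockLayer(nn.Module):
    """HT over (H, W), learnable per-coefficient scaling, inverse HT."""

    def __init__(self, height: int, width: int):
        super().__init__()
        assert height & (height - 1) == 0 and width & (width - 1) == 0, \
            "spatial dims must be powers of two for the Hadamard transform"
        self.register_buffer("hy", hadamard_matrix(height) / height ** 0.5)  # orthonormal
        self.register_buffer("hx", hadamard_matrix(width) / width ** 0.5)
        self.scale = nn.Parameter(torch.ones(height, width))  # learnable spectral weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) with a fixed, power-of-two spatial size
        y = torch.einsum("ih,bchw,wj->bcij", self.hy, x, self.hx)      # forward 2-D HT
        y = y * self.scale                                             # cheap op in transform domain
        return torch.einsum("hi,bcij,jw->bchw", self.hy, y, self.hx)  # inverse HT
```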
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among the weights of convolutional networks; a toy gradient-scaling sketch follows this entry.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
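A minimal sketch of the gradient-scaling idea (not the paper's method for choosing the scalings): identity in the forward pass, per-kernel-position rescaling of the weight gradient in the backward pass, with a purely hypothetical Gaussian mask.

```python
# Hypothetical sketch only: spatial gradient scaling as a custom autograd function
# that leaves the forward pass untouched and rescales the conv-weight gradient per
# kernel position in the backward pass. The Gaussian mask below is purely illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class _ScaleGrad(torch.autograd.Function):
    @staticmethod
    def forward(ctx, weight, mask):
        ctx.save_for_backward(mask)
        return weight.clone()                      # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        (mask,) = ctx.saved_tensors
        return grad_output * mask, None            # rescaled gradient; no grad for the mask


class GradScaledConv2d(nn.Conv2d):
    """Conv2d whose weight gradients are rescaled per spatial kernel position."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        kh, kw = self.kernel_size
        yy, xx = torch.meshgrid(torch.linspace(-1, 1, kh),
                                torch.linspace(-1, 1, kw), indexing="ij")
        # Illustrative mask: emphasise the kernel centre, damp the borders.
        self.register_buffer("grad_mask", torch.exp(-(yy ** 2 + xx ** 2)).view(1, 1, kh, kw))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = _ScaleGrad.apply(self.weight, self.grad_mask)
        return F.conv2d(x, w, self.bias, self.stride, self.padding,
                        self.dilation, self.groups)
```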
- Feature-level augmentation to improve robustness of deep neural networks to affine transformations [22.323625542814284]
Recent studies revealed that convolutional neural networks do not generalize well to small image transformations.
We propose to introduce data augmentation at intermediate layers of the neural architecture.
This develops the capacity of the neural network to cope with such transformations; a minimal feature-map augmentation sketch follows this entry.
arXiv Detail & Related papers (2022-02-10T17:14:58Z)
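A minimal sketch of feature-level augmentation, assuming small random affine warps applied to an intermediate feature map during training; the parameter ranges and the module name are illustrative, not the paper's recipe.

```python
# Hypothetical sketch only: feature-level augmentation that applies a small random
# affine warp (rotation + translation) to an intermediate feature map during training.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAffineAugment(nn.Module):
    def __init__(self, max_deg: float = 5.0, max_shift: float = 0.05):
        super().__init__()
        self.max_deg = max_deg       # rotation range in degrees (illustrative default)
        self.max_shift = max_shift   # translation range as a fraction of the map size

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        if not self.training:        # augmentation is a no-op at inference time
            return feats
        b = feats.shape[0]
        ang = (torch.rand(b, device=feats.device) * 2 - 1) * math.radians(self.max_deg)
        tx = (torch.rand(b, device=feats.device) * 2 - 1) * 2 * self.max_shift
        ty = (torch.rand(b, device=feats.device) * 2 - 1) * 2 * self.max_shift
        cos, sin = torch.cos(ang), torch.sin(ang)
        theta = torch.stack([torch.stack([cos, -sin, tx], dim=1),
                             torch.stack([sin, cos, ty], dim=1)], dim=1)  # (B, 2, 3)
        grid = F.affine_grid(theta, list(feats.shape), align_corners=False)
        return F.grid_sample(feats, grid, padding_mode="border", align_corners=False)
```

Dropped between two convolutional blocks, e.g. nn.Sequential(block1, FeatureAffineAugment(), block2), it perturbs the representation only during training.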
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, the Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
- Progressive Encoding for Neural Optimization [92.55503085245304]
We show the competence of the PPE layer for mesh transfer and its advantages compared to contemporary surface mapping techniques.
Most importantly, our technique is a parameterization-free method, and thus applicable to a variety of target shape representations.
arXiv Detail & Related papers (2021-04-19T08:22:55Z)
- Convolutional Hough Matching Networks [39.524998833064956]
We introduce a Hough transform perspective on convolutional matching and propose an effective geometric matching algorithm, dubbed Convolutional Hough Matching (CHM).
We cast it into a trainable neural layer with a semi-isotropic high-dimensional kernel, which learns non-rigid matching with a small number of interpretable parameters.
Our method sets a new state of the art on standard benchmarks for semantic visual correspondence, proving its strong robustness to challenging intra-class variations.
arXiv Detail & Related papers (2021-03-31T06:17:03Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network [0.0]
In this paper, we suggest a new neural network architecture for vanishing point detection in images.
The key element is the use of the direct and transposed Fast Hough Transforms separated by convolutional layer blocks with standard activation functions; a toy projection/back-projection sketch follows this entry.
arXiv Detail & Related papers (2020-02-04T09:10:45Z)
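A minimal sketch of the projection, convolutions, back-projection pattern this entry describes, with a plain differentiable Radon-style projection (rotate, then sum columns) standing in for the Fast Hough Transform; the angle sampling and helper names are assumptions.

```python
# Hypothetical sketch only: the projection -> convolutions -> back-projection pattern,
# with a plain differentiable Radon-style projection (rotate, then sum along columns)
# standing in for the Fast Hough Transform. Angle sampling and names are assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def rotate(x: torch.Tensor, angle: float) -> torch.Tensor:
    """Rotate a (B, C, H, W) tensor by `angle` radians about its centre."""
    cos, sin = math.cos(angle), math.sin(angle)
    theta = x.new_tensor([[cos, -sin, 0.0], [sin, cos, 0.0]]).expand(x.shape[0], 2, 3)
    grid = F.affine_grid(theta, list(x.shape), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)


class RadonProjection(nn.Module):
    """Direct projection to (angle, position) space and an approximate transposed op."""

    def __init__(self, n_angles: int = 32):
        super().__init__()
        self.angles = [i * math.pi / n_angles for i in range(n_angles)]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, C, n_angles, W): one column-sum profile per angle
        return torch.stack([rotate(x, a).sum(dim=2) for a in self.angles], dim=2)

    def transposed(self, sino: torch.Tensor, height: int) -> torch.Tensor:
        # Back-projection: smear each profile along the summed axis and rotate it back.
        out = torch.zeros(sino.shape[0], sino.shape[1], height, sino.shape[3],
                          device=sino.device, dtype=sino.dtype)
        for i, a in enumerate(self.angles):
            smeared = sino[:, :, i, :].unsqueeze(2).expand(-1, -1, height, -1).contiguous()
            out = out + rotate(smeared, -a)
        return out / len(self.angles)
```

Convolutional blocks with standard activations can then operate on the (B, C, n_angles, W) projection-space map before transposed() returns to image space, mirroring the direct/transposed arrangement this entry describes.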
- The problems with using STNs to align CNN feature maps [0.0]
We argue that spatial transformer networks (STNs) cannot align the feature maps of a transformed image with those of its original.
We advocate taking advantage of more complex features in deeper layers by instead sharing parameters between the classification and the localisation network.
arXiv Detail & Related papers (2020-01-14T12:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.