Related papers: PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation

URL: http://arxiv.org/abs/2411.01624v1
Date: Sun, 03 Nov 2024 16:26:55 GMT
Title: PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation
Authors: Xinyu Xu, Huazhen Liu, Huilin Xiong, Wenxian Yu, Tao Zhang
Abstract summary: In this paper, we mathematically construct the padding-based rotation equivariant convolution mode (PreCM). PreCM can be used not only for multi-scale images and convolution kernels, but also as a drop-in replacement for multiple kinds of convolution. Experiments show that PreCM-based networks achieve better segmentation performance than the original and data augmentation-based networks.
Score: 10.74841255987162
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic segmentation is an important branch of image processing and computer vision. With the popularity of deep learning, various deep semantic segmentation networks have been proposed for pixel-level classification and segmentation tasks. However, imaging angles are often arbitrary in the real world, as in water body images in remote sensing and capillary and polyp images in the medical field, and we usually cannot obtain prior orientation information to guide these networks to extract more effective features. Additionally, learning the features of objects with multiple orientations is challenging, as most CNN-based semantic segmentation networks lack the rotation equivariance needed to resist the disturbance from orientation information. To address this, in this paper, we first establish a universal convolution-group framework to more fully utilize the orientation information and make the networks rotation equivariant. Then, we mathematically construct the padding-based rotation equivariant convolution mode (PreCM), which can be used not only for multi-scale images and convolution kernels, but also as a replacement component for multiple kinds of convolution, such as dilated convolution, transposed convolution, and variable stride convolution. To verify that rotation equivariance is achieved, a new evaluation metric named rotation difference (RD) is finally proposed. Experiments carried out on the datasets Satellite Images of Water Bodies, DRIVE and Floodnet show that the PreCM-based networks achieve better segmentation performance than the original and data augmentation-based networks. In terms of the average RD value, the former is 0%, while the latter two are 7.0503% and 3.2606%, respectively. Last but not least, PreCM also effectively enhances the robustness of networks to rotation perturbations.
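
The abstract does not spell out how rotation difference (RD) is computed, so the following is only a minimal sketch of the general idea of checking rotation equivariance for a segmentation function. The helper name `rotation_disagreement`, the two toy segmenters, and the restriction to 90-degree rotations are all assumptions made for illustration; this is not the paper's RD definition or the PreCM implementation.

```python
import numpy as np

def rotation_disagreement(segment, image, k=1):
    """Percent of pixels where segment(rot(image)) != rot(segment(image)).

    Hypothetical helper for illustration; NOT the RD metric from the paper.
    `k` is the number of 90-degree rotations applied by np.rot90.
    """
    rotated_then_segmented = segment(np.rot90(image, k))
    segmented_then_rotated = np.rot90(segment(image), k)
    return 100.0 * np.mean(rotated_then_segmented != segmented_then_rotated)

def threshold_segmenter(image):
    # Pointwise rule: it commutes with any rotation of the pixel grid.
    return image > 0.5

def left_neighbor_segmenter(image):
    # Compares each pixel with its left neighbor, so it depends on direction.
    return image > np.roll(image, 1, axis=1)

rng = np.random.default_rng(0)
image = rng.random((64, 64))

eq = rotation_disagreement(threshold_segmenter, image)
non_eq = rotation_disagreement(left_neighbor_segmenter, image)
print(f"pointwise segmenter disagreement:   {eq:.4f}%")
print(f"directional segmenter disagreement: {non_eq:.4f}%")
```

An equivariant segmenter gives the same mask whether the rotation is applied before or after segmentation, so its disagreement is zero; a direction-dependent one generally does not.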

Related papers

Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured [18.910817148765176]
This paper designs a set of new convolution operations that are naturally invariant to arbitrary rotations. We compare their performance with previous rotation-invariant convolutional neural networks (RI-CNNs). The results show that RIConvs significantly improve the accuracy of these CNN backbones, especially when the training data is limited.
arXiv Detail & Related papers (2024-04-17T12:21:57Z)
Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions. We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training. Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
Revisiting Data Augmentation for Rotational Invariance in Convolutional Neural Networks [0.29127054707887967]
We investigate how best to include rotational invariance in a CNN for image classification. Our experiments show that networks trained with data augmentation alone can classify rotated images nearly as well as in the normal unrotated case.
arXiv Detail & Related papers (2023-10-12T15:53:24Z)
Sorted Convolutional Network for Achieving Continuous Rotational Invariance [56.42518353373004]
We propose a Sorting Convolution (SC) inspired by some hand-crafted features of texture images. SC achieves continuous rotational invariance without requiring additional learnable parameters or data augmentation. Our results demonstrate that SC achieves the best performance in the aforementioned tasks.
arXiv Detail & Related papers (2023-05-23T18:37:07Z)
Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem. In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images. The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
Moving Frame Net: SE(3)-Equivariant Network for Volumes [0.0]
A rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. We significantly improve that approach by reducing the computation of moving frames to only one, at the input stage. Our trained model outperforms the benchmarks in medical volume classification on most of the tested datasets from MedMNIST3D.
arXiv Detail & Related papers (2022-11-07T10:25:38Z)
Omni-Seg+: A Scale-aware Dynamic Network for Pathological Image Segmentation [13.182646724406291]
The cross-sectional areas of glomeruli can be 64 times larger than that of peritubular capillaries. We propose the Omni-Seg+ network, a scale-aware dynamic neural network that achieves multi-object (six tissue types) and multi-scale (5X to 40X scale) pathological image segmentation.
arXiv Detail & Related papers (2022-06-27T21:09:55Z)
Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes. Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical of which is foreground-background imbalance. We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations. AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while running as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation [51.916106055115755]
We propose a new CNN architecture that is pose and scale invariant thanks to the use of a Spatial Transformer Network (STN). Our architecture is composed of three sequential modules that are estimated together during training. We test the proposed method in kidney and renal tumor segmentation on abdominal pediatric CT scanners.
arXiv Detail & Related papers (2021-07-06T14:50:03Z)
Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery [39.25541709228373]
We propose the rotation equivariant feature image pyramid network (REFIPN), an image pyramid network based on rotation equivariance convolution. The proposed pyramid network extracts features in a wide range of scales and orientations by using novel convolution filters. The detection performance of the proposed model is validated on two commonly used aerial benchmarks.
arXiv Detail & Related papers (2021-06-02T01:33:49Z)
FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation. We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
Learning Equivariant Representations [10.745691354609738]
Convolutional neural networks (CNNs) are successful examples of this principle. We propose equivariant models for different transformations defined by groups of symmetries. These models leverage symmetries in the data to reduce sample and model complexity and improve generalization performance.
arXiv Detail & Related papers (2020-12-04T18:46:17Z)
Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning [21.89786914625517]
We introduce a novel method for retrieving aerial images by merging group convolution with attention mechanism and metric learning. Results show that the proposed method performance exceeds other state-of-the-art retrieval methods in both rotated and original environments.
arXiv Detail & Related papers (2020-10-19T04:12:36Z)
A Rotation-Invariant Framework for Deep Point Cloud Analysis [132.91915346157018]
We introduce a new low-level purely rotation-invariant representation to replace common 3D Cartesian coordinates as the network inputs. Also, we present a network architecture to embed these representations into features, encoding local relations between points and their neighbors, and the global shape structure. We evaluate our method on multiple point cloud analysis tasks, including shape classification, part segmentation, and shape retrieval.
arXiv Detail & Related papers (2020-03-16T14:04:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.