PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation
- URL: http://arxiv.org/abs/2411.01624v1
- Date: Sun, 03 Nov 2024 16:26:55 GMT
- Title: PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation
- Authors: Xinyu Xu, Huazhen Liu, Huilin Xiong, Wenxian Yu, Tao Zhang,
- Abstract summary: In this paper, we numerically construct the padding-based rotation equivariant convolution mode (PreCM)
PreCM can be used not only for multi-scale images and convolution kernels, but also as a replacement component to replace multiple convolutions.
Experiments show that PreCM-based networks can achieve better segmentation performance than the original and data augmentation-based networks.
- Score: 10.74841255987162
- License:
- Abstract: Semantic segmentation is an important branch of image processing and computer vision. With the popularity of deep learning, various deep semantic segmentation networks have been proposed for pixel-level classification and segmentation tasks. However, the imaging angles are often arbitrary in real world, such as water body images in remote sensing, and capillary and polyp images in medical field, and we usually cannot obtain prior orientation information to guide these networks to extract more effective features. Additionally, learning the features of objects with multiple orientation information is also challenging, as most CNN-based semantic segmentation networks do not have rotation equivariance to resist the disturbance from orientation information. To address the same, in this paper, we first establish a universal convolution-group framework to more fully utilize the orientation information and make the networks rotation equivariant. Then, we mathematically construct the padding-based rotation equivariant convolution mode (PreCM), which can be used not only for multi-scale images and convolution kernels, but also as a replacement component to replace multiple convolutions, like dilated convolution, transposed convolution, variable stride convolution, etc. In order to verify the realization of rotation equivariance, a new evaluation metric named rotation difference (RD) is finally proposed. The experiments carried out on the datesets Satellite Images of Water Bodies, DRIVE and Floodnet show that the PreCM-based networks can achieve better segmentation performance than the original and data augmentation-based networks. In terms of the average RD value, the former is 0% and the latter two are respectively 7.0503% and 3.2606%. Last but not least, PreCM also effectively enhances the robustness of networks to rotation perturbations.
Related papers
- Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured [18.910817148765176]
This paper designs a set of new convolution operations that are natually invariant to arbitrary rotations.
We compare their performance with previous rotation-invariant convolutional neural networks (RI-CNNs)
The results show that RIConvs significantly improve the accuracy of these CNN backbones, especially when the training data is limited.
arXiv Detail & Related papers (2024-04-17T12:21:57Z) - Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z) - Revisiting Data Augmentation for Rotational Invariance in Convolutional
Neural Networks [0.29127054707887967]
We investigate how best to include rotational invariance in a CNN for image classification.
Our experiments show that networks trained with data augmentation alone can classify rotated images nearly as well as in the normal unrotated case.
arXiv Detail & Related papers (2023-10-12T15:53:24Z) - Sorted Convolutional Network for Achieving Continuous Rotational
Invariance [56.42518353373004]
We propose a Sorting Convolution (SC) inspired by some hand-crafted features of texture images.
SC achieves continuous rotational invariance without requiring additional learnable parameters or data augmentation.
Our results demonstrate that SC achieves the best performance in the aforementioned tasks.
arXiv Detail & Related papers (2023-05-23T18:37:07Z) - Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present Adaptive Rotated Convolution (ARC) module to handle rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose Adaptive Focus Framework (AF$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$ has significantly improved the accuracy on three widely used aerial benchmarks, as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z) - Rotation Equivariant Feature Image Pyramid Network for Object Detection
in Optical Remote Sensing Imagery [39.25541709228373]
We propose the rotation equivariant feature image pyramid network (REFIPN), an image pyramid network based on rotation equivariance convolution.
The proposed pyramid network extracts features in a wide range of scales and orientations by using novel convolution filters.
The detection performance of the proposed model is validated on two commonly used aerial benchmarks.
arXiv Detail & Related papers (2021-06-02T01:33:49Z) - CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image
Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for nowadays 3D medical image segmentation.
We propose a novel framework that efficiently bridges a bf Convolutional neural network and a bf Transformer bf (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z) - Rotation Invariant Aerial Image Retrieval with Group Convolutional
Metric Learning [21.89786914625517]
We introduce a novel method for retrieving aerial images by merging group convolution with attention mechanism and metric learning.
Results show that the proposed method performance exceeds other state-of-the-art retrieval methods in both rotated and original environments.
arXiv Detail & Related papers (2020-10-19T04:12:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.