Related papers: PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation

URL: http://arxiv.org/abs/2411.01624v2
Date: Wed, 30 Apr 2025 05:13:23 GMT
Title: PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation
Authors: Xinyu Xu, Huazhen Liu, Tao Zhang, Huilin Xiong, Wenxian Yu
Abstract summary: This paper introduces a universal convolution-group framework aimed at more fully utilizing orientation information. We then mathematically design a padding-based rotation equivariant convolution mode (PreCM). To quantitatively assess the impact of image rotation in semantic segmentation tasks, we also propose a new evaluation metric, Rotation Difference (RD).
Score: 10.748412559871621
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic segmentation is an important branch of image processing and computer vision. With the popularity of deep learning, various convolutional neural networks have been proposed for pixel-level classification and segmentation tasks. In practical scenarios, however, imaging angles are often arbitrary, as in water body images from remote sensing and capillary and polyp images in the medical domain, where prior orientation information is typically unavailable to guide these networks to extract more effective features. In this case, learning features from objects with diverse orientations poses a significant challenge, as the majority of CNN-based semantic segmentation networks lack the rotation equivariance needed to resist the disturbance from orientation information. To address this challenge, this paper first constructs a universal convolution-group framework aimed at more fully utilizing orientation information and equipping the network with rotation equivariance. Subsequently, we mathematically design a padding-based rotation equivariant convolution mode (PreCM), which is not only applicable to multi-scale images and convolutional kernels but can also serve as a replacement component for various types of convolutions, such as dilated convolutions, transposed convolutions, and asymmetric convolutions. To quantitatively assess the impact of image rotation in semantic segmentation tasks, we also propose a new evaluation metric, Rotation Difference (RD). Replacement experiments on six existing semantic segmentation networks and three datasets show that, under random-angle rotation, the average Intersection over Union (IoU) of the PreCM-based versions improves by 6.91%, 10.63%, 4.53%, 5.93%, 7.48%, and 8.33%, respectively, compared to the original versions, while the average RD values decrease by 3.58%, 4.56%, 3.47%, 3.66%, 3.47%, and 3.43%, respectively.
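As a rough illustration of what rotation equivariance means for a convolution, the sketch below checks whether filtering a 90-degree-rotated image gives the same result as rotating the filtered image. It is not the PreCM construction and does not compute the RD metric, neither of which is specified in this listing. The helper names `correlate_valid` and `equivariance_gap` are hypothetical, and the example assumes square inputs, "valid" cross-correlation without padding, and 90-degree rotations only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(image, kernel):
    """Plain 2D cross-correlation without padding ("valid" mode)."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijab,ab->ij", windows, kernel)


def equivariance_gap(image, kernel):
    """Max absolute difference between filter(rot90(x)) and rot90(filter(x))."""
    rotated_then_filtered = correlate_valid(np.rot90(image), kernel)
    filtered_then_rotated = np.rot90(correlate_valid(image, kernel))
    return float(np.max(np.abs(rotated_then_filtered - filtered_then_rotated)))


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))

# An arbitrary kernel: an ordinary convolution is generally not equivariant.
generic_kernel = rng.standard_normal((3, 3))

# Averaging a kernel over its four 90-degree rotations makes it symmetric
# under those rotations (an illustrative choice, not the paper's method).
symmetric_kernel = sum(np.rot90(generic_kernel, k) for k in range(4)) / 4.0

gap_generic = equivariance_gap(image, generic_kernel)
gap_symmetric = equivariance_gap(image, symmetric_kernel)

print(f"generic kernel gap:   {gap_generic:.3e}")
print(f"symmetric kernel gap: {gap_symmetric:.3e}")
```

The gap for the symmetric kernel should vanish up to floating-point error, while the generic kernel leaves a clearly nonzero gap. Constraining kernels this way is only the simplest route to equivariance and costs expressiveness.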

Related papers

Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured [18.910817148765176]
This paper designs a set of new convolution operations that are naturally invariant to arbitrary rotations. We compare their performance with previous rotation-invariant convolutional neural networks (RI-CNNs). The results show that RIConvs significantly improve the accuracy of these CNN backbones, especially when the training data is limited.
arXiv Detail & Related papers (2024-04-17T12:21:57Z)
Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions. We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training. Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
Revisiting Data Augmentation for Rotational Invariance in Convolutional Neural Networks [0.29127054707887967]
We investigate how best to include rotational invariance in a CNN for image classification. Our experiments show that networks trained with data augmentation alone can classify rotated images nearly as well as in the normal unrotated case.
arXiv Detail & Related papers (2023-10-12T15:53:24Z)
Sorted Convolutional Network for Achieving Continuous Rotational Invariance [56.42518353373004]
We propose a Sorting Convolution (SC) inspired by some hand-crafted features of texture images. SC achieves continuous rotational invariance without requiring additional learnable parameters or data augmentation. Our results demonstrate that SC achieves the best performance in the aforementioned tasks.
arXiv Detail & Related papers (2023-05-23T18:37:07Z)
Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem. In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images. The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
Moving Frame Net: SE(3)-Equivariant Network for Volumes [0.0]
A rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. We significantly improve that approach by reducing the computation of moving frames to only one, at the input stage. Our trained model outperforms the benchmarks in medical volume classification on most of the tested datasets from MedMNIST3D.
arXiv Detail & Related papers (2022-11-07T10:25:38Z)
Omni-Seg+: A Scale-aware Dynamic Network for Pathological Image Segmentation [13.182646724406291]
The cross-sectional areas of glomeruli can be 64 times larger than those of peritubular capillaries. We propose the Omni-Seg+ network, a scale-aware dynamic neural network that achieves multi-object (six tissue types) and multi-scale (5X to 40X scale) pathological image segmentation.
arXiv Detail & Related papers (2022-06-27T21:09:55Z)
Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes. Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance. We propose Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations. AF$_2$ has significantly improved the accuracy on three widely used aerial benchmarks, as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation [51.916106055115755]
We propose a new CNN architecture that is pose and scale invariant thanks to the use of Spatial Transformer Network (STN). Our architecture is composed of three sequential modules that are estimated together during training. We test the proposed method in kidney and renal tumor segmentation on abdominal pediatric CT scanners.
arXiv Detail & Related papers (2021-07-06T14:50:03Z)
Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery [39.25541709228373]
We propose the rotation equivariant feature image pyramid network (REFIPN), an image pyramid network based on rotation equivariance convolution. The proposed pyramid network extracts features in a wide range of scales and orientations by using novel convolution filters. The detection performance of the proposed model is validated on two commonly used aerial benchmarks.
arXiv Detail & Related papers (2021-06-02T01:33:49Z)
FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation. We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
Learning Equivariant Representations [10.745691354609738]
Convolutional neural networks (CNNs) are successful examples of this principle. We propose equivariant models for different transformations defined by groups of symmetries. These models leverage symmetries in the data to reduce sample and model complexity and improve generalization performance.
arXiv Detail & Related papers (2020-12-04T18:46:17Z)
Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning [21.89786914625517]
We introduce a novel method for retrieving aerial images by merging group convolution with an attention mechanism and metric learning. Results show that the proposed method's performance exceeds that of other state-of-the-art retrieval methods in both rotated and original environments.
arXiv Detail & Related papers (2020-10-19T04:12:36Z)
A Rotation-Invariant Framework for Deep Point Cloud Analysis [132.91915346157018]
We introduce a new low-level purely rotation-invariant representation to replace common 3D Cartesian coordinates as the network inputs. Also, we present a network architecture to embed these representations into features, encoding local relations between points and their neighbors, and the global shape structure. We evaluate our method on multiple point cloud analysis tasks, including shape classification, part segmentation, and shape retrieval.
arXiv Detail & Related papers (2020-03-16T14:04:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.