Group Shift Pointwise Convolution for Volumetric Medical Image
Segmentation
- URL: http://arxiv.org/abs/2109.12629v1
- Date: Sun, 26 Sep 2021 15:27:33 GMT
- Title: Group Shift Pointwise Convolution for Volumetric Medical Image
Segmentation
- Authors: Junjun He, Jin Ye, Cheng Li, Diping Song, Wanli Chen, Shanshan Wang,
Lixu Gu, and Yu Qiao
- Abstract summary: We introduce a novel Group Shift Pointwise Convolution (GSP-Conv) to improve the effectiveness and efficiency of 3D convolutions.
GSP-Conv simplifies 3D convolutions into pointwise ones with 1x1x1 kernels, which dramatically reduces the number of model parameters and FLOPs.
Results show that our method, with substantially decreased model complexity, achieves comparable or even better performance than models employing 3D convolutions.
- Score: 31.72090839643412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have witnessed the effectiveness of 3D convolutions on
segmenting volumetric medical images. Compared with the 2D counterparts, 3D
convolutions can capture the spatial context in three dimensions. Nevertheless,
models employing 3D convolutions introduce more trainable parameters and are
more computationally complex, which can easily lead to model overfitting,
especially in medical applications with limited training data. This
paper aims to improve the effectiveness and efficiency of 3D convolutions by
introducing a novel Group Shift Pointwise Convolution (GSP-Conv). GSP-Conv
simplifies 3D convolutions into pointwise ones with 1x1x1 kernels, which
dramatically reduces the number of model parameters and FLOPs (e.g. 27x fewer
than 3D convolutions with 3x3x3 kernels). Naïve pointwise convolutions with
limited receptive fields cannot make full use of the spatial image context. To
address this problem, we propose a parameter-free operation, Group Shift (GS),
which shifts the feature maps along different spatial directions in an
elegant way. With GS, pointwise convolutions can access features from different
spatial locations, and the limited receptive fields of pointwise convolutions
can be compensated. We evaluate the proposed methods on two datasets, PROMISE12
and BraTS18. Results show that our method, with substantially decreased model
complexity, achieves comparable or even better performance than models
employing 3D convolutions.
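To make the Group Shift + pointwise convolution idea concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the seven-group channel split, the one-voxel shifts along the six spatial directions via torch.roll (a circular shift, whereas the paper's boundary handling may differ), and the class name GroupShiftPointwiseConv3d are illustrative assumptions.

```python
# Hypothetical sketch of Group Shift (GS) + pointwise convolution, not the authors' code.
import torch
import torch.nn as nn


class GroupShiftPointwiseConv3d(nn.Module):
    """Pointwise (1x1x1) 3D convolution preceded by a parameter-free group shift.

    Channels are split into seven groups: six are shifted by one voxel along
    one of the spatial directions (+/-D, +/-H, +/-W) and one is kept in place,
    so the following 1x1x1 convolution can mix features from neighbouring
    voxels without any 3x3x3 kernels.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # A 1x1x1 kernel needs in_channels * out_channels weights,
        # i.e. 27x fewer than a 3x3x3 kernel with the same channel counts.
        self.pointwise = nn.Conv3d(in_channels, out_channels, kernel_size=1)

    @staticmethod
    def _group_shift(x: torch.Tensor) -> torch.Tensor:
        # Assumes in_channels is divisible by 7 so the groups are even.
        groups = torch.chunk(x, 7, dim=1)
        # (offset, tensor dim): dims 2, 3, 4 are depth, height, width.
        shifts = [(1, 2), (-1, 2), (1, 3), (-1, 3), (1, 4), (-1, 4)]
        # torch.roll wraps around the volume border; the paper's exact
        # padding/boundary handling may differ.
        shifted = [torch.roll(g, s, dims=d) for g, (s, d) in zip(groups, shifts)]
        shifted.append(groups[-1])  # the last group is left unshifted
        return torch.cat(shifted, dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self._group_shift(x))


if __name__ == "__main__":
    layer = GroupShiftPointwiseConv3d(in_channels=28, out_channels=28)
    volume = torch.randn(1, 28, 16, 32, 32)  # (batch, channels, D, H, W)
    print(layer(volume).shape)  # torch.Size([1, 28, 16, 32, 32])
```

Because the shift itself is parameter-free, all trainable weights live in the pointwise convolution, which is where the 27x parameter reduction comes from.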
Related papers
- Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set [49.780302894956776]
It is vital to infer a signed distance function (SDF) in multi-view based surface reconstruction.
We propose a method that seamlessly merges 3DGS with the learning of neural SDFs.
Our numerical and visual comparisons show our superiority over the state-of-the-art results on the widely used benchmarks.
arXiv Detail & Related papers (2024-10-18T05:48:06Z) - Spatiotemporal Modeling Encounters 3D Medical Image Analysis:
Slice-Shift UNet with Multi-View Fusion [0.0]
We propose a new 2D-based model dubbed Slice SHift UNet which encodes three-dimensional features at 2D CNN's complexity.
More precisely, multi-view features are collaboratively learned by performing 2D convolutions along the three planes of a volume.
The effectiveness of our approach is validated on the Multi-Modality Abdominal Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets.
arXiv Detail & Related papers (2023-07-24T14:53:23Z) - Deformably-Scaled Transposed Convolution [17.4596321623511]
We revisit transposed convolution and introduce a novel layer that allows us to place information in the image selectively.
Our novel layer can be used as a drop-in replacement for 2D and 3D upsampling operators and the code will be publicly available.
arXiv Detail & Related papers (2022-10-17T21:35:29Z) - Spatial Pruned Sparse Convolution for Efficient 3D Object Detection [41.62839541489369]
3D scenes are dominated by a large number of background points, which are redundant for a detection task that mainly needs to focus on foreground objects.
In this paper, we analyze major components of existing 3D CNNs and find that 3D CNNs ignore the redundancy of data and further amplify it in the down-sampling process, which brings a huge amount of extra and unnecessary computational overhead.
We propose a new convolution operator named spatial pruned sparse convolution (SPS-Conv), which includes two variants: spatial pruned submanifold sparse convolution (SPSS-Conv) and spatial pruned regular sparse convolution (SPRS-Conv).
arXiv Detail & Related papers (2022-09-28T16:19:06Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z) - Equivariant Point Network for 3D Point Cloud Analysis [17.689949017410836]
We propose an effective and practical SE(3) (3D translation and rotation) equivariant network for point cloud analysis.
First, we present SE(3) separable point convolution, a novel framework that breaks down the 6D convolution into two separable convolutional operators.
Second, we introduce an attention layer to effectively harness the expressiveness of the equivariant features.
arXiv Detail & Related papers (2021-03-25T21:57:10Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR
Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic
Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z) - Spatial Information Guided Convolution for Real-Time RGBD Semantic
Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integration.
S-Conv can infer the sampling offset of the convolution kernel under the guidance of the 3D spatial information.
We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet).
arXiv Detail & Related papers (2020-04-09T13:38:05Z) - Anisotropic Convolutional Networks for 3D Semantic Scene Completion [24.9671648682339]
Semantic scene completion (SSC) tries to simultaneously infer the occupancy and semantic labels for a scene from a single depth and/or RGB image.
We propose a novel module called anisotropic convolution, which offers flexibility and power that competing methods cannot match.
In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wisely.
arXiv Detail & Related papers (2020-04-05T07:57:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.