Group Shift Pointwise Convolution for Volumetric Medical Image
Segmentation
- URL: http://arxiv.org/abs/2109.12629v1
- Date: Sun, 26 Sep 2021 15:27:33 GMT
- Title: Group Shift Pointwise Convolution for Volumetric Medical Image
Segmentation
- Authors: Junjun He, Jin Ye, Cheng Li, Diping Song, Wanli Chen, Shanshan Wang,
Lixu Gu, and Yu Qiao
- Abstract summary: We introduce a novel Group Shift Pointwise Convolution (GSP-Conv) to improve the effectiveness and efficiency of 3D convolutions.
GSP-Conv simplifies 3D convolutions into pointwise ones with 1x1x1 kernels, which dramatically reduces the number of model parameters and FLOPs.
Results show that our method, with substantially decreased model complexity, achieves comparable or even better performance than models employing 3D convolutions.
- Score: 31.72090839643412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have witnessed the effectiveness of 3D convolutions on
segmenting volumetric medical images. Compared with the 2D counterparts, 3D
convolutions can capture the spatial context in three dimensions. Nevertheless,
models employing 3D convolutions introduce more trainable parameters and are
more computationally complex, which can easily lead to model overfitting,
especially in medical applications with limited training data. This
paper aims to improve the effectiveness and efficiency of 3D convolutions by
introducing a novel Group Shift Pointwise Convolution (GSP-Conv). GSP-Conv
simplifies 3D convolutions into pointwise ones with 1x1x1 kernels, which
dramatically reduces the number of model parameters and FLOPs (e.g. 27x fewer
than 3D convolutions with 3x3x3 kernels). Naïve pointwise convolutions with
limited receptive fields cannot make full use of the spatial image context. To
address this problem, we propose a parameter-free operation, Group Shift (GS),
which shifts the feature maps along different spatial directions in an
elegant way. With GS, pointwise convolutions can access features from different
spatial locations, and the limited receptive fields of pointwise convolutions
can be compensated. We evaluate the proposed methods on two datasets, PROMISE12
and BraTS18. Results show that our method, with substantially decreased model
complexity, achieves comparable or even better performance than models
employing 3D convolutions.
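To make the Group Shift + pointwise convolution idea concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the seven-group channel split, the one-voxel shifts along the six spatial directions via torch.roll (a circular shift, whereas the paper's boundary handling may differ), and the class name GroupShiftPointwiseConv3d are illustrative assumptions.

```python
# Hypothetical sketch of Group Shift (GS) + pointwise convolution, not the authors' code.
import torch
import torch.nn as nn


class GroupShiftPointwiseConv3d(nn.Module):
    """Pointwise (1x1x1) 3D convolution preceded by a parameter-free group shift.

    Channels are split into seven groups: six are shifted by one voxel along
    one of the spatial directions (+/-D, +/-H, +/-W) and one is kept in place,
    so the following 1x1x1 convolution can mix features from neighbouring
    voxels without any 3x3x3 kernels.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # A 1x1x1 kernel needs in_channels * out_channels weights,
        # i.e. 27x fewer than a 3x3x3 kernel with the same channel counts.
        self.pointwise = nn.Conv3d(in_channels, out_channels, kernel_size=1)

    @staticmethod
    def _group_shift(x: torch.Tensor) -> torch.Tensor:
        # Assumes in_channels is divisible by 7 so the groups are even.
        groups = torch.chunk(x, 7, dim=1)
        # (offset, tensor dim): dims 2, 3, 4 are depth, height, width.
        shifts = [(1, 2), (-1, 2), (1, 3), (-1, 3), (1, 4), (-1, 4)]
        # torch.roll wraps around the volume border; the paper's exact
        # padding/boundary handling may differ.
        shifted = [torch.roll(g, s, dims=d) for g, (s, d) in zip(groups, shifts)]
        shifted.append(groups[-1])  # the last group is left unshifted
        return torch.cat(shifted, dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self._group_shift(x))


if __name__ == "__main__":
    layer = GroupShiftPointwiseConv3d(in_channels=28, out_channels=28)
    volume = torch.randn(1, 28, 16, 32, 32)  # (batch, channels, D, H, W)
    print(layer(volume).shape)  # torch.Size([1, 28, 16, 32, 32])
```

Because the shift itself is parameter-free, all trainable weights live in the pointwise convolution, which is where the 27x parameter reduction comes from.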
Related papers
- Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set [49.780302894956776]
It is vital to infer a signed distance function (SDF) in multi-view based surface reconstruction.
We propose a method that seamlessly merges 3DGS with the learning of neural SDFs.
Our numerical and visual comparisons show our superiority over the state-of-the-art results on the widely used benchmarks.
arXiv Detail & Related papers (2024-10-18T05:48:06Z) - Spatiotemporal Modeling Encounters 3D Medical Image Analysis:
Slice-Shift UNet with Multi-View Fusion [0.0]
We propose a new 2D-based model dubbed Slice SHift UNet which encodes three-dimensional features at 2D CNN's complexity.
More precisely, multi-view features are collaboratively learned by performing 2D convolutions along the three planes of a volume.
The effectiveness of our approach is validated on the Multi-Modality Abdominal Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets.
arXiv Detail & Related papers (2023-07-24T14:53:23Z) - Deformably-Scaled Transposed Convolution [17.4596321623511]
We revisit transposed convolution and introduce a novel layer that allows us to place information in the image selectively.
Our novel layer can be used as a drop-in replacement for 2D and 3D upsampling operators and the code will be publicly available.
arXiv Detail & Related papers (2022-10-17T21:35:29Z) - Spatial Pruned Sparse Convolution for Efficient 3D Object Detection [41.62839541489369]
3D scenes are dominated by a large number of background points, which are redundant for a detection task that mainly needs to focus on foreground objects.
In this paper, we analyze major components of existing 3D CNNs and find that 3D CNNs ignore the redundancy of data and further amplify it in the down-sampling process, which brings a huge amount of extra and unnecessary computational overhead.
We propose a new convolution operator named spatial pruned sparse convolution (SPS-Conv), which includes two variants: spatial pruned submanifold sparse convolution (SPSS-Conv) and spatial pruned regular sparse convolution (SPRS-Conv).
arXiv Detail & Related papers (2022-09-28T16:19:06Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z) - Equivariant Point Network for 3D Point Cloud Analysis [17.689949017410836]
We propose an effective and practical SE(3) (3D translation and rotation) equivariant network for point cloud analysis.
First, we present SE(3) separable point convolution, a novel framework that breaks down the 6D convolution into two separable convolutional operators.
Second, we introduce an attention layer to effectively harness the expressiveness of the equivariant features.
arXiv Detail & Related papers (2021-03-25T21:57:10Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR
Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic
Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z) - Spatial Information Guided Convolution for Real-Time RGBD Semantic
Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integration.
S-Conv can infer the sampling offset of the convolution kernel under the guidance of the 3D spatial information.
We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet).
arXiv Detail & Related papers (2020-04-09T13:38:05Z) - Anisotropic Convolutional Networks for 3D Semantic Scene Completion [24.9671648682339]
Semantic scene completion (SSC) tries to simultaneously infer the occupancy and semantic labels for a scene from a single depth and/or RGB image.
We propose a novel module called anisotropic convolution, which offers flexibility and power that competing methods cannot match.
In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wisely.
arXiv Detail & Related papers (2020-04-05T07:57:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.