Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for
Medical Image Segmentation
- URL: http://arxiv.org/abs/2303.05785v2
- Date: Tue, 6 Jun 2023 03:05:07 GMT
- Title: Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for
Medical Image Segmentation
- Authors: Ho Hin Lee, Quan Liu, Shunxing Bao, Qi Yang, Xin Yu, Leon Y. Cai,
Thomas Li, Yuankai Huo, Xenofon Koutsoukos, Bennett A. Landman
- Abstract summary: RepUX-Net is a pure CNN architecture with a simple large kernel block design.
Inspired by spatial frequency in the human visual system, we extend the variation of kernel convergence to an element-wise setting.
- Score: 25.62587471067468
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by vision transformers, depth-wise convolution has been
revisited to provide a large Effective Receptive Field (ERF) using Large
Kernel (LK) sizes for medical image segmentation. However, segmentation
performance may saturate and even degrade as kernel sizes are scaled up
(e.g., $21\times 21\times 21$) in a Convolutional Neural Network (CNN). We
hypothesize that convolution with LK sizes is limited in maintaining optimal
convergence for locality learning. While Structural
Re-parameterization (SR) enhances the local convergence with small kernels in
parallel, optimal small-kernel branches may hinder computational efficiency
for training. In this work, we propose RepUX-Net, a pure CNN architecture with
a simple large kernel block design, which competes favorably with current
network state-of-the-art (SOTA) (e.g., 3D UX-Net, SwinUNETR) using 6
challenging public datasets. We derive an equivalency between kernel
re-parameterization and the branch-wise variation in kernel convergence.
Inspired by spatial frequency in the human visual system, we extend the
variation of kernel convergence to an element-wise setting and model the spatial
frequency as a Bayesian prior to re-parameterize convolutional weights during
training. Specifically, a reciprocal function is leveraged to estimate a
frequency-weighted value, which rescales the corresponding kernel element for
stochastic gradient descent. From the experimental results, RepUX-Net
consistently outperforms 3D SOTA benchmarks with internal validation (FLARE:
0.929 to 0.944), external validation (MSD: 0.901 to 0.932, KiTS: 0.815 to
0.847, LiTS: 0.933 to 0.949, TCIA: 0.736 to 0.779) and transfer learning (AMOS:
0.880 to 0.911) scenarios in Dice Score.
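The element-wise re-parameterization described above can be sketched as follows. The reciprocal prior below, 1/(1+r) over the Euclidean distance r from the kernel center, is an illustrative assumption standing in for the paper's exact Bayesian frequency prior:

```python
import numpy as np

def frequency_prior(kernel_size):
    """Element-wise reciprocal prior over a cubic 3D kernel.

    Elements far from the kernel center (high spatial frequency in the
    visual-system analogy) receive smaller weights via a reciprocal
    function. The form 1 / (1 + r) is illustrative, not the paper's
    precise parameterization.
    """
    k = kernel_size
    center = (k - 1) / 2.0
    zz, yy, xx = np.meshgrid(*([np.arange(k)] * 3), indexing="ij")
    r = np.sqrt((zz - center) ** 2 + (yy - center) ** 2 + (xx - center) ** 2)
    return 1.0 / (1.0 + r)

def reparameterize(weights, prior):
    """Rescale each kernel element by its frequency-weighted value.

    Applied during training, this rescaling changes the effective
    gradient each element sees under stochastic gradient descent, so
    peripheral (high-frequency) elements converge more slowly.
    """
    return weights * prior

prior = frequency_prior(7)
w = np.random.randn(7, 7, 7)        # one depth-wise kernel, weights illustrative
w_rep = reparameterize(w, prior)
```

Central elements keep their full magnitude (prior value 1 at the center) while corner elements are damped, which matches the intuition that large kernels should still converge fastest in their local neighborhood.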
Related papers
- KernelWarehouse: Rethinking the Design of Dynamic Convolution [16.101179962553385]
KernelWarehouse redefines the basic concepts of "kernels", "assembling kernels" and "attention function".
We testify the effectiveness of KernelWarehouse on ImageNet and MS-COCO datasets using various ConvNet architectures.
arXiv Detail & Related papers (2024-06-12T05:16:26Z)
- DeformUX-Net: Exploring a 3D Foundation Backbone for Medical Image Segmentation with Depthwise Deformable Convolution [26.746489317083352]
We introduce 3D DeformUX-Net, a pioneering volumetric CNN model.
We revisit volumetric deformable convolution in a depth-wise setting to adapt long-range dependency with computational efficiency.
Our empirical evaluations reveal that the 3D DeformUX-Net consistently outperforms existing state-of-the-art ViTs and large kernel convolution models.
arXiv Detail & Related papers (2023-09-30T00:33:41Z)
- KernelWarehouse: Towards Parameter-Efficient Dynamic Convolution [19.021411176761738]
Dynamic convolution learns a linear mixture of $n$ static kernels weighted with their sample-dependent attentions.
Existing designs are parameter-inefficient: they increase the number of convolutional parameters by $n$ times.
We propose KernelWarehouse, which can strike a favorable trade-off between parameter efficiency and representation power.
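The linear-mixture formulation of dynamic convolution can be sketched as below. The pooled feature vector and the single-layer attention head (`attn_weights`, `attn_bias`) are illustrative stand-ins for the paper's attention function:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_kernel(static_kernels, features, attn_weights, attn_bias):
    """Mix n static kernels with sample-dependent attention.

    static_kernels: (n, k, k) bank of learned kernels.
    features:       (d,) pooled feature vector of one input sample.
    attn_weights, attn_bias: a hypothetical linear attention head
    mapping features to n mixing coefficients.
    """
    logits = attn_weights @ features + attn_bias        # (n,)
    alpha = softmax(logits)                             # sample-dependent weights
    return np.tensordot(alpha, static_kernels, axes=1)  # (k, k) mixed kernel

n, k, d = 4, 3, 8
bank = np.random.randn(n, k, k)
x = np.random.randn(d)
W, b = np.random.randn(n, d), np.zeros(n)
kernel = dynamic_kernel(bank, x, W, b)
```

The parameter cost scales with the n-kernel bank, which is exactly the inefficiency KernelWarehouse targets by sharing kernel "cells" across layers.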
arXiv Detail & Related papers (2023-08-16T13:35:09Z)
- GMConv: Modulating Effective Receptive Fields for Convolutional Kernels [52.50351140755224]
In convolutional neural networks, convolutions are performed using a square kernel with a fixed $N \times N$ receptive field (RF).
Inspired by the property that ERFs typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
Our GMConv can directly replace the standard convolutions in existing CNNs and can be easily trained end-to-end by standard back-propagation.
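A minimal sketch of a Gaussian-masked kernel, assuming a fixed `sigma` (in GMConv the Gaussian parameters would be learned end-to-end by back-propagation):

```python
import numpy as np

def gaussian_mask(kernel_size, sigma):
    """2D Gaussian mask centered on the kernel.

    Multiplying a square kernel by this mask concentrates its effective
    receptive field toward the center, mimicking the Gaussian-shaped
    ERFs observed in trained CNNs.
    """
    c = (kernel_size - 1) / 2.0
    yy, xx = np.meshgrid(np.arange(kernel_size), np.arange(kernel_size),
                         indexing="ij")
    return np.exp(-((yy - c) ** 2 + (xx - c) ** 2) / (2.0 * sigma ** 2))

mask = gaussian_mask(5, sigma=1.0)
kernel = np.random.randn(5, 5) * mask  # masked convolution kernel
```

Because the mask is a simple element-wise multiplier, it can replace a standard kernel without changing the convolution operation itself, which is why GMConv is a drop-in substitution.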
arXiv Detail & Related papers (2023-02-09T10:17:17Z)
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features.
Our evaluations on five benchmarks (Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung) reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
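The random-feature idea can be sketched roughly as follows, using single-layer random ReLU features as a simplified stand-in for RFAD's actual NNGP feature construction:

```python
import numpy as np

def rfa_kernel(X, num_features, rng):
    """Random-feature approximation of an (infinite-width) NNGP kernel.

    Random ReLU features phi(x) = relu(W x) / sqrt(m) give a Gram
    matrix K ~= phi(X) phi(X)^T, replacing the exact kernel whose
    computation makes KIP expensive. The one-layer feature map here is
    a simplification of the paper's construction.
    """
    d = X.shape[1]
    W = rng.standard_normal((num_features, d))        # random projection
    phi = np.maximum(W @ X.T, 0.0) / np.sqrt(num_features)  # (m, n)
    return phi.T @ phi                                 # (n, n) approx. kernel

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
K = rfa_kernel(X, 2048, rng)
```

The approximation costs O(mnd) instead of the quadratic-in-width exact kernel, which is the source of the reported speedup over KIP.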
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
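The channel-wise ("transposed") attention at the core of the SDTA encoder can be sketched as follows; the query/key/value projections, multi-head structure, and the channel-group split are omitted for brevity:

```python
import numpy as np

def transpose_attention(x):
    """Self-attention applied across the channel dimension.

    x: (C, N) tensor, where N = H * W flattened spatial positions.
    Computing the C x C attention map instead of the N x N one keeps
    the cost linear in spatial size, the key efficiency idea behind
    transposed attention in mobile architectures.
    """
    C, N = x.shape
    scores = x @ x.T / np.sqrt(N)                    # (C, C) channel affinities
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)         # row-wise softmax
    return attn @ x                                  # (C, N) re-weighted channels

x = np.random.randn(8, 16)  # 8 channels, 4x4 spatial grid flattened
y = transpose_attention(x)
```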
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Efficient Context-Aware Network for Abdominal Multi-organ Segmentation [8.92337236455273]
We develop a whole-volume-based coarse-to-fine framework for efficient and effective abdominal multi-organ segmentation.
For the decoder module, an anisotropic convolution with a k*k*1 intra-slice convolution and a 1*1*k inter-slice convolution is designed to reduce the computational burden.
For the context block, we propose a strip pooling module to capture anisotropic and long-range contextual information.
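The saving from the anisotropic factorization can be seen with a quick parameter count; the kernel size and channel widths below are illustrative:

```python
def conv3d_params(kd, kh, kw, c_in, c_out):
    """Parameter count of a dense 3D convolution (no bias term)."""
    return kd * kh * kw * c_in * c_out

# A full k*k*k convolution vs. the factorized k*k*1 intra-slice plus
# 1*1*k inter-slice pair, at equal channel width.
k, c = 3, 64
full = conv3d_params(k, k, k, c, c)                                  # k^3 terms
factorized = conv3d_params(1, k, k, c, c) + conv3d_params(k, 1, 1, c, c)
ratio = factorized / full   # fraction of parameters kept (k^2 + k vs k^3)
```

For k = 3 the factorized pair needs 12/27 of the full kernel's parameters, and the gap widens rapidly as k grows.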
arXiv Detail & Related papers (2021-09-22T09:05:59Z)
- nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z)
- Integrating Circle Kernels into Convolutional Neural Networks [30.950819638148104]
The square kernel is a standard unit for contemporary Convolutional Neural Networks (CNNs).
We propose using circle kernels with isotropic receptive fields for the convolution.
Our training takes an approximately equivalent amount of computation compared with the corresponding CNN with square kernels.
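A minimal sketch of a circle kernel as a hard binary mask over a square kernel; the paper's formulation may handle the boundary more smoothly than this all-or-nothing cut:

```python
import numpy as np

def circle_mask(kernel_size):
    """Binary circular mask giving an isotropic receptive field.

    Elements outside the inscribed circle are zeroed, so the kernel's
    support is approximately rotation-invariant, unlike a square
    kernel whose corners extend farther along the diagonals.
    """
    c = (kernel_size - 1) / 2.0
    yy, xx = np.meshgrid(np.arange(kernel_size), np.arange(kernel_size),
                         indexing="ij")
    dist = np.sqrt((yy - c) ** 2 + (xx - c) ** 2)
    return (dist <= c).astype(float)

kernel = np.random.randn(5, 5) * circle_mask(5)  # corner weights zeroed
```

Since the zeroed corner elements contribute no multiply-adds worth keeping, the compute cost stays close to that of the square-kernel baseline, consistent with the claim above.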
arXiv Detail & Related papers (2021-07-06T07:59:36Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.