Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for
Medical Image Segmentation
- URL: http://arxiv.org/abs/2303.05785v2
- Date: Tue, 6 Jun 2023 03:05:07 GMT
- Title: Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for
Medical Image Segmentation
- Authors: Ho Hin Lee, Quan Liu, Shunxing Bao, Qi Yang, Xin Yu, Leon Y. Cai,
Thomas Li, Yuankai Huo, Xenofon Koutsoukos, Bennett A. Landman
- Abstract summary: RepUX-Net is a pure CNN architecture with a simple large kernel block design.
Inspired by spatial frequency in the human visual system, we extend the variation of kernel convergence to an element-wise setting.
- Score: 25.62587471067468
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by vision transformers, depth-wise convolution has been
revisited to provide a large Effective Receptive Field (ERF) using Large
Kernel (LK) sizes for medical image segmentation. However, segmentation
performance may saturate and even degrade as kernel sizes are scaled up
(e.g., $21\times 21\times 21$) in a Convolutional Neural Network (CNN). We
hypothesize that convolution with LK sizes is limited in maintaining optimal
convergence for locality learning. While Structural
Re-parameterization (SR) enhances the local convergence with small kernels in
parallel, optimal small-kernel branches may hinder computational efficiency
for training. In this work, we propose RepUX-Net, a pure CNN architecture with
a simple large kernel block design, which competes favorably with current
network state-of-the-art (SOTA) (e.g., 3D UX-Net, SwinUNETR) using 6
challenging public datasets. We derive an equivalency between kernel
re-parameterization and the branch-wise variation in kernel convergence.
Inspired by spatial frequency in the human visual system, we extend the
variation of kernel convergence to an element-wise setting and model the spatial
frequency as a Bayesian prior to re-parameterize convolutional weights during
training. Specifically, a reciprocal function is leveraged to estimate a
frequency-weighted value, which rescales the corresponding kernel element for
stochastic gradient descent. From the experimental results, RepUX-Net
consistently outperforms 3D SOTA benchmarks with internal validation (FLARE:
0.929 to 0.944), external validation (MSD: 0.901 to 0.932, KiTS: 0.815 to
0.847, LiTS: 0.933 to 0.949, TCIA: 0.736 to 0.779) and transfer learning (AMOS:
0.880 to 0.911) scenarios in Dice Score.
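The element-wise re-parameterization described above can be sketched as follows. The reciprocal prior below, 1/(1+r) over the Euclidean distance r from the kernel center, is an illustrative assumption standing in for the paper's exact Bayesian frequency prior:

```python
import numpy as np

def frequency_prior(kernel_size):
    """Element-wise reciprocal prior over a cubic 3D kernel.

    Elements far from the kernel center (high spatial frequency in the
    visual-system analogy) receive smaller weights via a reciprocal
    function. The form 1 / (1 + r) is illustrative, not the paper's
    precise parameterization.
    """
    k = kernel_size
    center = (k - 1) / 2.0
    zz, yy, xx = np.meshgrid(*([np.arange(k)] * 3), indexing="ij")
    r = np.sqrt((zz - center) ** 2 + (yy - center) ** 2 + (xx - center) ** 2)
    return 1.0 / (1.0 + r)

def reparameterize(weights, prior):
    """Rescale each kernel element by its frequency-weighted value.

    Applied during training, this rescaling changes the effective
    gradient each element sees under stochastic gradient descent, so
    peripheral (high-frequency) elements converge more slowly.
    """
    return weights * prior

prior = frequency_prior(7)
w = np.random.randn(7, 7, 7)        # one depth-wise kernel, weights illustrative
w_rep = reparameterize(w, prior)
```

Central elements keep their full magnitude (prior value 1 at the center) while corner elements are damped, which matches the intuition that large kernels should still converge fastest in their local neighborhood.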
Related papers
- KernelWarehouse: Rethinking the Design of Dynamic Convolution [16.101179962553385]
KernelWarehouse redefines the basic concepts of "kernels", "assembling kernels" and "attention function".
We testify the effectiveness of KernelWarehouse on ImageNet and MS-COCO datasets using various ConvNet architectures.
arXiv Detail & Related papers (2024-06-12T05:16:26Z)
- DeformUX-Net: Exploring a 3D Foundation Backbone for Medical Image Segmentation with Depthwise Deformable Convolution [26.746489317083352]
We introduce 3D DeformUX-Net, a pioneering volumetric CNN model.
We revisit volumetric deformable convolution in a depth-wise setting to adapt long-range dependency with computational efficiency.
Our empirical evaluations reveal that the 3D DeformUX-Net consistently outperforms existing state-of-the-art ViTs and large kernel convolution models.
arXiv Detail & Related papers (2023-09-30T00:33:41Z)
- KernelWarehouse: Towards Parameter-Efficient Dynamic Convolution [19.021411176761738]
Dynamic convolution learns a linear mixture of $n$ static kernels weighted with their sample-dependent attentions.
Existing designs are parameter-inefficient: they increase the number of convolutional parameters by $n$ times.
We propose KernelWarehouse, which can strike a favorable trade-off between parameter efficiency and representation power.
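The linear-mixture formulation of dynamic convolution can be sketched as below. The pooled feature vector and the single-layer attention head (`attn_weights`, `attn_bias`) are illustrative stand-ins for the paper's attention function:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_kernel(static_kernels, features, attn_weights, attn_bias):
    """Mix n static kernels with sample-dependent attention.

    static_kernels: (n, k, k) bank of learned kernels.
    features:       (d,) pooled feature vector of one input sample.
    attn_weights, attn_bias: a hypothetical linear attention head
    mapping features to n mixing coefficients.
    """
    logits = attn_weights @ features + attn_bias        # (n,)
    alpha = softmax(logits)                             # sample-dependent weights
    return np.tensordot(alpha, static_kernels, axes=1)  # (k, k) mixed kernel

n, k, d = 4, 3, 8
bank = np.random.randn(n, k, k)
x = np.random.randn(d)
W, b = np.random.randn(n, d), np.zeros(n)
kernel = dynamic_kernel(bank, x, W, b)
```

The parameter cost scales with the n-kernel bank, which is exactly the inefficiency KernelWarehouse targets by sharing kernel "cells" across layers.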
arXiv Detail & Related papers (2023-08-16T13:35:09Z)
- GMConv: Modulating Effective Receptive Fields for Convolutional Kernels [52.50351140755224]
In convolutional neural networks, convolutions are performed using a square kernel with a fixed $N \times N$ receptive field (RF).
Inspired by the property that ERFs typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
Our GMConv can directly replace the standard convolutions in existing CNNs and can be easily trained end-to-end by standard back-propagation.
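A minimal sketch of a Gaussian-masked kernel, assuming a fixed `sigma` (in GMConv the Gaussian parameters would be learned end-to-end by back-propagation):

```python
import numpy as np

def gaussian_mask(kernel_size, sigma):
    """2D Gaussian mask centered on the kernel.

    Multiplying a square kernel by this mask concentrates its effective
    receptive field toward the center, mimicking the Gaussian-shaped
    ERFs observed in trained CNNs.
    """
    c = (kernel_size - 1) / 2.0
    yy, xx = np.meshgrid(np.arange(kernel_size), np.arange(kernel_size),
                         indexing="ij")
    return np.exp(-((yy - c) ** 2 + (xx - c) ** 2) / (2.0 * sigma ** 2))

mask = gaussian_mask(5, sigma=1.0)
kernel = np.random.randn(5, 5) * mask  # masked convolution kernel
```

Because the mask is a simple element-wise multiplier, it can replace a standard kernel without changing the convolution operation itself, which is why GMConv is a drop-in substitution.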
arXiv Detail & Related papers (2023-02-09T10:17:17Z)
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features.
Our evaluations on five benchmarks (Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung) reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
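The random-feature idea can be sketched roughly as follows, using single-layer random ReLU features as a simplified stand-in for RFAD's actual NNGP feature construction:

```python
import numpy as np

def rfa_kernel(X, num_features, rng):
    """Random-feature approximation of an (infinite-width) NNGP kernel.

    Random ReLU features phi(x) = relu(W x) / sqrt(m) give a Gram
    matrix K ~= phi(X) phi(X)^T, replacing the exact kernel whose
    computation makes KIP expensive. The one-layer feature map here is
    a simplification of the paper's construction.
    """
    d = X.shape[1]
    W = rng.standard_normal((num_features, d))        # random projection
    phi = np.maximum(W @ X.T, 0.0) / np.sqrt(num_features)  # (m, n)
    return phi.T @ phi                                 # (n, n) approx. kernel

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
K = rfa_kernel(X, 2048, rng)
```

The approximation costs O(mnd) instead of the quadratic-in-width exact kernel, which is the source of the reported speedup over KIP.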
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
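The channel-wise ("transposed") attention at the core of the SDTA encoder can be sketched as follows; the query/key/value projections, multi-head structure, and the channel-group split are omitted for brevity:

```python
import numpy as np

def transpose_attention(x):
    """Self-attention applied across the channel dimension.

    x: (C, N) tensor, where N = H * W flattened spatial positions.
    Computing the C x C attention map instead of the N x N one keeps
    the cost linear in spatial size, the key efficiency idea behind
    transposed attention in mobile architectures.
    """
    C, N = x.shape
    scores = x @ x.T / np.sqrt(N)                    # (C, C) channel affinities
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)         # row-wise softmax
    return attn @ x                                  # (C, N) re-weighted channels

x = np.random.randn(8, 16)  # 8 channels, 4x4 spatial grid flattened
y = transpose_attention(x)
```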
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Efficient Context-Aware Network for Abdominal Multi-organ Segmentation [8.92337236455273]
We develop a whole-volume-based coarse-to-fine framework for efficient and effective abdominal multi-organ segmentation.
For the decoder module, an anisotropic convolution with a k*k*1 intra-slice convolution and a 1*1*k inter-slice convolution is designed to reduce the computational burden.
For the context block, we propose a strip pooling module to capture anisotropic and long-range contextual information.
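The saving from the anisotropic factorization can be seen with a quick parameter count; the kernel size and channel widths below are illustrative:

```python
def conv3d_params(kd, kh, kw, c_in, c_out):
    """Parameter count of a dense 3D convolution (no bias term)."""
    return kd * kh * kw * c_in * c_out

# A full k*k*k convolution vs. the factorized k*k*1 intra-slice plus
# 1*1*k inter-slice pair, at equal channel width.
k, c = 3, 64
full = conv3d_params(k, k, k, c, c)                                  # k^3 terms
factorized = conv3d_params(1, k, k, c, c) + conv3d_params(k, 1, 1, c, c)
ratio = factorized / full   # fraction of parameters kept (k^2 + k vs k^3)
```

For k = 3 the factorized pair needs 12/27 of the full kernel's parameters, and the gap widens rapidly as k grows.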
arXiv Detail & Related papers (2021-09-22T09:05:59Z)
- nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z)
- Integrating Circle Kernels into Convolutional Neural Networks [30.950819638148104]
The square kernel is a standard unit for contemporary Convolutional Neural Networks (CNNs).
We propose using circle kernels with isotropic receptive fields for the convolution.
Our training takes an approximately equivalent amount of computation compared with the corresponding CNN with square kernels.
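A minimal sketch of a circle kernel as a hard binary mask over a square kernel; the paper's formulation may handle the boundary more smoothly than this all-or-nothing cut:

```python
import numpy as np

def circle_mask(kernel_size):
    """Binary circular mask giving an isotropic receptive field.

    Elements outside the inscribed circle are zeroed, so the kernel's
    support is approximately rotation-invariant, unlike a square
    kernel whose corners extend farther along the diagonals.
    """
    c = (kernel_size - 1) / 2.0
    yy, xx = np.meshgrid(np.arange(kernel_size), np.arange(kernel_size),
                         indexing="ij")
    dist = np.sqrt((yy - c) ** 2 + (xx - c) ** 2)
    return (dist <= c).astype(float)

kernel = np.random.randn(5, 5) * circle_mask(5)  # corner weights zeroed
```

Since the zeroed corner elements contribute no multiply-adds worth keeping, the compute cost stays close to that of the square-kernel baseline, consistent with the claim above.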
arXiv Detail & Related papers (2021-07-06T07:59:36Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.