Related papers: No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling

No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling

URL: http://arxiv.org/abs/2501.10814v2
Date: Thu, 06 Mar 2025 11:05:23 GMT
Title: No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling
Authors: Young Seok Jeon, Hongfei Yang, Huazhu Fu, Mengling Feng,
Abstract summary: No-More-Sliding-Window (NMSW) is a novel end-to-end trainable framework for 3D segmentation.<n> NMSW employs a differentiable Top-k module to selectively sample only the most relevant patches.<n>It delivers a 9.1x faster inference on the H100 GPU and a 11.1x faster inference on the Xeon Gold CPU.
Score: 34.54360931760496
License: http://creativecommons.org/licenses/by/4.0/
Abstract: 3D models surpass 2D models in CT/MRI segmentation by effectively capturing inter-slice relationships. However, the added depth dimension substantially increases memory consumption. While patch-based training alleviates memory constraints, it significantly slows down the inference speed due to the sliding window (SW) approach. We propose No-More-Sliding-Window (NMSW), a novel end-to-end trainable framework that enhances the efficiency of generic 3D segmentation backbone during an inference step by eliminating the need for SW. NMSW employs a differentiable Top-k module to selectively sample only the most relevant patches, thereby minimizing redundant computations. When patch-level predictions are insufficient, the framework intelligently leverages coarse global predictions to refine results. Evaluated across 3 tasks using 3 segmentation backbones, NMSW achieves competitive accuracy compared to SW inference while significantly reducing computational complexity by 91% (88.0 to 8.00 TMACs). Moreover, it delivers a 9.1x faster inference on the H100 GPU (99.0 to 8.3 sec) and a 11.1x faster inference on the Xeon Gold CPU (2110 to 189 sec). NMSW is model-agnostic, further boosting efficiency when integrated with any existing efficient segmentation backbones.

Related papers

Speedy MASt3R [68.47052557089631]
MASt3R redefines image matching as a 3D task by leveraging DUSt3R and introducing a fast reciprocal matching scheme. Fast MASt3R achieves a 54% reduction in inference time (198 ms to 91 ms per image pair) without sacrificing accuracy. This advancement enables real-time 3D understanding, benefiting applications like mixed reality navigation and large-scale 3D scene reconstruction.
arXiv Detail & Related papers (2025-03-13T03:56:22Z)
Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines [13.398758600007188]
"Fake quantization", which simulates lower operations during inference, does not actually reduce model size or improve real-world speed. "Post-training quantization" (PTQ) framework successfully implements true 8-bit quantization on state-of-the-art (SOTA) 3D medical segmentation models.
arXiv Detail & Related papers (2025-01-28T23:29:40Z)
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss [59.835032408496545]
We propose a tile-based strategy that partitions the contrastive loss calculation into arbitrary small blocks. We also introduce a multi-level tiling strategy to leverage the hierarchical structure of distributed systems. Compared to SOTA memory-efficient solutions, it achieves a two-order-of-magnitude reduction in memory while maintaining comparable speed.
arXiv Detail & Related papers (2024-10-22T17:59:30Z)
Augmented Efficiency: Reducing Memory Footprint and Accelerating Inference for 3D Semantic Segmentation through Hybrid Vision [9.96433151449016]
This paper introduces a novel approach to 3D semantic segmentation, distinguished by incorporating a hybrid blend of 2D and 3D computer vision techniques. We conduct 2D semantic segmentation on RGB images linked to 3D point clouds and extend the results to 3D using an extrusion technique for specific class labels. This model serves as the current state-of-the-art 3D semantic segmentation model on the KITTI-360 dataset.
arXiv Detail & Related papers (2024-07-23T00:04:10Z)
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels [62.31333169413391]
Large Sparse Kernel 3D Neural Network (LSK3DNet) Our method comprises two core components: Spatial-wise Dynamic Sparsity (SDS) and Channel-wise Weight Selection (CWS)
arXiv Detail & Related papers (2024-03-22T12:54:33Z)
E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation [34.865695471451886]
We propose a 3D medical image segmentation model, named Efficient to Efficient Network (E2ENet) It incorporates two parametrically and computationally efficient designs. It consistently achieves a superior trade-off between accuracy and efficiency across various resource constraints.
arXiv Detail & Related papers (2023-12-07T22:13:37Z)
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity [12.663030430488922]
We propose Flash-LLM for enabling low-cost and highly-efficient large generative model inference on high-performance Cores. At SpMM kernel level, Flash-LLM significantly outperforms the state-of-the-art library, i.e., Sputnik and SparTA by an average of 2.9x and 1.5x, respectively.
arXiv Detail & Related papers (2023-09-19T03:20:02Z)
SqueezeLLM: Dense-and-Sparse Quantization [80.32162537942138]
Main bottleneck for generative inference with LLMs is memory bandwidth, rather than compute, for single batch inference. We introduce SqueezeLLM, a post-training quantization framework that enables lossless compression to ultra-low precisions of up to 3-bit. Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format.
arXiv Detail & Related papers (2023-06-13T08:57:54Z)
Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for Medical Image Segmentation [25.62587471067468]
RepUX-Net is a pure CNN architecture with a simple large kernel block design. Inspired by the spatial frequency in the human visual system, we extend to vary the kernel convergence into element-wise setting.
arXiv Detail & Related papers (2023-03-10T08:38:34Z)
UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed. The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features. Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BRaTs, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction. These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization. We propose Greedy LEarning for Accelerated MRI reconstruction, an efficient training strategy for high-dimensional imaging settings.
arXiv Detail & Related papers (2022-07-18T06:01:29Z)
Efficient Context-Aware Network for Abdominal Multi-organ Segmentation [8.92337236455273]
We develop a whole-based coarse-to-fine framework for efficient and effective abdominal multi-organ segmentation. For the decoder module, anisotropic convolution with a k*k*1 intra-slice convolution and a 1*1*k inter-slice convolution is designed to reduce the burden. For the context block, we propose strip pooling module to capture anisotropic and long-range contextual information.
arXiv Detail & Related papers (2021-09-22T09:05:59Z)
Multi-Modality Task Cascade for 3D Object Detection [22.131228757850373]
Many methods train two models in isolation and use simple feature concatenation to represent 3D sensor data. We propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions. We show that including a 2D network between two stages of 3D modules significantly improves both 2D and 3D task performance.
arXiv Detail & Related papers (2021-07-08T17:55:01Z)
EDNet: Efficient Disparity Estimation with Cost Volume Combination and Attention-based Spatial Residual [17.638034176859932]
Existing disparity estimation works mostly leverage the 4D concatenation volume and construct a very deep 3D convolution neural network (CNN) for disparity regression. In this paper, we propose a network named EDNet for efficient disparity estimation. Experiments on the Scene Flow and KITTI datasets show that EDNet outperforms the previous 3D CNN based works.
arXiv Detail & Related papers (2020-10-26T04:49:44Z)
FADNet: A Fast and Accurate Network for Disparity Estimation [18.05392578461659]
We propose an efficient and accurate deep network for disparity estimation named FADNet. It exploits efficient 2D based correlation layers with stacked blocks to preserve fast computation. It contains multi-scale predictions so as to exploit a multi-scale weight scheduling training technique to improve the accuracy.
arXiv Detail & Related papers (2020-03-24T10:27:11Z)
Recalibrating 3D ConvNets with Project & Excite [6.11737116137921]
Convolutional Neural Networks (F-CNNs) achieve state-of-the-art performance for segmentation tasks in computer vision and medical imaging. We extend existing 2D recalibration methods to 3D and propose a generic compress-process-recalibrate pipeline for easy comparison. We demonstrate that PE modules can be easily integrated into 3D F-CNNs, boosting performance up to 0.3 in Dice Score and outperforming 3D extensions of other recalibration blocks.
arXiv Detail & Related papers (2020-02-25T16:07:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.