Related papers: Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation

URL: http://arxiv.org/abs/2310.18656v1
Date: Sat, 28 Oct 2023 09:57:28 GMT
Title: Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation
Authors: Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li
Abstract summary: A dynamic architecture network for medical segmentation (i.e. Med-DANet) has achieved a favorable accuracy and efficiency trade-off. This paper explores a unified formulation of the dynamic inference framework from the perspective of both the data itself and the model structure. Compared with Med-DANet and TransBTS, the framework improves model efficiency by up to nearly 4.1 and 17.3 times, respectively, with comparable segmentation results on BraTS 2019.
Score: 29.082411035685773
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent works have shown that the computational efficiency of 3D medical image (e.g. CT and MRI) segmentation can be impressively improved by dynamic inference based on slice-wise complexity. As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i.e. Med-DANet) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candidate model from the pre-defined model bank for different slices. However, the issues of incomplete data analysis, high training costs, and the two-stage pipeline in Med-DANet require further improvement. To this end, this paper further explores a unified formulation of the dynamic inference framework from the perspective of both the data itself and the model structure. For each slice of the input volume, our proposed method dynamically selects an important foreground region for segmentation based on the policy generated by our Decision Network and Crop Position Network. Besides, we propose to insert a stage-wise quantization selector to the employed segmentation model (e.g. U-Net) for dynamic architecture adapting. Extensive experiments on BraTS 2019 and 2020 show that our method achieves comparable or better performance than previous state-of-the-art methods with much less model complexity. Compared with previous methods Med-DANet and TransBTS with dynamic and static architecture respectively, our framework improves the model efficiency by up to nearly 4.1 and 17.3 times with comparable segmentation results on BraTS 2019.
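
The slice-wise control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `decide_crop` is a hypothetical rule-based stand-in for the learned Decision Network and Crop Position Network, `segment_crop` is a placeholder threshold in place of a segmentation model such as U-Net, and the stage-wise quantization selector is omitted entirely. All function names, parameters, and thresholds are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Slice = List[List[float]]        # one 2D slice: rows of intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def decide_crop(slice_2d: Slice, min_foreground: int = 4) -> Optional[Box]:
    """Hypothetical stand-in for the Decision and Crop Position Networks.

    Returns None to skip the slice, otherwise a bounding box that
    encloses every nonzero pixel.
    """
    coords = [(r, c) for r, row in enumerate(slice_2d)
              for c, v in enumerate(row) if v > 0]
    if len(coords) < min_foreground:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def segment_crop(crop: Slice, threshold: float = 0.5) -> List[List[int]]:
    """Placeholder segmenter (a plain threshold), not a real network."""
    return [[1 if v > threshold else 0 for v in row] for row in crop]


def dynamic_inference(volume: List[Slice]) -> Tuple[List[List[List[int]]], int]:
    """Segment only the selected foreground region of each slice.

    Returns the full-size masks and the number of pixels that were
    actually passed to the segmenter.
    """
    masks = []
    processed_pixels = 0
    for s in volume:
        height, width = len(s), len(s[0])
        mask = [[0] * width for _ in range(height)]
        box = decide_crop(s)
        if box is not None:
            top, left, bottom, right = box
            crop = [row[left:right] for row in s[top:bottom]]
            pred = segment_crop(crop)
            # Paste the crop prediction back into the full-size mask.
            for i, pred_row in enumerate(pred):
                mask[top + i][left:right] = pred_row
            processed_pixels += (bottom - top) * (right - left)
        masks.append(mask)
    return masks, processed_pixels


# Toy volume: one empty slice (skipped) and one with a 2x2 bright block.
empty = [[0.0] * 4 for _ in range(4)]
lesion = [[0.0] * 4 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        lesion[r][c] = 0.9
masks, processed_pixels = dynamic_inference([empty, lesion])
```

In this toy volume the segmenter sees only the cropped 2x2 region of the second slice rather than all 32 pixels, which is the kind of saving the dynamic policy is meant to provide.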

Related papers

ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships. Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands. We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z)
VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction [21.061630022134203]
In-Context Operator Networks (ICONs) learn operators across different types of PDEs using a few-shot, in-context approach. Existing methods treat each data point as a single token, and suffer from computational inefficiency when processing dense data. We propose Vision In-Context Operator Networks (VICON), incorporating a vision transformer architecture that efficiently processes 2D functions through patch-wise operations.
arXiv Detail & Related papers (2024-11-25T03:25:17Z)
AMAES: Augmented Masked Autoencoder Pretraining on Public Brain MRI Data for 3D-Native Segmentation [2.0749231618270803]
This study investigates the impact of self-supervised pretraining of 3D semantic segmentation models on a large-scale, domain-specific dataset. We introduce BRAINS-45K, a dataset of 44,756 brain MRI volumes from public sources, the largest public dataset available.
arXiv Detail & Related papers (2024-08-01T15:27:48Z)
Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation [49.56131393810713]
We present an SE(3) equivariant architecture and a training strategy to tackle this task in an unsupervised manner. Our method excels in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs.
arXiv Detail & Related papers (2023-06-08T22:55:32Z)
Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets. We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes. We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
arXiv Detail & Related papers (2022-11-16T21:55:05Z)
Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories. We first contribute the Product1M datasets, and define two practical instance-level retrieval tasks. We train a more effective cross-modal model that can adaptively incorporate key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation [13.158995287578316]
We propose a dynamic architecture network named Med-DANet to achieve an effective trade-off between accuracy and efficiency. For each slice of the input 3D MRI volume, our proposed method learns a slice-specific decision by the Decision Network. Our proposed method achieves comparable or better results than previous state-of-the-art methods for 3D MRI brain tumor segmentation.
arXiv Detail & Related papers (2022-06-14T03:25:58Z)
Learning Intermediate Representations using Graph Neural Networks for NUMA and Prefetchers Optimization [1.3999481573773074]
This paper demonstrates how the static Intermediate Representation (IR) of the code can guide NUMA/prefetcher optimizations without the prohibitive cost of performance profiling. We show that our static intermediate representation based model achieves 80% of the performance gains provided by expensive dynamic performance profiling based strategies.
arXiv Detail & Related papers (2022-03-01T16:51:30Z)
DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference. It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity. Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
arXiv Detail & Related papers (2021-07-16T04:58:58Z)
Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation [27.624291416260185]
We propose a Dynamic Dual Sampling Module (DDSM) to conduct dynamic affinity modeling and propagate semantic context to local details. Experimental results on both the Cityscapes and CamVid datasets validate the effectiveness and efficiency of the proposed approach.
arXiv Detail & Related papers (2021-05-25T04:25:47Z)
A Progressive Sub-Network Searching Framework for Dynamic Inference [33.93841415140311]
We propose a progressive sub-net searching framework, which is embedded with several effective techniques, including trainable noise ranking, channel grouping, fine-tuning threshold setting, and sub-net re-selection. Our proposed method achieves much better dynamic inference accuracy than the popular Universally-Slimmable-Network, by up to 4.4% and by 2.3% on average, on the ImageNet dataset with the same model size.
arXiv Detail & Related papers (2020-09-11T22:56:02Z)
Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition [57.98278794950759]
Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model irregular data. We present a novel spatial-temporal GCN architecture which is defined via the Poincaré geometry. We evaluate our method on the two largest-scale 3D datasets currently available.
arXiv Detail & Related papers (2020-07-30T18:23:18Z)
Dynamic Memory Induction Networks for Few-Shot Text Classification [84.88381813651971]
This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification. The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets, improving the best performance (accuracy) by 24%.
arXiv Detail & Related papers (2020-05-12T12:41:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.