Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation
- URL: http://arxiv.org/abs/2503.03141v1
- Date: Wed, 05 Mar 2025 03:31:05 GMT
- Title: Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation
- Authors: Chun-Wun Cheng, Yining Zhao, Yanqi Cheng, Javier Montoya, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero,
- Abstract summary: Implicit U-KAN 2.0 is an implicit deep neural network incorporating MultiKAN and second order NODEs, improving interpretability and performance.<n>We conduct extensive experiments on a variety of 2D and a single 3D dataset, demonstrating that our model consistently outperforms existing segmentation networks.
- Score: 9.901684172184858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image segmentation is a fundamental task in both image analysis and medical applications. State-of-the-art methods predominantly rely on encoder-decoder architectures with a U-shaped design, commonly referred to as U-Net. Recent advancements integrating transformers and MLPs improve performance but still face key limitations, such as poor interpretability, difficulty handling intrinsic noise, and constrained expressiveness due to discrete layer structures, often lacking a solid theoretical foundation.In this work, we introduce Implicit U-KAN 2.0, a novel U-Net variant that adopts a two-phase encoder-decoder structure. In the SONO phase, we use a second-order neural ordinary differential equation (NODEs), called the SONO block, for a more efficient, expressive, and theoretically grounded modeling approach. In the SONO-MultiKAN phase, we integrate the second-order NODEs and MultiKAN layer as the core computational block to enhance interpretability and representation power. Our contributions are threefold. First, U-KAN 2.0 is an implicit deep neural network incorporating MultiKAN and second order NODEs, improving interpretability and performance while reducing computational costs. Second, we provide a theoretical analysis demonstrating that the approximation ability of the MultiKAN block is independent of the input dimension. Third, we conduct extensive experiments on a variety of 2D and a single 3D dataset, demonstrating that our model consistently outperforms existing segmentation networks.
Related papers
- BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNN)<n>We propose BHViT, a binarization-friendly hybrid ViT architecture and its full binarization model with the guidance of three important observations.<n>Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z) - BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering the deployment of edge devices.
We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers.
Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z) - E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation [34.865695471451886]
We propose a 3D medical image segmentation model, named Efficient to Efficient Network (E2ENet)
It incorporates two parametrically and computationally efficient designs.
It consistently achieves a superior trade-off between accuracy and efficiency across various resource constraints.
arXiv Detail & Related papers (2023-12-07T22:13:37Z) - Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z) - General-Purpose Multimodal Transformer meets Remote Sensing Semantic
Segmentation [35.100738362291416]
Multimodal AI seeks to exploit complementary data sources, particularly for complex tasks like semantic segmentation.
Recent trends in general-purpose multimodal networks have shown great potential to achieve state-of-the-art performance.
We propose a UNet-inspired module that employs 3D convolution to encode vital local information and learn cross-modal features simultaneously.
arXiv Detail & Related papers (2023-07-07T04:58:34Z) - Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z) - HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel
Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z) - Continuous U-Net: Faster, Greater and Noiseless [2.6163085620813287]
We introduce continuous U-Net, a novel family of networks for image segmentation.
We provide theoretical guarantees for our network demonstrating faster convergence, higher robustness and less sensitivity to noise.
We demonstrate, through extensive numerical and visual results, that our model outperforms existing U-Net blocks for several medical image segmentation benchmarking datasets.
arXiv Detail & Related papers (2023-02-01T17:46:00Z) - D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive
Sensing [9.014593915305069]
Deep unfolding networks (DUNs) have achieved impressive success in compressive sensing (CS)
By unfolding the proposed framework into deep neural networks, we further design a novel Dual-Domain Deep Convolutional Coding Network (D3C2-Net)
Experiments on natural and MR images demonstrate that our D3C2-Net achieves higher performance and better accuracy-complexity trade-offs than other state-of-the-arts.
arXiv Detail & Related papers (2022-07-27T14:52:32Z) - A Data-scalable Transformer for Medical Image Segmentation:
Architecture, Model Efficiency, and Benchmark [45.543140413399506]
MedFormer is a data-scalable Transformer designed for generalizable 3D medical image segmentation.
Our approach incorporates three key elements: a desirable inductive bias, hierarchical modeling with linear-complexity attention, and multi-scale feature fusion.
arXiv Detail & Related papers (2022-02-28T22:59:42Z) - Discrete-Valued Neural Communication [85.3675647398994]
We show that restricting the transmitted information among components to discrete representations is a beneficial bottleneck.
Even though individuals have different understandings of what a "cat" is based on their specific experiences, the shared discrete token makes it possible for communication among individuals to be unimpeded by individual differences in internal representation.
We extend the quantization mechanism from the Vector-Quantized Variational Autoencoder to multi-headed discretization with shared codebooks and use it for discrete-valued neural communication.
arXiv Detail & Related papers (2021-07-06T03:09:25Z) - CovSegNet: A Multi Encoder-Decoder Architecture for Improved Lesion
Segmentation of COVID-19 Chest CT Scans [11.946078871080836]
An automated COVID-19 lesion segmentation scheme is proposed utilizing a highly efficient neural network architecture, namely CovSegNet.
Outstanding performances have been achieved in three publicly available datasets that largely outperform other state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-02T19:26:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.