U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation
- URL: http://arxiv.org/abs/2406.02918v3
- Date: Thu, 22 Aug 2024 11:55:56 GMT
- Title: U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation
- Authors: Chenxin Li, Xinyu Liu, Wuyang Li, Cheng Wang, Hengyu Liu, Yifan Liu, Zhen Chen, Yixuan Yuan
- Abstract summary: Kolmogorov-Arnold Networks (KANs) reshape neural network learning via stacks of non-linear, learnable activation functions.
We investigate, modify and re-design the established U-Net pipeline by integrating dedicated KAN layers on the tokenized intermediate representation, termed U-KAN.
We further delve into the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability in generating task-oriented model architectures.
- Score: 48.40120035775506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: U-Net has become a cornerstone in various visual applications such as image segmentation and diffusion probability models. While numerous innovative designs and improvements have been introduced by incorporating transformers or MLPs, the networks remain limited to linearly modeling patterns and suffer from deficient interpretability. To address these challenges, our intuition is inspired by the impressive results of Kolmogorov-Arnold Networks (KANs) in terms of accuracy and interpretability, which reshape neural network learning via stacks of non-linear learnable activation functions derived from the Kolmogorov-Arnold representation theorem. Specifically, in this paper, we explore the untapped potential of KANs in improving backbones for vision tasks. We investigate, modify and re-design the established U-Net pipeline by integrating dedicated KAN layers on the tokenized intermediate representation, termed U-KAN. Rigorous medical image segmentation benchmarks verify the superiority of U-KAN, which achieves higher accuracy even at lower computation cost. We further delve into the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability in generating task-oriented model architectures. These endeavours unveil valuable insights and shed light on the prospect that, with U-KAN, you can make a strong backbone for medical image segmentation and generation. Project page: https://yes-u-kan.github.io/.
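To make the core idea concrete, below is a minimal PyTorch-style sketch of a KAN-style layer applied to tokenized feature maps, in the spirit of U-KAN's tokenized bottleneck blocks. The original KAN parameterizes per-edge activations with B-splines; this sketch substitutes a learnable Gaussian radial-basis expansion purely for brevity. All class names, hyperparameters, and the RBF simplification are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KANLinear(nn.Module):
    """KAN-style layer: a learnable activation per input-output edge.

    NOTE: the Gaussian RBF basis below is an assumed simplification of the
    B-spline parameterization used in the original KAN paper.
    """
    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers", torch.linspace(*grid_range, num_basis))
        self.coef = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, num_basis))
        self.base = nn.Linear(in_dim, out_dim)  # residual "base" path

    def forward(self, x):                        # x: (..., in_dim)
        # Evaluate the basis per scalar input: (..., in_dim, num_basis)
        basis = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)
        # Sum the learnable edge activations over inputs: (..., out_dim)
        spline = torch.einsum("...ib,oib->...o", basis, self.coef)
        return self.base(F.silu(x)) + spline

class TokenizedKANBlock(nn.Module):
    """Flatten a feature map into tokens, then mix channels with a KAN layer,
    mirroring U-KAN's idea of KAN layers on tokenized representations."""
    def __init__(self, channels, num_basis=8):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.kan = KANLinear(channels, channels, num_basis)

    def forward(self, feat):                     # feat: (B, C, H, W)
        B, C, H, W = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)         # (B, H*W, C)
        tokens = tokens + self.kan(self.norm(tokens))    # residual KAN mixing
        return tokens.transpose(1, 2).reshape(B, C, H, W)

# Drop-in usage at a coarse U-Net stage (shapes are illustrative):
block = TokenizedKANBlock(channels=64)
out = block(torch.randn(2, 64, 16, 16))          # -> (2, 64, 16, 16)
```

Under the abstract's framing, a U-Net whose coarser stages use such tokenized KAN blocks can serve either as a segmentation backbone or as the noise predictor inside a diffusion model.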
Related papers
- Residual Kolmogorov-Arnold Network for Enhanced Deep Learning [0.5852077003870417]
We introduce the Residual Kolmogorov-Arnold Network (RKAN), which incorporates the KAN framework as a residual component.
Our results demonstrate the potential of RKAN to enhance the capabilities of deep CNNs on visual data.
arXiv Detail & Related papers (2024-10-07T21:12:32Z) - Kolmogorov-Arnold Network Autoencoders [0.0]
Kolmogorov-Arnold Networks (KANs) are promising alternatives to Multi-Layer Perceptrons (MLPs).
KANs align closely with the Kolmogorov-Arnold representation theorem, potentially enhancing both model accuracy and interpretability.
Our results demonstrate that KAN-based autoencoders achieve competitive performance in terms of reconstruction accuracy.
arXiv Detail & Related papers (2024-10-02T22:56:00Z) - Neural Residual Diffusion Models for Deep Scalable Vision Generation [17.931568104324985]
We propose a unified and massively scalable Neural Residual Diffusion Models framework (Neural-RDM).
The proposed neural residual models obtain state-of-the-art scores on image and video generation benchmarks.
arXiv Detail & Related papers (2024-06-19T04:57:18Z) - Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized visual prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - SODA: Bottleneck Diffusion Models for Representation Learning [75.7331354734152]
We introduce SODA, a self-supervised diffusion model, designed for representation learning.
The model incorporates an image encoder, which distills a source view into a compact representation, that guides the generation of related novel views.
We show that by imposing a tight bottleneck between the encoder and a denoising decoder, we can turn diffusion models into strong representation learners.
arXiv Detail & Related papers (2023-11-29T18:53:34Z) - NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters compared to the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z) - CoDeGAN: Contrastive Disentanglement for Generative Adversarial Network [0.5437298646956507]
Disentanglement, a critical concern in interpretable machine learning, has also garnered significant attention from the computer vision community.
We propose CoDeGAN, where we relax similarity constraints for disentanglement from the image domain to the feature domain.
We integrate self-supervised pre-training into CoDeGAN to learn semantic representations, significantly facilitating unsupervised disentanglement.
arXiv Detail & Related papers (2021-03-05T12:44:22Z)