DE-KAN: A Kolmogorov Arnold Network with Dual Encoder for accurate 2D Teeth Segmentation
- URL: http://arxiv.org/abs/2511.18533v1
- Date: Sun, 23 Nov 2025 16:56:20 GMT
- Title: DE-KAN: A Kolmogorov Arnold Network with Dual Encoder for accurate 2D Teeth Segmentation
- Authors: Md Mizanur Rahman Mustakim, Jianwu Li, Sumya Bhuiyan, Mohammad Mehedi Hasan, Bing Han
- Abstract summary: We propose a novel Dual Kolmogorov Arnold Network, which enhances feature representation and segmentation precision. The framework employs a ResNet-18 encoder for augmented inputs and a customized CNN encoder for original inputs, enabling the complementary extraction of global and local spatial features. Experiments on two benchmark dental X-ray datasets demonstrate that DE-KAN outperforms state-of-the-art segmentation models.
- Score: 6.540491857899706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate segmentation of individual teeth from panoramic radiographs remains a challenging task due to anatomical variations, irregular tooth shapes, and overlapping structures. These complexities often limit the performance of conventional deep learning models. To address this, we propose DE-KAN, a novel Dual Encoder Kolmogorov Arnold Network, which enhances feature representation and segmentation precision. The framework employs a ResNet-18 encoder for augmented inputs and a customized CNN encoder for original inputs, enabling the complementary extraction of global and local spatial features. These features are fused through KAN-based bottleneck layers, incorporating nonlinear learnable activation functions derived from the Kolmogorov Arnold representation theorem to improve learning capacity and interpretability. Extensive experiments on two benchmark dental X-ray datasets demonstrate that DE-KAN outperforms state-of-the-art segmentation models, achieving mIoU of 94.5%, Dice coefficient of 97.1%, accuracy of 98.91%, and recall of 97.36%, representing up to +4.7% improvement in Dice compared to existing methods.
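The abstract reports IoU, Dice, accuracy, and recall. As a rough illustration of how these are conventionally computed for one predicted mask against one ground-truth mask, here is a minimal pure-Python sketch. It assumes the standard binary definitions; the helper names (`confusion_counts`, `dice`, `iou`, `recall`, `accuracy`) and the toy masks are hypothetical, and the abstract does not say how the paper averages scores across teeth or images.

```python
from typing import Sequence, Tuple

# A 2D binary mask: 1 marks a tooth pixel, 0 marks background (assumed encoding).
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _ratio(numerator: int, denominator: int) -> float:
    # Convention chosen for this sketch: an empty denominator scores 1.0.
    return 1.0 if denominator == 0 else numerator / denominator


def dice(pred: Mask, target: Mask) -> float:
    tp, fp, fn, _ = confusion_counts(pred, target)
    return _ratio(2 * tp, 2 * tp + fp + fn)


def iou(pred: Mask, target: Mask) -> float:
    tp, fp, fn, _ = confusion_counts(pred, target)
    return _ratio(tp, tp + fp + fn)


def recall(pred: Mask, target: Mask) -> float:
    tp, _, fn, _ = confusion_counts(pred, target)
    return _ratio(tp, tp + fn)


def accuracy(pred: Mask, target: Mask) -> float:
    tp, fp, fn, tn = confusion_counts(pred, target)
    return _ratio(tp + tn, tp + fp + fn + tn)


# Toy masks, not data from the paper.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(f"Dice={dice(pred, target):.3f} IoU={iou(pred, target):.3f} "
      f"recall={recall(pred, target):.3f} accuracy={accuracy(pred, target):.3f}")
```

For a single mask pair the two overlap scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. A mean IoU (mIoU) would average `iou` over classes or images, and that averaging step is not shown here.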
Related papers
- An Efficient Dual-Line Decoder Network with Multi-Scale Convolutional Attention for Multi-organ Segmentation [5.6873464177873245]
This paper introduces an efficient dual-line decoder segmentation network (EDLDNet). The proposed method features a noisy decoder, which learns to incorporate structured perturbation at training time for better model robustness. By leveraging multi-scale segmentation masks from both decoders, we also utilize a mutation-based loss function to enhance the model's generalization.
arXiv Detail & Related papers (2025-08-23T12:34:27Z)
- Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation [8.891804836416275]
We propose a novel semantic segmentation network, namely DeepKANSeg. First, we introduce a KAN-based deep feature refinement module, namely DeepKAN. Second, we replace the traditional multi-layer perceptron (MLP) layers in the global-local combined decoder with KAN-based linear layers, namely GLKAN.
arXiv Detail & Related papers (2025-01-13T15:06:51Z)
- DPE-Net: Dual-Parallel Encoder Based Network for Semantic Segmentation of Polyps [0.0]
In medical imaging, efficient segmentation of colon polyps plays a pivotal role in minimally invasive solutions for colorectal cancer. This study introduces a novel approach employing two parallel encoder branches within a network for polyp segmentation. One branch of the encoder incorporates the dual convolution blocks that have the capability to maintain feature information over increased depths. The other block embraces the single convolution block with the addition of the previous layer's feature, offering diversity in feature extraction within the encoder.
arXiv Detail & Related papers (2024-12-01T16:56:03Z)
- An Effective UNet Using Feature Interaction and Fusion for Organ Segmentation in Medical Image [5.510679875888542]
A novel U-shaped model is proposed to address the above issue, including three plug-and-play modules. A channel spatial interaction module is introduced to improve the quality of skip connection features by modeling inter-stage interactions between the encoder and decoder. A channel attention-based module integrating squeeze-and-excitation mechanisms with convolutional layers is employed in the decoder blocks to strengthen the representation of critical features while suppressing irrelevant ones. A multi-level fusion module is designed to aggregate multi-scale decoder features, improving spatial detail and consistency in the final prediction.
arXiv Detail & Related papers (2024-09-09T04:34:47Z)
- Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images [67.66644395272075]
We present the first analysis of state-of-the-art semantic segmentation models when faced with geometric out-of-distribution (OOD) data.
We propose an augmentation technique called "Organ Transplantation" to enhance generalizability.
Our augmentation technique improves state-of-the-art model performance by up to 67% for RGB data and 90% for HSI data, reaching in-distribution performance levels on real OOD test data.
arXiv Detail & Related papers (2024-08-27T19:13:15Z)
- U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation [48.40120035775506]
Kolmogorov-Arnold Networks (KANs) reshape the neural network learning via the stack of non-linear learnable activation functions.
We investigate, modify and re-design the established U-Net pipeline by integrating the dedicated KAN layers on the tokenized intermediate representation, termed U-KAN.
We further delved into the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability in generating task-oriented model architectures.
arXiv Detail & Related papers (2024-06-05T04:13:03Z)
- LHU-Net: a Lean Hybrid U-Net for Cost-efficient, High-performance Volumetric Segmentation [4.168081528698768]
We propose LHU-Net, a Lean Hybrid U-Net for volumetric medical image segmentation. LHU-Net prioritizes spatial feature extraction before refining channel features, optimizing both efficiency and accuracy. It is evaluated on four benchmark datasets (Synapse, Left Atrial, BraTS-Decathlon, and Lung-Decathlon).
arXiv Detail & Related papers (2024-04-07T22:58:18Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Unite-Divide-Unite: Joint Boosting Trunk and Structure for High-accuracy Dichotomous Image Segmentation [48.995367430746086]
High-accuracy Dichotomous Image Segmentation (DIS) aims to pinpoint category-agnostic foreground objects from natural scenes.
We introduce a novel Unite-Divide-Unite Network (UDUN) that restructures and bipartitely arranges complementary features to boost the effectiveness of trunk and structure identification.
Using 1024*1024 input, our model enables real-time inference at 65.3 fps with ResNet-18.
arXiv Detail & Related papers (2023-07-26T09:04:35Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce a split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation [82.61182037130406]
This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations.
The basic architecture in image segmentation consists of an encoder and a decoder.
We compare several variants of the DeepLab architecture obtained by varying the decoder backbone.
arXiv Detail & Related papers (2021-04-02T02:07:37Z)
- SAR-U-Net: squeeze-and-excitation block and atrous spatial pyramid pooling based residual U-Net for automatic liver CT segmentation [3.192503074844775]
A modified U-Net based framework is presented, which leverages techniques from Squeeze-and-Excitation (SE) block, Atrous Spatial Pyramid Pooling (ASPP) and residual learning.
The effectiveness of the proposed method was tested on two public datasets, LiTS17 and SLiver07.
arXiv Detail & Related papers (2021-03-11T02:32:59Z)
- End-to-End Multi-speaker Speech Recognition with Transformer [88.22355110349933]
We replace the RNN-based encoder-decoder in the speech recognition model with a Transformer architecture.
We also modify the self-attention component to be restricted to a segment rather than the whole sequence in order to reduce computation.
arXiv Detail & Related papers (2020-02-10T16:29:26Z)