ReCo-KD: Region- and Context-Aware Knowledge Distillation for Efficient 3D Medical Image Segmentation
- URL: http://arxiv.org/abs/2601.08301v1
- Date: Tue, 13 Jan 2026 07:44:43 GMT
- Title: ReCo-KD: Region- and Context-Aware Knowledge Distillation for Efficient 3D Medical Image Segmentation
- Authors: Qizhen Lan, Yu-Chun Hsu, Nida Saddaf Khan, Xiaoqian Jiang
- Abstract summary: Region- and Context-aware Knowledge Distillation (ReCo-KD) is a training-only framework that transfers both fine-grained anatomical detail and long-range contextual information from a high-capacity teacher to a compact student network. We show that ReCo-KD attains accuracy close to the teacher while markedly reducing parameters and inference latency, underscoring its practicality for clinical deployment.
- Score: 6.3354754356733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate 3D medical image segmentation is vital for diagnosis and treatment planning, but state-of-the-art models are often too large for clinics with limited computing resources. Lightweight architectures typically suffer significant performance loss. To address these deployment and speed constraints, we propose Region- and Context-aware Knowledge Distillation (ReCo-KD), a training-only framework that transfers both fine-grained anatomical detail and long-range contextual information from a high-capacity teacher to a compact student network. The framework integrates Multi-Scale Structure-Aware Region Distillation (MS-SARD), which applies class-aware masks and scale-normalized weighting to emphasize small but clinically important regions, and Multi-Scale Context Alignment (MS-CA), which aligns teacher-student affinity patterns across feature levels. Implemented on nnU-Net in a backbone-agnostic manner, ReCo-KD requires no custom student design and is easily adapted to other architectures. Experiments on multiple public 3D medical segmentation datasets and a challenging aggregated dataset show that the distilled lightweight model attains accuracy close to the teacher while markedly reducing parameters and inference latency, underscoring its practicality for clinical deployment.
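The two distillation terms named in the abstract can be illustrated with a rough single-scale sketch. This is not the authors' implementation: the function names, the use of mean squared error, the cosine affinity, and the reading of "scale-normalized weighting" as dividing each class's error by its own voxel count are all assumptions made for illustration. A multi-scale version would presumably sum these terms over several feature levels.

```python
import numpy as np


def region_distill_loss(f_student, f_teacher, labels, num_classes, eps=1e-8):
    """Class-masked feature error (hypothetical stand-in for MS-SARD).

    f_student, f_teacher: feature maps of shape (C, D, H, W), same C assumed.
    labels: integer class map of shape (D, H, W) at the feature resolution.
    Each class's error is divided by its own voxel count, so a small
    structure contributes as much as a large one (assumed normalization).
    """
    sq_err = ((f_student - f_teacher) ** 2).mean(axis=0)  # (D, H, W)
    per_class = []
    for c in range(num_classes):
        mask = labels == c
        n = mask.sum()
        if n == 0:
            continue  # class absent from this patch
        per_class.append(sq_err[mask].sum() / (n + eps))
    return float(np.mean(per_class))


def affinity(f, eps=1e-8):
    """Cosine similarity between all voxel positions, shape (N, N)."""
    flat = f.reshape(f.shape[0], -1)
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + eps)
    return flat.T @ flat


def context_align_loss(f_student, f_teacher):
    """Affinity-matching error (hypothetical stand-in for MS-CA).

    Affinity matrices are N x N, so the student and teacher may have
    different channel counts as long as spatial sizes agree.
    """
    diff = affinity(f_student) - affinity(f_teacher)
    return float(np.mean(diff ** 2))


# Toy usage on random features; shapes and loss weights are arbitrary.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4, 4, 4))
student = rng.normal(size=(8, 4, 4, 4))
labels = rng.integers(0, 3, size=(4, 4, 4))
total = (1.0 * region_distill_loss(student, teacher, labels, num_classes=3)
         + 0.5 * context_align_loss(student, teacher))
```

Because the affinity term compares position-to-position similarity rather than raw features, it does not need a projection layer when the student is narrower than the teacher; the region term as written does assume matching channel counts.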
Related papers
- TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations. TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z) - Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation [61.350584471060756]
Vision-grounded medical report generation aims to produce clinically accurate descriptions of medical images. We propose Self-Supervised Anatomical Consistency Learning (SS-ACL) to align generated reports with corresponding anatomical regions. SS-ACL constructs a hierarchical anatomical graph inspired by the invariant top-down inclusion structure of human anatomy.
arXiv Detail & Related papers (2025-09-30T08:59:06Z) - DiSSECT: Structuring Transfer-Ready Medical Image Representations through Discrete Self-Supervision [9.254163621425727]
DiSSECT is a framework that integrates multi-scale vector quantization into the SSL pipeline to impose a discrete representational bottleneck. It achieves strong performance on both classification and segmentation tasks, requiring minimal or no fine-tuning. We validate DiSSECT across multiple public medical imaging datasets, demonstrating its robustness and generalizability.
arXiv Detail & Related papers (2025-09-23T07:58:21Z) - Foundation Model for Whole-Heart Segmentation: Leveraging Student-Teacher Learning in Multi-Modal Medical Imaging [0.510750648708198]
Whole-heart segmentation from CT and MRI scans is crucial for cardiovascular disease analysis. Existing methods struggle with modality-specific biases and the need for extensive labeled datasets. We propose a foundation model for whole-heart segmentation using a self-supervised learning framework based on a student-teacher architecture.
arXiv Detail & Related papers (2025-03-24T14:47:54Z) - A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT [67.34586036959793]
There is no fully annotated CT dataset with all anatomies delineated for training. We propose a novel continual learning-driven CT model that can segment complete anatomies. Our single unified CT segmentation model, CL-Net, can highly accurately segment a clinically comprehensive set of 235 fine-grained whole-body anatomies.
arXiv Detail & Related papers (2025-03-16T23:55:02Z) - Perspective+ Unet: Enhancing Segmentation with Bi-Path Fusion and Efficient Non-Local Attention for Superior Receptive Fields [19.71033340093199]
We propose a novel architecture, Perspective+ Unet, to overcome limitations in medical image segmentation.
The framework incorporates an efficient non-local transformer block, named ENLTB, which utilizes kernel function approximation for effective long-range dependency capture.
Experimental results on the ACDC and datasets demonstrate the effectiveness of our proposed Perspective+ Unet.
arXiv Detail & Related papers (2024-06-20T07:17:39Z) - MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation [25.74088298769155]
We propose a universal training framework called MedContext for 3D medical segmentation.
Our approach effectively learns self-supervised contextual cues jointly with the supervised voxel segmentation task.
The effectiveness of MedContext is validated across multiple 3D medical datasets and four state-of-the-art model architectures.
arXiv Detail & Related papers (2024-02-27T17:58:05Z) - Leveraging Frequency Domain Learning in 3D Vessel Segmentation [50.54833091336862]
In this study, we leverage Fourier domain learning as a substitute for multi-scale convolutional kernels in 3D hierarchical segmentation models.
We show that our novel network achieves remarkable Dice performance (84.37% on ASACA500 and 80.32% on ImageCAS) in tubular vessel segmentation tasks.
arXiv Detail & Related papers (2024-01-11T19:07:58Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues.
The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.
Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z) - Efficient Medical Image Segmentation Based on Knowledge Distillation [30.857487609003197]
We propose an efficient architecture by distilling knowledge from well-trained medical image segmentation networks to train another lightweight network.
We also devise a novel distillation module tailored for medical image segmentation to transfer semantic region information from teacher to student network.
We demonstrate that a lightweight network distilled by our method has non-negligible value in scenarios that require relatively high operating speed and low storage usage.
arXiv Detail & Related papers (2021-08-23T07:41:10Z)