Multimodal Contrastive Pretraining of CBCT and IOS for Enhanced Tooth Segmentation
- URL: http://arxiv.org/abs/2509.07923v1
- Date: Tue, 09 Sep 2025 17:05:04 GMT
- Title: Multimodal Contrastive Pretraining of CBCT and IOS for Enhanced Tooth Segmentation
- Authors: Moo Hyun Son, Juyoung Bae, Zelin Qiu, Jiale Peng, Kai Xin Li, Yifan Lin, Hao Chen
- Abstract summary: We present ToothMCL, a Tooth Multimodal Contrastive Learning framework for pretraining that integrates volumetric (CBCT) and surface-based (IOS) modalities. Our approach effectively models fine-grained anatomical features, enabling precise multi-class segmentation and accurate identification of Fédération Dentaire Internationale (FDI) tooth numbering.
- Score: 8.574756499299374
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Digital dentistry represents a transformative shift in modern dental practice. The foundational step in this transformation is the accurate digital representation of the patient's dentition, which is obtained from segmented Cone-Beam Computed Tomography (CBCT) and Intraoral Scans (IOS). Despite the growing interest in digital dental technologies, existing segmentation methodologies frequently lack rigorous validation and demonstrate limited performance and clinical applicability. To the best of our knowledge, this is the first work to introduce a multimodal pretraining framework for tooth segmentation. We present ToothMCL, a Tooth Multimodal Contrastive Learning framework for pretraining that integrates volumetric (CBCT) and surface-based (IOS) modalities. By capturing modality-invariant representations through multimodal contrastive learning, our approach effectively models fine-grained anatomical features, enabling precise multi-class segmentation and accurate identification of Fédération Dentaire Internationale (FDI) tooth numbering. Along with the framework, we curated CBCT-IOS3.8K, the largest paired CBCT and IOS dataset to date, comprising 3,867 patients. We then evaluated ToothMCL on a comprehensive collection of independent datasets, representing the largest and most diverse evaluation to date. Our method achieves state-of-the-art performance in both internal and external testing, with an increase of 12% for CBCT segmentation and 8% for IOS segmentation in the Dice Similarity Coefficient (DSC). Furthermore, ToothMCL consistently surpasses existing approaches across tooth groups and demonstrates robust generalizability across varying imaging conditions and clinical scenarios.
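The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of what a paired CBCT/IOS contrastive loss could look like, assuming a symmetric InfoNCE (CLIP-style) formulation. The function names, the temperature value, and the random toy embeddings are all hypothetical and are not taken from ToothMCL.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(cbct_emb, ios_emb, temperature=0.07):
    # Row i of cbct_emb and row i of ios_emb are assumed to come from the same patient.
    z_c = l2_normalize(cbct_emb)
    z_s = l2_normalize(ios_emb)
    logits = z_c @ z_s.T / temperature           # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])             # positives sit on the diagonal
    loss_c2s = -log_softmax(logits, axis=1)[idx, idx].mean()  # CBCT -> IOS direction
    loss_s2c = -log_softmax(logits, axis=0)[idx, idx].mean()  # IOS -> CBCT direction
    return 0.5 * (loss_c2s + loss_s2c)

# Toy embeddings standing in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
cbct = rng.normal(size=(8, 32))
ios_aligned = cbct + 0.01 * rng.normal(size=(8, 32))  # nearly identical pairs
ios_misaligned = np.roll(cbct, 1, axis=0)             # every pair deliberately mismatched

aligned_loss = symmetric_info_nce(cbct, ios_aligned)
misaligned_loss = symmetric_info_nce(cbct, ios_misaligned)
print(aligned_loss, misaligned_loss)
```

In this toy setup the loss is low when the two modalities of the same patient map to nearby embeddings and high when pairs are mismatched, which is the sense in which such an objective encourages modality-invariant representations.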
Related papers
- MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation [55.37355146924576]
MedSeqFT is a sequential fine-tuning framework for medical image analysis. It adapts pre-trained models to new tasks while refining their representational capacity. It consistently outperforms state-of-the-art fine-tuning strategies.
arXiv Detail & Related papers (2025-09-07T15:22:53Z) - Adapting Foundation Model for Dental Caries Detection with Dual-View Co-Training [53.77904429789069]
We present Attention-TNet, a novel Dual-View Co-Training network for accurate dental caries detection. It starts by employing automated tooth detection to establish two complementary views: a global view from panoramic X-ray images and a local view from cropped tooth images. To effectively integrate information from both views, we introduce a Gated Cross-View module.
arXiv Detail & Related papers (2025-08-28T14:13:26Z) - Multi-Phase Automated Segmentation of Dental Structures in CBCT Using a Lightweight Auto3DSeg and SegResNet Implementation [0.0]
Cone-beam computed tomography (CBCT) has become an invaluable imaging modality in dentistry, enabling 3D visualization of teeth and surrounding structures for diagnosis and treatment planning. We describe the DLaBella29 team's approach for the MICCAI 2025 ToothFairy3 Challenge, which involves a deep learning pipeline for multi-class tooth segmentation. Key preprocessing steps included image resampling to 0.6 mm isotropic resolution and intensity clipping. Our method achieved an average Dice of 0.87 on the ToothFairy3 challenge out-of-sample validation set.
arXiv Detail & Related papers (2025-08-18T14:35:26Z) - A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT [67.34586036959793]
There is no fully annotated CT dataset with all anatomies delineated for training. We propose a novel continual learning-driven CT model that can segment complete anatomies. Our single unified CT segmentation model, CL-Net, can highly accurately segment a clinically comprehensive set of 235 fine-grained whole-body anatomies.
arXiv Detail & Related papers (2025-03-16T23:55:02Z) - A Multi-Stage Framework for 3D Individual Tooth Segmentation in Dental CBCT [7.6057981800052845]
Cone beam computed tomography (CBCT) is a common way of diagnosing dental diseases.
Deep learning based methods have achieved convincing results in medical image processing.
We propose a multi-stage framework for 3D tooth related generalization in dental CBCT.
arXiv Detail & Related papers (2024-07-15T04:23:28Z) - Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image Modeling for CBCT Tooth Segmentation [9.373643627609336]
Tooth identification and segmentation in Cone Beam Computed Tomography (CBCT) dental images can significantly enhance the efficiency and precision of manual diagnoses performed by dentists. Existing segmentation methods are mainly developed by training on large data volumes, whose annotation is extremely time-consuming. This study proposes a task-oriented Masked Auto-Encoder paradigm to effectively utilize large amounts of unlabeled data to achieve accurate tooth segmentation with limited labeled data.
arXiv Detail & Related papers (2024-02-07T05:05:21Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - A Deep Learning Approach to Teeth Segmentation and Orientation from Panoramic X-rays [0.0]
We present a comprehensive approach to teeth segmentation and orientation from panoramic X-ray images, leveraging deep-learning techniques. We built an end-to-end instance segmentation network that uses an encoder-decoder architecture reinforced with grid-aware attention gates. We introduce oriented bounding box (OBB) generation through principal component analysis (PCA) for precise tooth orientation estimation.
arXiv Detail & Related papers (2023-10-26T06:01:25Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - AI-enabled Automatic Multimodal Fusion of Cone-Beam CT and Intraoral Scans for Intelligent 3D Tooth-Bone Reconstruction and Clinical Applications [29.065668174732014]
A critical step in virtual dental treatment planning is to accurately delineate all tooth-bone structures from CBCT.
Previous studies have established several methods for CBCT segmentation using deep learning.
Here, we present a Deep Dental Multimodal Analysis framework consisting of a CBCT segmentation model, an intraoral scan (IOS) segmentation model, and a fusion model to generate 3D fused crown-root-bone structures.
arXiv Detail & Related papers (2022-03-11T07:50:15Z) - Two-Stage Mesh Deep Learning for Automated Tooth Segmentation and Landmark Localization on 3D Intraoral Scans [56.55092443401416]
iMeshSegNet in the first stage of TS-MDL reached an average Dice similarity coefficient (DSC) of $0.953 \pm 0.076$, significantly outperforming the original MeshSegNet.
PointNet-Reg achieved a mean absolute error (MAE) of $0.623 \pm 0.718$ mm in distances between the prediction and ground truth for $44$ landmarks, which is superior to other networks for landmark detection.
arXiv Detail & Related papers (2021-09-24T13:00:26Z) - Co-Heterogeneous and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion Segmentation [48.504790189796836]
We present a novel segmentation strategy, co-heterogeneous and adaptive segmentation (CHASe).
We propose a versatile framework that fuses appearance based semi-supervision, mask based adversarial domain adaptation, and pseudo-labeling.
CHASe can further improve pathological liver mask Dice-Sorensen coefficients by ranges of $4.2\% \sim 9.4\%$.
arXiv Detail & Related papers (2020-05-27T06:58:39Z)
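Several of the entries above report results as Dice scores. As a reminder of what that metric measures, here is a minimal sketch of a per-label Dice Similarity Coefficient, DSC = 2|P ∩ T| / (|P| + |T|), on integer label volumes. The function name and the toy volume are hypothetical; the labels 11 and 21 are FDI numbers used purely for illustration.

```python
import numpy as np

def dice_per_label(pred, target, labels):
    # Compute DSC = 2|P ∩ T| / (|P| + |T|) separately for each label value.
    scores = {}
    for label in labels:
        p = pred == label
        t = target == label
        denom = p.sum() + t.sum()
        if denom == 0:
            # Label absent from both volumes: Dice is undefined, report NaN.
            scores[label] = float("nan")
        else:
            scores[label] = 2.0 * np.logical_and(p, t).sum() / denom
    return scores

# Toy 4x4x4 volume with two labelled teeth (hypothetical data).
target = np.zeros((4, 4, 4), dtype=int)
target[0:2, 0:2, 0:2] = 11   # 8 voxels labelled as tooth 11
target[2:4, 2:4, 2:4] = 21   # 8 voxels labelled as tooth 21

pred = target.copy()
pred[0, 0, 0:2] = 0          # the prediction misses 2 voxels of tooth 11

scores = dice_per_label(pred, target, labels=[11, 21])
print(scores)
```

Here tooth 11 scores 2·6 / (6 + 8) = 12/14 and tooth 21 scores 1.0; averaging such per-label values over a test set gives summary figures of the kind quoted in the abstracts.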
This list is automatically generated from the titles and abstracts of the papers in this site.