Adapting Foundation Model for Dental Caries Detection with Dual-View Co-Training
- URL: http://arxiv.org/abs/2508.20813v1
- Date: Thu, 28 Aug 2025 14:13:26 GMT
- Title: Adapting Foundation Model for Dental Caries Detection with Dual-View Co-Training
- Authors: Tao Luo, Han Wu, Tong Yang, Dinggang Shen, Zhiming Cui
- Abstract summary: We present DVCTNet, a novel Dual-View Co-Training network for accurate dental caries detection. DVCTNet starts by employing automated tooth detection to establish two complementary views: a global view from panoramic X-ray images and a local view from cropped tooth images. To effectively integrate information from both views, we introduce a Gated Cross-View Attention module.
- Score: 53.77904429789069
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate dental caries detection from panoramic X-rays plays a pivotal role in preventing lesion progression. However, current detection methods often yield suboptimal accuracy due to subtle contrast variations and diverse lesion morphology of dental caries. In this work, inspired by the clinical workflow where dentists systematically combine whole-image screening with detailed tooth-level inspection, we present DVCTNet, a novel Dual-View Co-Training network for accurate dental caries detection. Our DVCTNet starts with employing automated tooth detection to establish two complementary views: a global view from panoramic X-ray images and a local view from cropped tooth images. We then pretrain two vision foundation models separately on the two views. The global-view foundation model serves as the detection backbone, generating region proposals and global features, while the local-view model extracts detailed features from corresponding cropped tooth patches matched by the region proposals. To effectively integrate information from both views, we introduce a Gated Cross-View Attention (GCV-Atten) module that dynamically fuses dual-view features, enhancing the detection pipeline by integrating the fused features back into the detection model for final caries detection. To rigorously evaluate our DVCTNet, we test it on a public dataset and further validate its performance on a newly curated, high-precision dental caries detection dataset, annotated using both intra-oral images and panoramic X-rays for double verification. Experimental results demonstrate DVCTNet's superior performance against existing state-of-the-art (SOTA) methods on both datasets, indicating the clinical applicability of our method. Our code and labeled dataset are available at https://github.com/ShanghaiTech-IMPACT/DVCTNet.
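The gated dual-view fusion described in the abstract can be sketched in simplified form. This is a minimal illustration under assumptions, not the paper's implementation: the actual GCV-Atten module applies attention over feature maps, whereas this sketch reduces the gate to a single learned scalar per region proposal; the function and parameter names (`gated_fuse`, `w`, `b`) are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(global_feat: list[float], local_feat: list[float],
               w: list[float], b: float) -> list[float]:
    """Fuse a proposal's global-view and local-view features with a scalar gate.

    gate  = sigmoid(w . [global; local] + b)
    fused = gate * global + (1 - gate) * local
    """
    concat = global_feat + local_feat
    gate = sigmoid(sum(wi * xi for wi, xi in zip(w, concat)) + b)
    return [gate * g + (1.0 - gate) * l
            for g, l in zip(global_feat, local_feat)]
```

With zero weights and bias the gate is 0.5, so the fused feature is simply the average of the two views; training the gate lets the model lean on whichever view is more informative for a given tooth.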
Related papers
- DentalX: Context-Aware Dental Disease Detection with Radiographs [44.3806898357896]
Diagnosing dental diseases from radiographs is time-consuming and challenging due to the subtle nature of diagnostic evidence. Existing methods, which rely on object detection models, struggle to detect dental diseases that present with far less visual support. We propose DentalX, a novel context-aware dental disease detection approach.
arXiv Detail & Related papers (2026-01-13T18:32:28Z) - Tooth-Diffusion: Guided 3D CBCT Synthesis with Fine-Grained Tooth Conditioning [0.0]
We propose a conditional diffusion framework for 3D dental volume generation guided by tooth-level binary attributes. Our approach integrates wavelet-based denoising diffusion, FiLM conditioning, and masked loss functions to focus learning on relevant anatomical structures. Results show strong fidelity and generalization with low FID scores, robust inpainting performance, and SSIM values above 0.91 even on unseen scans.
arXiv Detail & Related papers (2025-08-19T21:21:35Z) - Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our methods process pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach exploits the potential of joint vision-language anomaly detection and achieves performance comparable to current SOTA methods across various datasets.
arXiv Detail & Related papers (2024-05-08T03:13:20Z) - Real-time guidewire tracking and segmentation in intraoperative x-ray [52.51797358201872]
We propose a two-stage deep learning framework for real-time guidewire segmentation and tracking.
In the first stage, a YOLOv5 detector is trained on both original and synthetic X-ray images to output bounding boxes of possible target guidewires.
In the second stage, a novel and efficient network is proposed to segment the guidewire in each detected bounding box.
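The two-stage detect-then-segment pipeline above can be outlined as a skeleton. This is an illustrative sketch under assumptions, not the paper's code: `detect` and `segment` stand in for the trained YOLOv5 detector and the second-stage segmentation network, images are plain row-major lists, and all names are hypothetical.

```python
def crop_patch(image: list[list[int]],
               box: tuple[int, int, int, int]) -> list[list[int]]:
    """Crop an (x1, y1, x2, y2) box from a row-major grayscale image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def two_stage(image, detect, segment):
    """Stage 1: detect candidate boxes; stage 2: segment each cropped patch."""
    return [(box, segment(crop_patch(image, box))) for box in detect(image)]
```

Restricting the segmentation network to small detected crops, rather than the full frame, is what makes the second stage cheap enough for real-time use.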
arXiv Detail & Related papers (2024-04-12T20:39:19Z) - Multiclass Segmentation using Teeth Attention Modules for Dental X-ray
Images [8.041659727964305]
We propose a novel teeth segmentation model incorporating an M-Net-like structure with Swin Transformers and TAB.
The proposed TAB utilizes a unique attention mechanism that focuses specifically on the complex structures of teeth.
The proposed architecture effectively captures local and global contextual information, accurately defining each tooth and its surrounding structures.
arXiv Detail & Related papers (2023-11-07T06:20:34Z) - Improving Classification Model Performance on Chest X-Rays through Lung
Segmentation [63.45024974079371]
We propose a deep learning approach that enhances abnormal chest X-ray (CXR) identification through lung segmentation.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z) - InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which CT imaging process is finely embedded.
We analyze the CT values among different tissues and merge the prior observations into a prior network for our InDuDoNet+, which significantly improves its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z) - Two-Stage Mesh Deep Learning for Automated Tooth Segmentation and Landmark Localization on 3D Intraoral Scans [56.55092443401416]
iMeshSegNet in the first stage of TS-MDL reached an average Dice similarity coefficient (DSC) of $0.953 \pm 0.076$, significantly outperforming the original MeshSegNet.
PointNet-Reg achieved a mean absolute error (MAE) of $0.623 \pm 0.718$ mm in distances between prediction and ground truth for 44 landmarks, which is superior to other networks for landmark detection.
arXiv Detail & Related papers (2021-09-24T13:00:26Z) - SERV-CT: A disparity dataset from CT for validation of endoscopic 3D reconstruction [8.448866668577946]
We present a stereo-endoscopic reconstruction validation dataset based on CT (SERV-CT).
The SERV-CT dataset provides easy-to-use stereoscopic validation for surgical applications, with smooth reference disparities and depths covering the majority of the endoscopic images.
arXiv Detail & Related papers (2020-12-22T01:28:30Z) - An Adaptive Enhancement Based Hybrid CNN Model for Digital Dental X-ray Positions Classification [1.0672152844970149]
A novel solution based on adaptive histogram equalization and a convolutional neural network (CNN) is proposed.
The accuracy and specificity on the test set exceeded 90%, and the AUC reached 0.97.
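For illustration, the enhancement step named in this entry can be sketched in its simplest, global form; the adaptive variant the paper uses applies the same CDF-based mapping per local tile rather than over the whole image. This is a hedged sketch, and the function name `equalize_hist` is hypothetical.

```python
def equalize_hist(pixels: list[int], levels: int = 256) -> list[int]:
    """Global histogram equalization: remap each gray level via the CDF."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    # Stretch the CDF to cover the full output range.
    lut = [round((c - cdf_min) / max(n - cdf_min, 1) * (levels - 1))
           for c in cdf]
    return [lut[p] for p in pixels]
```

A uniform histogram maps to itself, while a histogram concentrated in a narrow band gets stretched across the full range, which is what boosts contrast in low-contrast radiographs.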
arXiv Detail & Related papers (2020-05-01T13:55:44Z) - Pose-Aware Instance Segmentation Framework from Cone Beam CT Images for Tooth Segmentation [9.880428545498662]
Individual tooth segmentation from cone beam computed tomography (CBCT) images is essential for an anatomical understanding of orthodontic structures.
The presence of severe metal artifacts in CBCT images hinders the accurate segmentation of each individual tooth.
We propose a neural network for pixel-wise labeling to exploit an instance segmentation framework that is robust to metal artifacts.
arXiv Detail & Related papers (2020-02-06T07:57:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.