Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition
- URL: http://arxiv.org/abs/2408.16451v1
- Date: Thu, 29 Aug 2024 11:31:28 GMT
- Title: Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition
- Authors: Yongcun Zhang, Jiajun Xu, Yina He, Shaozi Li, Zhiming Luo, Huangwei Lei
- Abstract summary: Tongue diagnosis in Traditional Chinese Medicine (TCM) is a crucial diagnostic method that can reflect an individual's health status.
Traditional methods for identifying tooth-marked tongues are subjective and inconsistent because they rely on practitioner experience.
We propose WSVM, a novel fully automated Weakly Supervised method using a Vision transformer and Multiple instance learning, for tongue extraction and tooth-marked tongue recognition.
- Score: 19.34036038278796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tongue diagnosis in Traditional Chinese Medicine (TCM) is a crucial diagnostic method that can reflect an individual's health status. Traditional methods for identifying tooth-marked tongues are subjective and inconsistent because they rely on practitioner experience. We propose WSVM, a novel fully automated Weakly Supervised method using a Vision transformer and Multiple instance learning, for tongue extraction and tooth-marked tongue recognition. Our approach first accurately detects and extracts the tongue region from clinical images, removing any irrelevant background information. Then, we implement an end-to-end weakly supervised object detection method. We utilize a Vision Transformer (ViT) to process tongue images in patches and employ a multiple instance loss to identify tooth-marked regions with only image-level annotations. WSVM achieves high accuracy in tooth-marked tongue classification, and visualization experiments demonstrate its effectiveness in pinpointing these regions. This automated approach enhances the objectivity and accuracy of tooth-marked tongue diagnosis. It provides significant clinical value by assisting TCM practitioners in making precise diagnoses and treatment recommendations. Code is available at https://github.com/yc-zh/WSVM.
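The core weakly supervised idea described above, scoring each ViT patch and aggregating the patch scores into one image-level prediction so that only image-level labels are needed, can be sketched as follows. This is a minimal NumPy illustration of multiple instance learning with max pooling; the paper's actual architecture and loss are in the linked repository, and the function names here are illustrative, not taken from that code.

```python
import numpy as np

def mil_image_score(patch_scores):
    """Aggregate per-patch tooth-mark probabilities into one
    image-level score via max pooling: the image is positive
    if at least one patch instance is positive."""
    return float(np.max(patch_scores))

def mil_loss(patch_scores, image_label):
    """Binary cross-entropy on the aggregated score, so training
    needs only the image-level annotation (weak supervision);
    at inference, high-scoring patches localize the tooth marks."""
    p = np.clip(mil_image_score(patch_scores), 1e-7, 1 - 1e-7)
    return -(image_label * np.log(p) + (1 - image_label) * np.log(1 - p))
```

A positive image with one confident patch yields a small loss, while a negative image is penalized for any high-scoring patch; that asymmetry is how patch-level localization emerges from image-level labels alone.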
Related papers
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Ammonia-Net: A Multi-task Joint Learning Model for Multi-class Segmentation and Classification in Tooth-marked Tongue Diagnosis [12.095100353695038]
In Traditional Chinese Medicine, the tooth marks on the tongue serve as a crucial indicator for assessing qi (yang) deficiency.
To address these problems, we propose a multi-task joint learning model named Ammonia-Net.
Ammonia-Net performs semantic segmentation of tongue images to identify tongue and tooth marks.
arXiv Detail & Related papers (2023-10-05T11:28:32Z)
- Self-Supervised Learning with Masked Image Modeling for Teeth Numbering, Detection of Dental Restorations, and Instance Segmentation in Dental Panoramic Radiographs [8.397847537464534]
This study aims to utilize recent self-supervised learning methods like SimMIM and UM-MAE to increase the model efficiency and understanding of the limited number of dental radiographs.
To the best of our knowledge, this is the first study that applied self-supervised learning methods to Swin Transformer on dental panoramic radiographs.
arXiv Detail & Related papers (2022-10-20T16:50:07Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention; in contrast to CNNs, they encode no prior knowledge of local connectivity.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
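The patch-based tokenization that replaces convolutions can be sketched minimally as follows. This is an illustrative NumPy version of the non-overlapping patch split a ViT performs before self-attention; real implementations additionally apply a learned linear projection and positional embeddings to each patch.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an H x W x C image into non-overlapping flattened
    patches: the token sequence a ViT attends over, with no
    convolutional prior on local connectivity."""
    h, w, c = image.shape
    ph, pw = h // patch_size, w // patch_size
    patches = image[:ph * patch_size, :pw * patch_size].reshape(
        ph, patch_size, pw, patch_size, c)
    # Reorder so each row holds one patch of patch_size*patch_size*c values.
    return patches.transpose(0, 2, 1, 3, 4).reshape(ph * pw, -1)
```

A 224x224x3 input yields 196 tokens of dimension 768, the standard ViT-Base configuration.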
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to solve a low-resource, real-world challenge: de-identifying code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- Two-Stage Mesh Deep Learning for Automated Tooth Segmentation and Landmark Localization on 3D Intraoral Scans [56.55092443401416]
iMeshSegNet in the first stage of TS-MDL reached an average Dice similarity coefficient (DSC) of $0.953 \pm 0.076$, significantly outperforming the original MeshSegNet.
PointNet-Reg achieved a mean absolute error (MAE) of $0.623 \pm 0.718$ mm in distances between prediction and ground truth for 44 landmarks, which is superior compared with other networks for landmark detection.
arXiv Detail & Related papers (2021-09-24T13:00:26Z)
- CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks, including named entity recognition, information extraction, clinical diagnosis normalization, and single-sentence/sentence-pair classification.
We report empirical results with 11 current pre-trained Chinese models; the experiments show that state-of-the-art neural models still perform far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- An Adaptive Enhancement Based Hybrid CNN Model for Digital Dental X-ray Positions Classification [1.0672152844970149]
A novel solution based on adaptive histogram equalization and a convolutional neural network (CNN) is proposed.
The accuracy and specificity of the test set exceeded 90%, and the AUC reached 0.97.
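The enhancement half of that pipeline can be illustrated with classical global histogram equalization, a simplified stand-in for the adaptive variant the paper uses; both rest on the same principle of remapping intensities so their cumulative distribution becomes roughly uniform.

```python
import numpy as np

def equalize_histogram(gray):
    """Global histogram equalization of an 8-bit grayscale image:
    build the intensity CDF, normalize it to [0, 255], and use it
    as a lookup table that stretches low-contrast regions."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1)
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[gray]
```

The adaptive variant applies the same remapping per local tile rather than globally, which preserves contrast in both the bright enamel and the darker soft-tissue regions of a radiograph.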
arXiv Detail & Related papers (2020-05-01T13:55:44Z)
- Individual Tooth Detection and Identification from Dental Panoramic X-Ray Images via Point-wise Localization and Distance Regularization [10.877276642014515]
The proposed network initially performs center point regression for all the anatomical teeth, which automatically identifies each tooth.
Teeth boxes are individually localized using a cascaded neural network on a patch basis.
The experimental results demonstrate that the proposed algorithm outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-04-12T04:14:14Z)
- Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours [0.6853165736531939]
This paper presents a novel approach to automatic, real-time tongue contour tracking using deep neural networks.
In the proposed method, instead of the two-step procedure, landmarks of the tongue surface are tracked.
Our experiments demonstrated the outstanding performance of the proposed technique in terms of generalization, performance, and accuracy.
arXiv Detail & Related papers (2020-03-16T00:38:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.