Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition
- URL: http://arxiv.org/abs/2408.16451v1
- Date: Thu, 29 Aug 2024 11:31:28 GMT
- Title: Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition
- Authors: Yongcun Zhang, Jiajun Xu, Yina He, Shaozi Li, Zhiming Luo, Huangwei Lei
- Abstract summary: Tongue diagnosis in Traditional Chinese Medicine (TCM) is a crucial diagnostic method that can reflect an individual's health status.
Traditional methods for identifying tooth-marked tongues are subjective and inconsistent because they rely on practitioner experience.
We propose WSVM, a novel fully automated Weakly Supervised method using a Vision transformer and Multiple instance learning, for tongue extraction and tooth-marked tongue recognition.
- Score: 19.34036038278796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tongue diagnosis in Traditional Chinese Medicine (TCM) is a crucial diagnostic method that can reflect an individual's health status. Traditional methods for identifying tooth-marked tongues are subjective and inconsistent because they rely on practitioner experience. We propose WSVM, a novel fully automated Weakly Supervised method using a Vision transformer and Multiple instance learning, for tongue extraction and tooth-marked tongue recognition. Our approach first accurately detects and extracts the tongue region from clinical images, removing any irrelevant background information. Then, we implement an end-to-end weakly supervised object detection method. We utilize a Vision Transformer (ViT) to process tongue images in patches and employ a multiple instance loss to identify tooth-marked regions with only image-level annotations. WSVM achieves high accuracy in tooth-marked tongue classification, and visualization experiments demonstrate its effectiveness in pinpointing these regions. This automated approach enhances the objectivity and accuracy of tooth-marked tongue diagnosis. It provides significant clinical value by assisting TCM practitioners in making precise diagnoses and treatment recommendations. Code is available at https://github.com/yc-zh/WSVM.
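The core weakly supervised idea described above, scoring each ViT patch and aggregating the patch scores into one image-level prediction so that only image-level labels are needed, can be sketched as follows. This is a minimal NumPy illustration of multiple instance learning with max pooling; the paper's actual architecture and loss are in the linked repository, and the function names here are illustrative, not taken from that code.

```python
import numpy as np

def mil_image_score(patch_scores):
    """Aggregate per-patch tooth-mark probabilities into one
    image-level score via max pooling: the image is positive
    if at least one patch instance is positive."""
    return float(np.max(patch_scores))

def mil_loss(patch_scores, image_label):
    """Binary cross-entropy on the aggregated score, so training
    needs only the image-level annotation (weak supervision);
    at inference, high-scoring patches localize the tooth marks."""
    p = np.clip(mil_image_score(patch_scores), 1e-7, 1 - 1e-7)
    return -(image_label * np.log(p) + (1 - image_label) * np.log(1 - p))
```

A positive image with one confident patch yields a small loss, while a negative image is penalized for any high-scoring patch; that asymmetry is how patch-level localization emerges from image-level labels alone.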
Related papers
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Ammonia-Net: A Multi-task Joint Learning Model for Multi-class Segmentation and Classification in Tooth-marked Tongue Diagnosis [12.095100353695038]
In Traditional Chinese Medicine, the tooth marks on the tongue serve as a crucial indicator for assessing qi (yang) deficiency.
To address these problems, we propose a multi-task joint learning model named Ammonia-Net.
Ammonia-Net performs semantic segmentation of tongue images to identify tongue and tooth marks.
arXiv Detail & Related papers (2023-10-05T11:28:32Z)
- Self-Supervised Learning with Masked Image Modeling for Teeth Numbering, Detection of Dental Restorations, and Instance Segmentation in Dental Panoramic Radiographs [8.397847537464534]
This study aims to utilize recent self-supervised learning methods like SimMIM and UM-MAE to increase the model efficiency and understanding of the limited number of dental radiographs.
To the best of our knowledge, this is the first study that applied self-supervised learning methods to Swin Transformer on dental panoramic radiographs.
arXiv Detail & Related papers (2022-10-20T16:50:07Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention; in contrast to CNNs, they encode no prior knowledge of local connectivity.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
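The patch-based tokenization that replaces convolutions can be sketched minimally as follows. This is an illustrative NumPy version of the non-overlapping patch split a ViT performs before self-attention; real implementations additionally apply a learned linear projection and positional embeddings to each patch.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an H x W x C image into non-overlapping flattened
    patches: the token sequence a ViT attends over, with no
    convolutional prior on local connectivity."""
    h, w, c = image.shape
    ph, pw = h // patch_size, w // patch_size
    patches = image[:ph * patch_size, :pw * patch_size].reshape(
        ph, patch_size, pw, patch_size, c)
    # Reorder so each row holds one patch of patch_size*patch_size*c values.
    return patches.transpose(0, 2, 1, 3, 4).reshape(ph * pw, -1)
```

A 224x224x3 input yields 196 tokens of dimension 768, the standard ViT-Base configuration.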
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to solve a low-resource, real-world challenge: de-identifying code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- Two-Stage Mesh Deep Learning for Automated Tooth Segmentation and Landmark Localization on 3D Intraoral Scans [56.55092443401416]
iMeshSegNet in the first stage of TS-MDL reached an average Dice similarity coefficient (DSC) of $0.953 \pm 0.076$, significantly outperforming the original MeshSegNet.
PointNet-Reg achieved a mean absolute error (MAE) of $0.623 \pm 0.718$ mm in distances between prediction and ground truth for 44 landmarks, which is superior compared with other networks for landmark detection.
arXiv Detail & Related papers (2021-09-24T13:00:26Z)
- CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks, including named entity recognition, information extraction, clinical diagnosis normalization, and single-sentence/sentence-pair classification.
We report empirical results with 11 current pre-trained Chinese models; the experiments show that state-of-the-art neural models still perform far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- An Adaptive Enhancement Based Hybrid CNN Model for Digital Dental X-ray Positions Classification [1.0672152844970149]
A novel solution based on adaptive histogram equalization and a convolutional neural network (CNN) is proposed.
The accuracy and specificity of the test set exceeded 90%, and the AUC reached 0.97.
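The enhancement half of that pipeline can be illustrated with classical global histogram equalization, a simplified stand-in for the adaptive variant the paper uses; both rest on the same principle of remapping intensities so their cumulative distribution becomes roughly uniform.

```python
import numpy as np

def equalize_histogram(gray):
    """Global histogram equalization of an 8-bit grayscale image:
    build the intensity CDF, normalize it to [0, 255], and use it
    as a lookup table that stretches low-contrast regions."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1)
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[gray]
```

The adaptive variant applies the same remapping per local tile rather than globally, which preserves contrast in both the bright enamel and the darker soft-tissue regions of a radiograph.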
arXiv Detail & Related papers (2020-05-01T13:55:44Z)
- Individual Tooth Detection and Identification from Dental Panoramic X-Ray Images via Point-wise Localization and Distance Regularization [10.877276642014515]
The proposed network initially performs center point regression for all the anatomical teeth, which automatically identifies each tooth.
Teeth boxes are individually localized using a cascaded neural network on a patch basis.
The experimental results demonstrate that the proposed algorithm outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-04-12T04:14:14Z)
- Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours [0.6853165736531939]
This paper presents a novel approach to automatic, real-time tongue contour tracking using deep neural networks.
In the proposed method, instead of the two-step procedure, landmarks of the tongue surface are tracked.
Our experiments demonstrated the outstanding performance of the proposed technique in terms of generalization, performance, and accuracy.
arXiv Detail & Related papers (2020-03-16T00:38:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.