Related papers: Quantifying the Effect of Image Similarity on Diabetic Foot Ulcer Classification

Quantifying the Effect of Image Similarity on Diabetic Foot Ulcer Classification

URL: http://arxiv.org/abs/2304.12987v1
Date: Tue, 25 Apr 2023 16:54:27 GMT
Title: Quantifying the Effect of Image Similarity on Diabetic Foot Ulcer Classification
Authors: Imran Chowdhury Dipto, Bill Cassidy, Connah Kendrick, Neil D. Reeves, Joseph M. Pappachan, Vishnu Chandrabalan, Moi Hoon Yap
Abstract summary: This research conducts an investigation on the effect of visually similar images within a publicly available diabetic foot ulcer dataset when training deep learning classification networks. The presence of binary-identical duplicate images in datasets used to train deep learning algorithms can introduce unwanted bias which can degrade network performance. We use an open-source fuzzy algorithm to identify groups of increasingly similar images in the Diabetic Foot Ulcers Challenge 2021 training dataset.
Score: 4.318783737552881
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This research conducts an investigation on the effect of visually similar images within a publicly available diabetic foot ulcer dataset when training deep learning classification networks. The presence of binary-identical duplicate images in datasets used to train deep learning algorithms is a well known issue that can introduce unwanted bias which can degrade network performance. However, the effect of visually similar non-identical images is an under-researched topic, and has so far not been investigated in any diabetic foot ulcer studies. We use an open-source fuzzy algorithm to identify groups of increasingly similar images in the Diabetic Foot Ulcers Challenge 2021 (DFUC2021) training dataset. Based on each similarity threshold, we create new training sets that we use to train a range of deep learning multi-class classifiers. We then evaluate the performance of the best performing model on the DFUC2021 test set. Our findings show that the model trained on the training set with the 80\% similarity threshold images removed achieved the best performance using the InceptionResNetV2 network. This model showed improvements in F1-score, precision, and recall of 0.023, 0.029, and 0.013, respectively. These results indicate that highly similar images can contribute towards the presence of performance degrading bias within the Diabetic Foot Ulcers Challenge 2021 dataset, and that the removal of images that are 80\% similar from the training set can help to boost classification performance.

Related papers

Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples. We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use trained GANs as one of the augmentations approaches in multi-class classification problems. The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
Transfer-Ensemble Learning based Deep Convolutional Neural Networks for Diabetic Retinopathy Classification [0.7614628596146599]
This article aims to classify diabetic retinopathy (DR) disease into five different classes using an ensemble approach based on two popular pre-trained convolutional neural networks: VGG16 and Inception V3. Experimental results on the test set demonstrate the efficacy of the proposed ensemble model for DR classification achieving an accuracy of 96.4%.
arXiv Detail & Related papers (2023-08-01T13:07:39Z)
Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data. Data augmentation is a typical methodology used in machine learning when confronted with a limited data set. In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
An Ensemble Method to Automatically Grade Diabetic Retinopathy with Optical Coherence Tomography Angiography Images [4.640835690336653]
We propose an ensemble method to automatically grade Diabetic retinopathy (DR) images available from Diabetic Retinopathy Analysis Challenge (DRAC) 2022. First, we adopt the state-of-the-art classification networks, and train them to grade UW- OCTA images with different splits of the available dataset. Ultimately, we obtain 25 models, of which, the top 16 models are selected and ensembled to generate the final predictions.
arXiv Detail & Related papers (2022-12-12T22:06:47Z)
Development of an algorithm for medical image segmentation of bone tissue in interaction with metallic implants [58.720142291102135]
This study develops an algorithm for calculating bone growth in contact with metallic implants. Bone and implant tissue were manually segmented in the training data set. In terms of network accuracy, the model reached around 98%.
arXiv Detail & Related papers (2022-04-22T08:17:20Z)
Boosting EfficientNets Ensemble Performance via Pseudo-Labels and Synthetic Images by pix2pixHD for Infection and Ischaemia Classification in Diabetic Foot Ulcers [2.191505742658975]
This research investigates an approach on classification of infection and ischaemia conducted as part of the Diabetic Foot Ulcer Challenge (DFUC) 2021. The resulting extended training dataset features $8.68$ times the size of the baseline. Performances of models and ensembles trained on the baseline and extended training dataset are compared.
arXiv Detail & Related papers (2021-11-30T19:42:06Z)
FEDI: Few-shot learning based on Earth Mover's Distance algorithm combined with deep residual network to identify diabetic retinopathy [3.6623193507510012]
This paper proposes a few-shot learning model of a deep residual network based on Earth Mover's algorithm to assist in diagnosing diabetic retinopathy. We build training and validation classification tasks for few-shot learning based on 39 categories of 1000 sample data, train deep residual networks, and obtain experience pre-training models. Based on the weights of the pre-trained model, the Earth Mover's Distance algorithm calculates the distance between the images, obtains the similarity between the images, and changes the model's parameters to improve the accuracy of the training model.
arXiv Detail & Related papers (2021-08-22T13:05:02Z)
An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation [80.02124918255059]
Semi-supervised learning aims to boost the accuracy of a model by exploring unlabeled images. We learn two networks to mutually teach each other. The more reliable predictions on easy images in each network are used to teach the other network to learn about the corresponding hard images.
arXiv Detail & Related papers (2020-11-25T03:29:52Z)
Deep Learning in Diabetic Foot Ulcers Detection: A Comprehensive Evaluation [14.227261503586599]
This paper summarises the results of DFUC 2020 by comparing the deep learning-based algorithms proposed by the winning teams. The best performance was obtained from Deformable Convolution, a variant of Faster R-CNN, with a mean average precision (mAP) of 0.6940 and an F1-Score of 0.7434.
arXiv Detail & Related papers (2020-10-07T11:31:27Z)
Background Splitting: Finding Rare Classes in a Sea of Background [55.03789745276442]
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories. In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background) We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
arXiv Detail & Related papers (2020-08-28T23:05:15Z)
Multi-label Thoracic Disease Image Classification with Cross-Attention Networks [65.37531731899837]
We propose a novel scheme of Cross-Attention Networks (CAN) for automated thoracic disease classification from chest x-ray images. We also design a new loss function that beyond cross-entropy loss to help cross-attention process and is able to overcome the imbalance between classes and easy-dominated samples within each class.
arXiv Detail & Related papers (2020-07-21T14:37:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.