Related papers: Outlier Detection using Self-Organizing Maps for Automated Blood Cell Analysis

Outlier Detection using Self-Organizing Maps for Automated Blood Cell Analysis

URL: http://arxiv.org/abs/2208.08834v1
Date: Thu, 18 Aug 2022 13:53:27 GMT
Title: Outlier Detection using Self-Organizing Maps for Automated Blood Cell Analysis
Authors: Stefan R\"ohrl, Alice Hein, Lucie Huang, Dominik Heim, Christian Klenk, Manuel Lengl, Martin Knopp, Nawal Hafez, Oliver Hayden, Klaus Diepold
Abstract summary: In this work, we assess the suitability of Self-Organizing Maps for outlier detection on a medical dataset. We detect and evaluate outliers based on quantization errors and distance maps. Self-Organizing Maps perform on par with a manually specified filter based on expert domain knowledge.
Score: 1.2189422792863451
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The quality of datasets plays a crucial role in the successful training and deployment of deep learning models. Especially in the medical field, where system performance may impact the health of patients, clean datasets are a safety requirement for reliable predictions. Therefore, outlier detection is an essential process when building autonomous clinical decision systems. In this work, we assess the suitability of Self-Organizing Maps for outlier detection specifically on a medical dataset containing quantitative phase images of white blood cells. We detect and evaluate outliers based on quantization errors and distance maps. Our findings confirm the suitability of Self-Organizing Maps for unsupervised Out-Of-Distribution detection on the dataset at hand. Self-Organizing Maps perform on par with a manually specified filter based on expert domain knowledge. Additionally, they show promise as a tool in the exploration and cleaning of medical datasets. As a direction for future research, we suggest a combination of Self-Organizing Maps and feature extraction based on deep learning.

Related papers

Detecting Dataset Bias in Medical AI: A Generalized and Modality-Agnostic Auditing Framework [8.520644988801243]
latent bias in machine learning datasets can be amplified during training and/or hidden during testing. We present a data modality-agnostic auditing framework for generating targeted hypotheses about sources of bias. We demonstrate the broad applicability and value of our method by analyzing large-scale medical datasets.
arXiv Detail & Related papers (2025-03-13T02:16:48Z)
Cross-platform Prediction of Depression Treatment Outcome Using Location Sensory Data on Smartphones [55.992010576087424]
We explore using location sensory data collected passively on smartphones to predict treatment outcome. Our results show that using location features and baseline self-reported questionnaire score can lead to F1 score up to 0.67, comparable to that obtained using periodic self-reported questionnaires.
arXiv Detail & Related papers (2025-03-10T22:00:07Z)
A Comprehensive Dataset and Automated Pipeline for Nailfold Capillary Analysis [24.8934927577986]
We present a pioneering effort in constructing a comprehensive nailfold capillary dataset-321 images, 219 videos from 68 subjects, with clinic reports and expert annotations. We finetuned three deep learning models with expert annotations as supervised labels and integrated them into a novel end-to-end nailfold capillary analysis pipeline. Experiment results showed that our automated pipeline achieves an average of sub-pixel level precision in measurements and 89.9% accuracy in identifying morphological abnormalities.
arXiv Detail & Related papers (2023-12-10T16:33:41Z)
Towards Unifying Anatomy Segmentation: Automated Generation of a Full-body CT Dataset via Knowledge Aggregation and Anatomical Guidelines [113.08940153125616]
We generate a dataset of whole-body CT scans with $142$ voxel-level labels for 533 volumes providing comprehensive anatomical coverage. Our proposed procedure does not rely on manual annotation during the label aggregation stage. We release our trained unified anatomical segmentation model capable of predicting $142$ anatomical structures on CT data.
arXiv Detail & Related papers (2023-07-25T09:48:13Z)
Clinically Acceptable Segmentation of Organs at Risk in Cervical Cancer Radiation Treatment from Clinically Available Annotations [0.0]
We present an approach to learn a deep learning model for the automatic segmentation of Organs at Risk (OARs) in cervical cancer radiation treatment. We employ simples for automatic data cleaning to minimize data inhomogeneity, label noise, and missing annotations. We develop a semi-supervised learning approach utilizing a teacher-student setup, annotation imputation, and uncertainty-guided training to learn in presence of missing annotations.
arXiv Detail & Related papers (2023-02-21T13:24:40Z)
Unsupervised Cross-Domain Feature Extraction for Single Blood Cell Image Classification [37.90158669637884]
Autoencoder is based on an R-CNN architecture allowing it to focus on the relevant white blood cell and eliminate artifacts in the image. We show that thanks to the rich features extracted by the autoencoder trained on only one of the datasets, the random forest classifier performs satisfactorily on the unseen datasets. Our results suggest the possibility of employing this unsupervised approach in more complicated diagnosis and prognosis tasks without the need to add expensive expert labels to unseen data.
arXiv Detail & Related papers (2022-07-01T15:44:42Z)
Self-Supervised Learning as a Means To Reduce the Need for Labeled Data in Medical Image Analysis [64.4093648042484]
We use a dataset of chest X-ray images with bounding box labels for 13 different classes of anomalies. We show that it is possible to achieve similar performance to a fully supervised model in terms of mean average precision and accuracy with only 60% of the labeled data.
arXiv Detail & Related papers (2022-06-01T09:20:30Z)
Chest x-ray automated triage: a semiologic approach designed for clinical implementation, exploiting different types of labels through a combination of four Deep Learning architectures [83.48996461770017]
This work presents a Deep Learning method based on the late fusion of different convolutional architectures. We built four training datasets combining images from public chest x-ray datasets and our institutional archive. We trained four different Deep Learning architectures and combined their outputs with a late fusion strategy, obtaining a unified tool.
arXiv Detail & Related papers (2020-12-23T14:38:35Z)
Semi-Automatic Data Annotation guided by Feature Space Projection [117.9296191012968]
We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation. We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities. Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.
arXiv Detail & Related papers (2020-07-27T17:03:50Z)
Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values. The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
An Extensive Study on Cross-Dataset Bias and Evaluation Metrics Interpretation for Machine Learning applied to Gastrointestinal Tract Abnormality Classification [2.985964157078619]
Automatic analysis of diseases in the GI tract is a hot topic in computer science and medical-related journals. A clear understanding of evaluation metrics and machine learning models with cross datasets is crucial to bring research in the field to a new quality level. We present comprehensive evaluations of five distinct machine learning models that can classify 16 different GI tract conditions.
arXiv Detail & Related papers (2020-05-08T08:59:31Z)
Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios. Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.