Cluster-based human-in-the-loop strategy for improving machine learning-based circulating tumor cell detection in liquid biopsy
- URL: http://arxiv.org/abs/2411.16332v1
- Date: Mon, 25 Nov 2024 12:26:48 GMT
- Title: Cluster-based human-in-the-loop strategy for improving machine learning-based circulating tumor cell detection in liquid biopsy
- Authors: Hümeyra Husseini-Wüsthoff, Sabine Riethdorf, Andreas Schneeweiss, Andreas Trumpp, Klaus Pantel, Harriet Wikman, Maximilian Nielsen, René Werner
- Abstract summary: This study introduces a human-in-the-loop (HiL) strategy for improving machine learning-based CTC detection.
We combine self-supervised deep learning and a conventional ML-based classifier and propose iterative targeted sampling and labeling of new unlabeled training samples by human experts.
The advantages of the proposed approach are demonstrated for liquid biopsy data from patients with metastatic breast cancer.
- Abstract: Detection and differentiation of circulating tumor cells (CTCs) and non-CTCs in blood draws of cancer patients pose multiple challenges. While the gold standard relies on tedious manual evaluation of an automatically generated selection of images, machine learning (ML) techniques offer the potential to automate these processes. However, human assessment remains indispensable when the ML system arrives at uncertain or wrong decisions due to an insufficient set of labeled training data. This study introduces a human-in-the-loop (HiL) strategy for improving ML-based CTC detection. We combine self-supervised deep learning and a conventional ML-based classifier and propose iterative targeted sampling and labeling of new unlabeled training samples by human experts. The sampling strategy is based on the classification performance of local latent space clusters. The advantages of the proposed approach compared to naive random sampling are demonstrated for liquid biopsy data from patients with metastatic breast cancer.
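The cluster-based sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact sampling rule, so everything below is an assumption made for illustration: cluster assignments are taken as given (for example, from clustering self-supervised embeddings), clusters are ranked by classifier accuracy on already labeled samples, and unlabeled samples are drawn from the worst-performing clusters first. The function names `cluster_accuracy` and `select_for_labeling` are hypothetical.

```python
import random
from collections import defaultdict


def cluster_accuracy(cluster_ids, y_true, y_pred):
    """Return {cluster_id: accuracy} computed over labeled samples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for c, t, p in zip(cluster_ids, y_true, y_pred):
        totals[c] += 1
        hits[c] += int(t == p)
    return {c: hits[c] / totals[c] for c in totals}


def select_for_labeling(unlabeled_clusters, accuracy, budget, seed=0):
    """Pick up to `budget` unlabeled sample indices, worst clusters first.

    Assumption: clusters with no labeled samples get accuracy 0.0,
    so they are visited before any cluster with known accuracy.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, c in enumerate(unlabeled_clusters):
        by_cluster[c].append(idx)
    # Rank clusters by ascending accuracy; ties are broken by cluster id.
    ranked = sorted(by_cluster, key=lambda c: (accuracy.get(c, 0.0), c))
    chosen = []
    for c in ranked:
        members = by_cluster[c]
        rng.shuffle(members)  # random order within a cluster
        chosen.extend(members[: budget - len(chosen)])
        if len(chosen) >= budget:
            break
    return chosen


# Toy data: cluster 1 is where the classifier makes its mistakes.
val_clusters = [0, 0, 0, 1, 1, 1, 2, 2]
val_true = [1, 1, 0, 0, 0, 1, 1, 0]
val_pred = [1, 1, 0, 0, 1, 0, 1, 0]
acc = cluster_accuracy(val_clusters, val_true, val_pred)

pool_clusters = [0, 1, 2, 1, 0, 1, 2]  # cluster id of each unlabeled sample
picked = select_for_labeling(pool_clusters, acc, budget=3)
print(acc, sorted(picked))
```

In this toy run, all three selected samples come from cluster 1, the only cluster with misclassifications. In an iterative setting, the selected samples would be labeled by an expert, added to the training set, and the classifier retrained before the next round; random sampling from the whole pool would serve as the naive baseline.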
Related papers
- Renal Cell Carcinoma subtyping: learning from multi-resolution localization [1.5728609542259502]
This study investigates a novel self-supervised training strategy for machine learning diagnostic tools.
We aim to reduce the need for annotated datasets without significantly reducing the accuracy of the tool.
We demonstrate the classification capability of our tool on a whole slide imaging dataset for Renal Cancer subtyping, and we compare our solution with several state-of-the-art classification counterparts.
arXiv Detail & Related papers (2024-11-14T14:21:49Z)
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
- An interpretable machine learning system for colorectal cancer diagnosis from pathology slides [2.7968867060319735]
This study is conducted with one of the largest colorectal WSI datasets, comprising approximately 10,500 WSIs.
Our proposed method predicts, for the patch-based tiles, a class based on the severity of the dysplasia.
It is trained with an interpretable mixed-supervision scheme to leverage the domain knowledge introduced by pathologists.
arXiv Detail & Related papers (2023-01-06T17:10:32Z)
- Multi-Scale Hybrid Vision Transformer for Learning Gastric Histology: AI-Based Decision Support System for Gastric Cancer Treatment [50.89811515036067]
Gastric endoscopic screening is an effective way to decide appropriate gastric cancer (GC) treatment at an early stage, reducing GC-associated mortality rate.
We propose a practical AI system that enables five subclassifications of GC pathology, which can be directly matched to general GC treatment guidance.
arXiv Detail & Related papers (2022-02-17T08:33:52Z)
- Oral cancer detection and interpretation: Deep multiple instance learning versus conventional deep single instance learning [2.2612425542955292]
The current medical standard for diagnosing oral cancer (OC) is histological examination of a tissue sample from the oral cavity.
Introducing this approach into clinical routine is associated with challenges such as a lack of experts and labour-intensive work.
We are interested in AI-based methods that can reliably detect cancer given only per-patient labels.
arXiv Detail & Related papers (2022-02-03T15:04:26Z)
- Open-Set Recognition of Breast Cancer Treatments [91.3247063132127]
Open-set recognition generalizes a classification task by classifying test samples as one of the known classes from training or "unknown".
We apply an existing Gaussian mixture variational autoencoder model, which achieves state-of-the-art results for image datasets, to breast cancer patient data.
Not only do we obtain more accurate and robust classification results, with a 24.5% average F1 increase compared to a recent method, but we also reexamine open-set recognition in terms of deployability to a clinical setting.
arXiv Detail & Related papers (2022-01-09T04:35:55Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operating characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
- Ensemble of CNN classifiers using Sugeno Fuzzy Integral Technique for Cervical Cytology Image Classification [1.6986898305640261]
We propose a fully automated computer-aided diagnosis tool for classifying single-cell and slide images of cervical cancer.
We use the Sugeno Fuzzy Integral to ensemble the decision scores from three popular deep learning models, namely, Inception v3, DenseNet-161 and ResNet-34.
arXiv Detail & Related papers (2021-08-21T08:41:41Z)
- A Role for Prior Knowledge in Statistical Classification of the Transition from MCI to Alzheimer's Disease [0.0]
The transition from mild cognitive impairment (MCI) to Alzheimer's disease (AD) is of great interest to clinical researchers.
The growth of machine learning (ML) approaches for classification may falsely lead many clinical researchers to underestimate the value of logistic regression (LR).
We propose an alternative pre-selection technique that utilizes an efficient feature selection based on clinical knowledge of brain regions involved in AD.
arXiv Detail & Related papers (2020-11-28T18:15:24Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.