Reducing self-supervised learning complexity improves weakly-supervised
classification performance in computational pathology
- URL: http://arxiv.org/abs/2403.04558v2
- Date: Tue, 12 Mar 2024 11:42:06 GMT
- Authors: Tim Lenz, Omar S. M. El Nahhas, Marta Ligero, Jakob Nikolas Kather
- Abstract summary: Self-supervised learning (SSL) methods allow for large-scale analyses on non-annotated data.
We investigated the complexity of SSL in relation to classification performance with the utilization of consumer-grade hardware.
Our experiments demonstrate that we can improve downstream classification performance whilst reducing SSL training duration by 90%.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning models have been successfully utilized to extract clinically
actionable insights from routinely available histology data. Generally, these
models require annotations performed by clinicians, which are scarce and costly
to generate. The emergence of self-supervised learning (SSL) methods removes
this barrier, allowing for large-scale analyses on non-annotated data. However,
recent SSL approaches apply increasingly expansive model architectures and
larger datasets, causing the rapid escalation of data volumes, hardware
prerequisites, and overall expenses, limiting access to these resources to a few
institutions. Therefore, we investigated the complexity of contrastive SSL in
computational pathology in relation to classification performance with the
utilization of consumer-grade hardware. Specifically, we analyzed the effects
of adaptations in data volume, architecture, and algorithms on downstream
classification tasks, emphasizing their impact on computational resources. We
trained breast cancer foundation models on a large public patient cohort and
validated them on various downstream classification tasks in a weakly
supervised manner on two external public patient cohorts. Our experiments
demonstrate that we can improve downstream classification performance whilst
reducing SSL training duration by 90%. In summary, we propose a set of
adaptations which enable the utilization of SSL in computational pathology in
non-resource abundant environments.
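The contrastive SSL objective referenced in the abstract can be illustrated with a minimal NumPy sketch of a SimCLR-style NT-Xent loss. This is a generic illustration of contrastive learning, not the authors' actual training code; the function name and toy dimensions are assumptions for the example.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss,
    the contrastive objective used by SimCLR-style SSL methods.

    z1, z2: (batch, dim) embeddings of two augmented views of the same
    images; matching rows are positive pairs, all other rows negatives.
    """
    z = np.concatenate([z1, z2], axis=0)               # (2N, dim)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize
    sim = (z @ z.T) / temperature                      # scaled cosine sims
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    # Row i's positive is the other augmented view of the same image.
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Numerically stable log-softmax over each row of similarities.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - (row_max +
                      np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True)))
    return -log_prob[np.arange(2 * n), pos_idx].mean()
```

Reducing SSL complexity, as the paper investigates, amounts to shrinking the batch size, backbone, or number of training steps fed into an objective of this kind while monitoring downstream classification performance.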
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning [5.438725298163702]
Contrastive Self-Supervised Learning (SSL) offers a potential solution to labeled data scarcity.
We propose uncovering the optimal augmentations for applying contrastive learning in 1D phonocardiogram (PCG) classification.
We demonstrate that depending on its training distribution, the effectiveness of a fully-supervised model can degrade up to 32%, while SSL models only lose up to 10% or even improve in some cases.
arXiv Detail & Related papers (2023-12-01T11:06:00Z)
- Improving Representation Learning for Histopathologic Images with Cluster Constraints [31.426157660880673]
Self-supervised learning (SSL) pretraining strategies are emerging as a viable alternative.
We introduce an SSL framework for transferable representation learning and semantically meaningful clustering.
Our approach outperforms common SSL methods in downstream classification and clustering tasks.
arXiv Detail & Related papers (2023-10-18T21:20:44Z)
- Benchmarking Self-Supervised Learning on Diverse Pathology Datasets [10.868779327544688]
Self-supervised learning has been shown to be an effective method for utilizing unlabeled data.
We execute the largest-scale study of SSL pre-training on pathology image data.
For the first time, we apply SSL to the challenging task of nuclei instance segmentation.
arXiv Detail & Related papers (2022-12-09T06:38:34Z)
- Dive into Self-Supervised Learning for Medical Image Analysis: Data, Models and Tasks [8.720079280914169]
Self-supervised learning has achieved remarkable performance in various medical imaging tasks by dint of priors from massive unlabelled data.
We focus on exploiting the capacity of SSL in terms of four realistic and significant issues.
We provide a large-scale, in-depth and fine-grained study through extensive experiments on predictive, contrastive, generative and multi-SSL algorithms.
arXiv Detail & Related papers (2022-09-25T06:04:11Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and largely unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
- Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem.
We propose an innovative inconsistency-based virtual adversarial algorithm to further investigate SSL-AL's potential superiority.
Two real-world case studies visualize the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv Detail & Related papers (2022-06-07T13:28:43Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
- Data Efficient and Weakly Supervised Computational Pathology on Whole Slide Images [4.001273534300757]
Computational pathology has the potential to enable objective diagnosis, therapeutic response prediction and identification of new morphological features of clinical relevance.
Deep learning-based computational pathology approaches either require manual annotation of gigapixel whole slide images (WSIs) in fully-supervised settings or thousands of WSIs with slide-level labels in a weakly-supervised setting.
Here we present CLAM - Clustering-constrained attention multiple instance learning.
arXiv Detail & Related papers (2020-04-20T23:00:13Z)
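Several of the entries above, including the main paper's downstream evaluation and CLAM, rely on attention-based multiple-instance learning to turn patch-level features into a slide-level prediction under weak (slide-level) supervision. The following is a minimal NumPy sketch of gated-attention MIL pooling in that spirit; the function name, toy weights, and dimensions are illustrative assumptions, not CLAM's actual implementation.

```python
import numpy as np

def attention_mil_score(patches, V, U, w, clf_w, clf_b):
    """Gated-attention multiple-instance pooling: each patch gets a
    learned attention weight, the weighted sum gives one slide-level
    feature, and a linear head produces the slide-level logit.

    patches: (n_patches, d) feature vectors for one whole-slide image.
    V, U:    (d, h) attention projection matrices (gated branch pair).
    w:       (h,) attention scoring vector.
    clf_w, clf_b: linear classifier weights and bias.
    """
    gate = 1.0 / (1.0 + np.exp(-(patches @ U)))   # sigmoid gating branch
    a = np.tanh(patches @ V) * gate               # gated attention features
    scores = a @ w                                # raw per-patch scores
    attn = np.exp(scores - scores.max())
    attn = attn / attn.sum()                      # softmax over patches
    slide_feat = attn @ patches                   # attention-weighted pooling
    logit = slide_feat @ clf_w + clf_b            # slide-level prediction
    return logit, attn
```

Because only the slide-level label supervises training, the attention weights double as an interpretable map of which patches drove the prediction.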
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.