Comparing representations of biological data learned with different AI
paradigms, augmenting and cropping strategies
- URL: http://arxiv.org/abs/2203.04107v1
- Date: Mon, 7 Mar 2022 14:34:42 GMT
- Title: Comparing representations of biological data learned with different AI
paradigms, augmenting and cropping strategies
- Authors: Andrei Dmitrenko, Mauro M. Masiero and Nicola Zamboni
- Abstract summary: We train 16 deep learning setups on the 770k cancer cell images dataset under identical conditions.
We compare the learned representations by evaluating multiple metrics for each of three downstream tasks.
Self-supervised (implicit contrastive learning) models showed competitive performance while being up to 11 times faster to train.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in computer vision and robotics enabled automated large-scale
biological image analysis. Various machine learning approaches have been
successfully applied to phenotypic profiling. However, it remains unclear how
they compare in terms of biological feature extraction. In this study, we
propose a simple CNN architecture and implement 4 different representation
learning approaches. We train 16 deep learning setups on the 770k cancer cell
images dataset under identical conditions, using different augmenting and
cropping strategies. We compare the learned representations by evaluating
multiple metrics for each of three downstream tasks: i) distance-based
similarity analysis of known drugs, ii) classification of drugs versus
controls, iii) clustering within cell lines. We also compare training times and
memory usage. Among all tested setups, multi-crops and random augmentations
generally improved performance across tasks, as expected. Strikingly,
self-supervised (implicit contrastive learning) models showed competitive
performance while being up to 11 times faster to train. Self-supervised regularized
learning required the most memory and computation to deliver arguably the
most informative features. We observe that no single combination of augmenting
and cropping strategies consistently results in top performance across tasks
and recommend prospective research directions.
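
As an illustration of downstream task (i), distance-based similarity analysis, the following is a minimal sketch and not the authors' code. It assumes each image has already been mapped to a fixed-length embedding by some trained model, and it uses plain Euclidean distance, which is an assumption: the abstract does not name the distance metric. The drug names, the 4-dimensional embeddings, and the helper functions are all hypothetical toy stand-ins.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_pairwise_distance(group_a, group_b, same_group=False):
    """Average distance over all pairs drawn from two groups of embeddings.

    With same_group=True the two arguments are the same list, so
    self-pairs and mirrored pairs are skipped.
    """
    distances = []
    for i, a in enumerate(group_a):
        for j, b in enumerate(group_b):
            if same_group and j <= i:
                continue
            distances.append(euclidean(a, b))
    return sum(distances) / len(distances)


def toy_embeddings(center, n, spread, rng):
    """Hypothetical stand-in for model output: Gaussian points around a center."""
    return [[c + rng.gauss(0.0, spread) for c in center] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the toy data is reproducible
embeddings = {
    # Assumed scenario: drug_A and drug_B act similarly, controls differ.
    "drug_A": toy_embeddings([0.0, 0.0, 0.0, 0.0], 20, 0.1, rng),
    "drug_B": toy_embeddings([0.2, 0.0, 0.0, 0.0], 20, 0.1, rng),
    "control": toy_embeddings([3.0, 3.0, 3.0, 3.0], 20, 0.1, rng),
}

within_a = mean_pairwise_distance(
    embeddings["drug_A"], embeddings["drug_A"], same_group=True
)
a_vs_b = mean_pairwise_distance(embeddings["drug_A"], embeddings["drug_B"])
a_vs_control = mean_pairwise_distance(embeddings["drug_A"], embeddings["control"])

print(f"within drug_A:     {within_a:.3f}")
print(f"drug_A vs drug_B:  {a_vs_b:.3f}")
print(f"drug_A vs control: {a_vs_control:.3f}")
```

In a comparison like the one described above, a representation would be considered more useful for this task if known similar drugs end up closer to each other than to controls. How the paper aggregates such distances into its reported metrics is not stated in the abstract.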
Related papers
- Benchmark on Drug Target Interaction Modeling from a Structure Perspective [48.60648369785105]
Drug-target interaction prediction is crucial to drug discovery and design.
Recent methods, such as those based on graph neural networks (GNNs) and Transformers, demonstrate exceptional performance across various datasets.
We conduct a comprehensive survey and benchmark for drug-target interaction modeling from a structure perspective, via integrating tens of explicit (i.e., GNN-based) and implicit (i.e., Transformer-based) structure learning algorithms.
arXiv Detail & Related papers (2024-07-04T16:56:59Z)
- Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images [0.6491172192043603]
We propose a set-level consistency learning algorithm, Set-DINO, to improve learned representations of perturbation effects in single-cell images.
We conduct experiments on a large-scale Optical Pooled Screening dataset with more than 5000 genetic perturbations.
arXiv Detail & Related papers (2024-06-08T00:53:30Z)
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Domain Generalization for Mammographic Image Analysis with Contrastive Learning [62.25104935889111]
The training of an efficacious deep learning model requires large data with diverse styles and qualities.
A novel contrastive learning scheme is developed to equip deep learning models with better style generalization capability.
The proposed method has been evaluated extensively and rigorously with mammograms from various vendor style domains and several public datasets.
arXiv Detail & Related papers (2023-04-20T11:40:21Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- Self-supervised learning for analysis of temporal and morphological drug effects in cancer cell imaging data [0.0]
We train a convolutional autoencoder on a dataset of 1M images with random augmentations and multi-crops to use as a feature extractor.
We use distance-based analysis and dynamic time warping to cluster temporal patterns of 31 drugs.
We increase top-3 classification accuracy by 8% on average and mine examples of morphological feature importance maps.
arXiv Detail & Related papers (2022-03-07T14:48:13Z)
- An Analysis on Ensemble Learning optimized Medical Image Classification with Deep Convolutional Neural Networks [0.0]
We propose a reproducible medical image classification pipeline for analyzing the performance impact of ensemble learning techniques.
The pipeline consists of state-of-the-art preprocessing and image augmentation methods as well as 9 deep convolution neural network architectures.
Our results revealed that Stacking achieved the largest performance gain of up to 13% F1-score increase.
Augmenting showed consistent improvement capabilities by up to 4% and is also applicable to single model based pipelines.
arXiv Detail & Related papers (2022-01-27T10:56:11Z)
- Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR [2.578242050187029]
Recent breakthroughs in the field of semi-supervised learning have achieved results that match state-of-the-art traditional supervised learning methods.
SimCLR is the current state-of-the-art semi-supervised learning framework for computer vision.
arXiv Detail & Related papers (2021-08-02T01:37:39Z)
- Self supervised contrastive learning for digital histopathology [0.0]
We use a contrastive self-supervised learning method called SimCLR that achieved state-of-the-art results on natural-scene images.
We find that combining multiple multi-organ datasets with different types of staining and resolution properties improves the quality of the learned features.
Linear classifiers trained on top of the learned features show that networks pretrained on digital histopathology datasets perform better than ImageNet pretrained networks.
arXiv Detail & Related papers (2020-11-27T19:18:45Z)