Self-supervised Representation Learning with Local Aggregation for Image-based Profiling
- URL: http://arxiv.org/abs/2506.14265v2
- Date: Mon, 27 Oct 2025 15:07:02 GMT
- Title: Self-supervised Representation Learning with Local Aggregation for Image-based Profiling
- Authors: Siran Dai, Qianqian Xu, Peisong Wen, Yang Liu, Qingming Huang
- Abstract summary: Image-based cell profiling aims to create informative representations of cell images. Recent developments in non-contrastive Self-Supervised Learning have inspired this paper. We introduce specialized data augmentation and representation post-processing methods tailored to cell images.
- Score: 84.52554180480037
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Image-based cell profiling aims to create informative representations of cell images. This technique is critical in drug discovery and has greatly advanced with recent improvements in computer vision. Inspired by recent developments in non-contrastive Self-Supervised Learning (SSL), this paper provides an initial exploration into training a generalizable feature extractor for cell images using such methods. However, there are two major challenges: 1) Unlike typical scenarios where each representation is based on a single image, cell profiling often involves multiple input images, making it difficult to effectively fuse all available information; and 2) There is a large difference between the distributions of cell images and natural images, causing the view-generation process in existing SSL methods to fail. To address these issues, we propose a self-supervised framework with local aggregation to improve cross-site consistency of cell representations. We introduce specialized data augmentation and representation post-processing methods tailored to cell images, which effectively address the issues mentioned above and result in a robust feature extractor. With these improvements, the proposed framework won the Cell Line Transferability challenge at CVPR 2025.
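The abstract describes two ingredients: fusing multiple input images into one profile, and enforcing cross-site consistency of the resulting representations. A minimal sketch of that objective is below; the function names (`aggregate_site`, `cross_site_consistency_loss`), the mean-pooling fusion, and the squared-distance loss are illustrative assumptions, not the paper's actual method, which uses local aggregation and a trained feature extractor.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit sphere, guarding against zero norms."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def aggregate_site(embeddings):
    """Fuse the per-image embeddings from one imaging site into a single
    profile vector. Mean pooling is a simple stand-in for the paper's
    local-aggregation step."""
    return l2_normalize(np.mean(embeddings, axis=0))

def cross_site_consistency_loss(site_a, site_b):
    """Non-contrastive consistency: penalize the squared distance between
    the aggregated profiles of the same sample imaged at two sites."""
    za, zb = aggregate_site(site_a), aggregate_site(site_b)
    return float(np.sum((za - zb) ** 2))  # 0 when the profiles coincide

# Toy example: two "sites" imaging the same sample with slight noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 16))                     # 5 images, 16-d embeddings
noisy = base + 0.01 * rng.normal(size=base.shape)   # second-site re-acquisition
loss = cross_site_consistency_loss(base, noisy)
```

Minimizing such a loss over a shared encoder (with appropriate stop-gradient or predictor tricks, as in non-contrastive SSL) would push the two sites' aggregated profiles together, which is the cross-site consistency the abstract refers to.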
Related papers
- MaskedCLIP: Bridging the Masked and CLIP Space for Semi-Supervised Medical Vision-Language Pre-training [27.35164449801058]
State-of-the-art methods leverage either paired image-text data via vision-language pre-training or unpaired image data via self-supervised pre-training to learn foundation models. We propose MaskedCLIP, a synergistic masked image modeling and contrastive language-image pre-training framework.
arXiv Detail & Related papers (2025-07-23T06:15:54Z) - MIRAM: Masked Image Reconstruction Across Multiple Scales for Breast Lesion Risk Prediction [2.0199924721373392]
Masked image modeling (MIM) has emerged as a more potent SSL technique. This research paper introduces a scalable and practical SSL approach centered around more challenging pretext tasks. Our hypothesis posits that reconstructing high-resolution images enables the model to attend to finer spatial details.
arXiv Detail & Related papers (2025-03-10T10:32:55Z) - Discriminative Image Generation with Diffusion Models for Zero-Shot Learning [53.44301001173801]
We present DIG-ZSL, a novel Discriminative Image Generation framework for Zero-Shot Learning. We learn a discriminative class token (DCT) for each unseen class under the guidance of a pre-trained category discrimination model (CDM). Extensive experiments and visualizations on four datasets show that DIG-ZSL: (1) generates diverse and high-quality images, (2) outperforms previous state-of-the-art non-human-annotated semantic prototype-based methods by a large margin, and (3) achieves comparable or better performance than baselines that leverage human-annotated semantic prototypes.
arXiv Detail & Related papers (2024-12-23T02:18:54Z) - Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning [52.170253590364545]
Gen-SIS is a diffusion-based augmentation technique trained exclusively on unlabeled image data. We show that these "self-augmentations", i.e., generative augmentations based on the vanilla SSL encoder embeddings, facilitate the training of a stronger SSL encoder.
arXiv Detail & Related papers (2024-12-02T16:20:59Z) - Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy for bolstering image classification performance is to augment the training set with synthetic images generated by text-to-image (T2I) models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z) - CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z) - Learned representation-guided diffusion models for large-image generation [58.192263311786824]
We introduce a novel approach that trains diffusion models conditioned on embeddings from self-supervised learning (SSL)
Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images.
Augmenting real data by generating variations of real images improves downstream accuracy for patch-level and larger, image-scale classification tasks.
arXiv Detail & Related papers (2023-12-12T14:45:45Z) - ProS: Facial Omni-Representation Learning via Prototype-based Self-Distillation [22.30414271893046]
Prototype-based Self-Distillation (ProS) is a novel approach for unsupervised face representation learning.
ProS consists of two vision-transformers (teacher and student models) that are trained with different augmented images.
ProS achieves state-of-the-art performance on various tasks, both in full and few-shot settings.
arXiv Detail & Related papers (2023-11-03T14:10:06Z) - Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z) - GenSelfDiff-HIS: Generative Self-Supervision Using Diffusion for Histopathological Image Segmentation [5.049466204159458]
Self-supervised learning (SSL) is an alternative paradigm that provides some respite by constructing models utilizing only the unannotated data.
In this paper, we propose an SSL approach for segmenting histopathological images via generative diffusion models.
Our method is based on the observation that diffusion models effectively solve an image-to-image translation task akin to a segmentation task.
arXiv Detail & Related papers (2023-09-04T09:49:24Z) - Zero-Shot Learning by Harnessing Adversarial Samples [52.09717785644816]
We propose a novel Zero-Shot Learning (ZSL) approach by Harnessing Adversarial Samples (HAS)
HAS advances ZSL through adversarial training which takes into account three crucial aspects.
We demonstrate the effectiveness of our adversarial samples approach in both ZSL and Generalized Zero-Shot Learning (GZSL) scenarios.
arXiv Detail & Related papers (2023-08-01T06:19:13Z) - Learning Nuclei Representations with Masked Image Modelling [0.41998444721319206]
Masked image modelling (MIM) is a powerful self-supervised representation learning paradigm.
We show the capacity of MIM to capture rich semantic representations of Haematoxylin & Eosin (H&E)-stained images at the nuclear level.
arXiv Detail & Related papers (2023-06-29T17:20:05Z) - Dual-View Selective Instance Segmentation Network for Unstained Live Adherent Cells in Differential Interference Contrast Images [11.762090096790823]
Adherent cells have low contrast structures, fading edges, and irregular morphology.
We developed a novel deep-learning algorithm for segmenting unstained adherent cells in DIC images.
Our algorithm achieves an AP_segm of 0.555, surpassing the benchmark by a margin of 23.6%.
arXiv Detail & Related papers (2023-01-27T02:22:33Z) - SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading [2.0790896742002274]
Saliency-guided Self-Supervised image Transformer (SSiT) is proposed for Diabetic Retinopathy grading from fundus images.
We introduce saliency maps into SSL, with the goal of guiding self-supervised pre-training with domain-specific prior knowledge.
arXiv Detail & Related papers (2022-10-20T02:35:26Z) - MOGAN: Morphologic-structure-aware Generative Learning from a Single Image [59.59698650663925]
Recently proposed generative models can complete training based on only a single image.
We introduce a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances.
Our approach focuses on internal features including the maintenance of rational structures and variation on appearance.
arXiv Detail & Related papers (2021-03-04T12:45:23Z) - Unlabeled Data Guided Semi-supervised Histopathology Image Segmentation [34.45302976822067]
Semi-supervised learning (SSL) based on generative methods has been proven to be effective in utilizing diverse image characteristics.
We propose a new data guided generative method for histopathology image segmentation by leveraging the unlabeled data distributions.
Our method is evaluated on glands and nuclei datasets.
arXiv Detail & Related papers (2020-12-17T02:54:19Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.