Few-shot Multispectral Segmentation with Representations Generated by Reinforcement Learning
- URL: http://arxiv.org/abs/2311.11827v2
- Date: Fri, 22 Nov 2024 19:04:08 GMT
- Title: Few-shot Multispectral Segmentation with Representations Generated by Reinforcement Learning
- Authors: Dilith Jayakody, Thanuja Ambegoda
- Abstract summary: We propose a novel approach for improving few-shot segmentation performance on multispectral images using reinforcement learning.
Our methodology involves training an agent to identify the most informative expressions using a small dataset.
Due to the limited length of the expressions, the model receives useful representations without any added risk of overfitting.
- Abstract: The task of segmentation of multispectral images, which are images with numerous channels or bands, each capturing a specific range of wavelengths of electromagnetic radiation, has been previously explored in contexts with large amounts of labeled data. However, these models tend not to generalize well to datasets of smaller size. In this paper, we propose a novel approach for improving few-shot segmentation performance on multispectral images using reinforcement learning to generate representations. These representations are generated as mathematical expressions between channels and are tailored to the specific class being segmented. Our methodology involves training an agent to identify the most informative expressions using a small dataset, which can include as few as a single labeled sample, updating the dataset using these expressions, and then using the updated dataset to perform segmentation. Due to the limited length of the expressions, the model receives useful representations without any added risk of overfitting. We evaluate the effectiveness of our approach on samples of several multispectral datasets and demonstrate its effectiveness in boosting the performance of segmentation algorithms in few-shot contexts. The code is available at https://github.com/dilithjay/IndexRLSeg.
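A minimal sketch of the core idea, assuming a (channels, H, W) NumPy array and a binary mask for the target class: score a few candidate band-math expressions against the labeled sample and append the best one as an extra channel. The candidate list, the correlation-based reward, and all names below are illustrative stand-ins; the paper searches the expression space with a reinforcement learning agent rather than enumerating a fixed list, and the actual implementation is in the linked IndexRLSeg repository.

```python
import numpy as np

EPS = 1e-8

# Candidate band-math expressions over the channel stack c (channels, H, W).
# NDVI-style ratios are classic examples; the paper's RL agent searches this
# space instead of enumerating a fixed list like this one.
CANDIDATES = {
    "(c3 - c2) / (c3 + c2)": lambda c: (c[3] - c[2]) / (c[3] + c[2] + EPS),
    "(c1 - c4) / (c1 + c4)": lambda c: (c[1] - c[4]) / (c[1] + c[4] + EPS),
    "c0 * c2 - c1": lambda c: c[0] * c[2] - c[1],
}

def reward(expr_map: np.ndarray, mask: np.ndarray) -> float:
    """Score how well an expression's output separates the target class:
    absolute correlation with the binary mask (an illustrative stand-in
    for a reward tied to downstream segmentation quality)."""
    x = (expr_map.ravel() - expr_map.mean()) / (expr_map.std() + EPS)
    y = mask.ravel().astype(float)
    y = (y - y.mean()) / (y.std() + EPS)
    return abs(float(np.mean(x * y)))

def best_expression(image: np.ndarray, mask: np.ndarray) -> str:
    """image: (channels, H, W) multispectral array; mask: (H, W) in {0, 1}."""
    return max(CANDIDATES, key=lambda k: reward(CANDIDATES[k](image), mask))

def augment(image: np.ndarray, expr_name: str) -> np.ndarray:
    """Append the chosen expression's output as an extra channel; any
    off-the-shelf segmentation model can then train on the augmented stack."""
    return np.concatenate([image, CANDIDATES[expr_name](image)[None]], axis=0)
```

Because each expression is short, the appended channel adds a useful, class-tailored signal without the parameter growth that would invite overfitting on a few-shot dataset.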
Related papers
- Correlation-aware active learning for surgery video segmentation [13.327429312047396]
This work proposes COWAL (COrrelation-aWare Active Learning), a novel active learning strategy for surgery video segmentation.
Our approach involves projecting images into a latent space that has been fine-tuned using contrastive learning and then selecting a fixed number of representative images from local clusters of video frames.
We demonstrate the effectiveness of this approach on two video datasets of surgical instruments and three real-world video datasets.
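A hedged sketch of the selection step described above, assuming frame embeddings from the contrastively fine-tuned encoder are already available; the k-means-plus-nearest-to-centroid rule is a generic stand-in for COWAL's exact criterion:

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(embeddings: np.ndarray, budget: int) -> list[int]:
    """Pick `budget` frames to annotate: cluster frame embeddings and take
    the frame closest to each cluster center.

    embeddings: (n_frames, dim) array, e.g. from an encoder fine-tuned
    with contrastive learning.
    """
    km = KMeans(n_clusters=budget, n_init=10, random_state=0).fit(embeddings)
    picks = []
    for k in range(budget):
        members = np.where(km.labels_ == k)[0]
        dists = np.linalg.norm(
            embeddings[members] - km.cluster_centers_[k], axis=1)
        picks.append(int(members[np.argmin(dists)]))
    return picks
```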
arXiv Detail & Related papers (2023-11-15T09:30:52Z)
- DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
- Unsupervised Segmentation of Fetal Brain MRI using Deep Learning Cascaded Registration [2.494736313545503]
Traditional deep learning-based automatic segmentation requires extensive training data with ground-truth labels.
We propose a novel method based on multi-atlas segmentation that accurately segments multiple tissues without relying on labeled data for training.
Our method employs a cascaded deep learning network for 3D image registration, which computes small, incremental deformations to the moving image to align it precisely with the fixed image.
arXiv Detail & Related papers (2023-07-07T13:17:12Z)
- Diffusion Models for Open-Vocabulary Segmentation [79.02153797465324]
OVDiff is a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation.
It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training.
arXiv Detail & Related papers (2023-06-15T17:51:28Z)
- Baking in the Feature: Accelerating Volumetric Segmentation by Rendering Feature Maps [42.34064154798376]
We propose to use features extracted with models trained on large existing datasets to improve segmentation performance.
We bake this feature representation into a Neural Radiance Field (NeRF) by volumetrically rendering feature maps and supervising on features extracted from each input image.
Our experiments show that our method achieves higher segmentation accuracy with fewer semantic annotations than existing methods over a wide range of scenes.
arXiv Detail & Related papers (2022-09-26T14:52:10Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that matches the performance of state-of-the-art supervised methods on 7 benchmark datasets.
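A small sketch of the zero-shot labelling step this enables, assuming per-pixel embeddings from the segmentation network and sentence embeddings of the class descriptions are given (both are assumed inputs here; the encoders themselves are the paper's contribution):

```python
import numpy as np

def segment_with_text_embeddings(pixel_feats: np.ndarray,
                                 class_embeds: np.ndarray) -> np.ndarray:
    """Zero-shot labelling by cosine similarity.

    pixel_feats: (H, W, D) per-pixel embeddings from a network trained to
    regress text embeddings. class_embeds: (C, D) embeddings of short
    class descriptions from a sentence encoder.
    """
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=-1, keepdims=True) + 1e-8)
    c = class_embeds / (np.linalg.norm(class_embeds, axis=-1, keepdims=True) + 1e-8)
    sims = p @ c.T            # (H, W, C) cosine similarities
    return sims.argmax(-1)    # (H, W) predicted class indices
```

Swapping a new class in then only requires embedding a new description, which is what makes the zero-shot setting possible.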
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Label-Efficient Semantic Segmentation with Diffusion Models [27.01899943738203]
We demonstrate that diffusion models can also serve as an instrument for semantic segmentation.
In particular, for several pretrained diffusion models, we investigate the intermediate activations from the networks that perform the Markov step of the reverse diffusion process.
We show that these activations effectively capture the semantic information from an input image and appear to be excellent pixel-level representations for the segmentation problem.
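A hedged sketch of harvesting such activations with forward hooks, assuming a generic pretrained denoiser callable as unet(x_t, t); the layer names and feature handling are placeholders, not the paper's configuration:

```python
import torch
import torch.nn.functional as F

def collect_pixel_features(unet: torch.nn.Module,
                           x_t: torch.Tensor,
                           t: torch.Tensor,
                           layer_names: list[str]) -> torch.Tensor:
    """Grab intermediate activations from one reverse-diffusion step and
    stack them into per-pixel features. Assumes each hooked output is a
    (B, C, h, w) tensor; all layer names are hypothetical.
    """
    feats, hooks = [], []
    modules = dict(unet.named_modules())
    for name in layer_names:
        hooks.append(modules[name].register_forward_hook(
            lambda m, inp, out: feats.append(out)))
    with torch.no_grad():
        unet(x_t, t)                      # one denoising forward pass
    for h in hooks:
        h.remove()
    size = x_t.shape[-2:]
    up = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
          for f in feats]
    return torch.cat(up, dim=1)           # (B, sum(C_i), H, W)
```

A lightweight per-pixel classifier trained on these features is then enough for label-efficient segmentation.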
arXiv Detail & Related papers (2021-12-06T15:55:30Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
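As a rough illustration of a pixel-to-prototype contrastive loss (an InfoNCE-style sketch of the idea named above, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def pixel_to_prototype_loss(pix: torch.Tensor, protos: torch.Tensor,
                            labels: torch.Tensor, tau: float = 0.1):
    """InfoNCE-style loss pulling pixels toward their class prototype.

    pix: (N, D) pixel embeddings, protos: (K, D) class prototypes,
    labels: (N,) prototype index for each pixel, tau: temperature.
    """
    pix = F.normalize(pix, dim=1)
    protos = F.normalize(protos, dim=1)
    logits = pix @ protos.T / tau       # (N, K) similarity logits
    return F.cross_entropy(logits, labels)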
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
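A simplified sketch of part-prototype construction, using plain k-means over the support image's foreground features in place of the paper's GNN-based generation and enhancement:

```python
import torch

def part_prototypes(feats: torch.Tensor, mask: torch.Tensor, parts: int):
    """Decompose a support image's class features into `parts` prototypes.

    feats: (C, H, W) support features; mask: (H, W) binary class mask.
    """
    fg = feats.permute(1, 2, 0)[mask.bool()]   # (N, C) foreground vectors
    # Lloyd's algorithm, a few iterations (stand-in for the GNN refinement).
    centers = fg[torch.randperm(fg.shape[0])[:parts]].clone()
    for _ in range(10):
        assign = torch.cdist(fg, centers).argmin(dim=1)
        for k in range(parts):
            sel = fg[assign == k]
            if len(sel) > 0:
                centers[k] = sel.mean(dim=0)
    return centers                              # (parts, C) prototypes
```

Matching each query pixel against its nearest part prototype then yields the segmentation score.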
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
- Unsupervised Learning Consensus Model for Dynamic Texture Videos Segmentation [12.462608802359936]
We present an effective unsupervised learning consensus model (ULCM) for the segmentation of dynamic texture videos.
In the proposed model, the set of values of the requantized local binary patterns (LBP) histogram around the pixel to be classified are used as features.
Experiments conducted on the challenging SynthDB dataset show that ULCM is significantly faster, simpler to implement, and uses fewer parameters.
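A rough sketch of per-pixel LBP-histogram features of the kind the summary describes, using scikit-image's local_binary_pattern; the window size and LBP parameters here are illustrative choices, not the paper's settings:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram_features(gray: np.ndarray, win: int = 7,
                           P: int = 8, R: float = 1.0) -> np.ndarray:
    """Per-pixel features: histogram of LBP codes in a window around
    each pixel of a grayscale frame."""
    codes = local_binary_pattern(gray, P, R, method="uniform")
    n_bins = P + 2                      # 'uniform' LBP yields P + 2 codes
    h, w = gray.shape
    pad = win // 2
    padded = np.pad(codes, pad, mode="edge")
    feats = np.zeros((h, w, n_bins))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + win, j:j + win]
            hist, _ = np.histogram(patch, bins=n_bins, range=(0, n_bins))
            feats[i, j] = hist / hist.sum()
    return feats
```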
arXiv Detail & Related papers (2020-06-29T16:40:59Z)
- CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)