Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain
- URL: http://arxiv.org/abs/2404.10307v1
- Date: Tue, 16 Apr 2024 06:33:08 GMT
- Title: Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain
- Authors: Steve Andreas Immanuel, Hagai Raja Sinulingga
- Abstract summary: Few-shot segmentation is the task of segmenting objects or regions of novel classes within an image, given only a few annotated examples.
We use SegGPT as our base model and train it on the base classes.
To handle the various object sizes that are typical of the remote sensing domain, we perform patch-based prediction.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot segmentation is the task of segmenting objects or regions of novel classes within an image, given only a few annotated examples. In the generalized setting, the task extends to segmenting both the base and the novel classes. The main challenge is how to train the model such that the addition of novel classes does not hurt performance on the base classes, a problem known as catastrophic forgetting. To mitigate this issue, we use SegGPT as our base model and train it on the base classes. Then, we use separate learnable prompts to handle predictions for each novel class. To handle the various object sizes that are typical of the remote sensing domain, we perform patch-based prediction. To address the discontinuities along patch boundaries, we propose a patch-and-stitch technique that re-frames the problem as an image inpainting task. During inference, we also utilize image similarity search over image embeddings for prompt selection and novel class filtering to reduce false positive predictions. In our experiments, the proposed method boosts the weighted mIoU of a simple fine-tuned SegGPT from 15.96 to 35.08 on the validation set of the few-shot OpenEarthMap dataset given in the challenge.
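The patch-based prediction described in the abstract can be sketched as a sliding-window pass with overlap averaging. This is a simplified stand-in: the paper's patch-and-stitch technique re-frames boundary blending as image inpainting, which is not reproduced here, and `predict_fn`, the patch size, and the class count are hypothetical placeholders for a SegGPT-style model.

```python
import numpy as np

def predict_patchwise(image, predict_fn, patch=256, stride=128, n_classes=2):
    """Run a segmentation model tile-by-tile and average the scores of
    overlapping tiles. Assumes the image is at least `patch` pixels in
    each dimension; the final row/column of tiles is snapped to the
    image border so every pixel is covered."""
    h, w = image.shape[:2]
    ys = list(range(0, h - patch + 1, stride))
    xs = list(range(0, w - patch + 1, stride))
    if ys[-1] != h - patch:
        ys.append(h - patch)  # cover the bottom border
    if xs[-1] != w - patch:
        xs.append(w - patch)  # cover the right border
    acc = np.zeros((h, w, n_classes))  # accumulated class scores
    cnt = np.zeros((h, w, 1))          # how many tiles cover each pixel
    for y in ys:
        for x in xs:
            tile = image[y:y + patch, x:x + patch]
            acc[y:y + patch, x:x + patch] += predict_fn(tile)
            cnt[y:y + patch, x:x + patch] += 1
    return acc / cnt  # per-pixel average over overlapping tiles
```

Averaging overlapping tiles already softens seams; the paper's inpainting-based stitch goes further by predicting boundary regions conditioned on their surroundings.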
Related papers
- SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
We present SegVG, a novel method that transfers box-level annotations into signals that provide additional pixel-level supervision for Visual Grounding.
This approach allows us to iteratively exploit the annotation as signals for both box-level regression and pixel-level segmentation.
arXiv Detail & Related papers (2024-07-03T15:30:45Z)
- IFSENet: Harnessing Sparse Iterations for Interactive Few-shot Segmentation Excellence
Few-shot segmentation techniques reduce the required number of images to learn to segment a new class.
Interactive segmentation techniques focus only on incrementally improving the segmentation of one object at a time.
We combine the two concepts to drastically reduce the effort required to train segmentation models for novel classes.
arXiv Detail & Related papers (2024-03-22T10:15:53Z)
- Reflection Invariance Learning for Few-shot Semantic Segmentation
Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.
This paper proposes a fresh few-shot segmentation framework to mine the reflection invariance in a multi-view matching manner.
Experiments on both the PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-01T15:14:58Z)
- Cross-domain Few-shot Segmentation with Transductive Fine-tuning
We propose to transductively fine-tune the base model on a set of query images under the few-shot setting.
Our method could consistently and significantly improve the performance of prototypical FSS models in all cross-domain tasks.
arXiv Detail & Related papers (2022-11-27T06:44:41Z)
- Rethinking Generalization in Few-Shot Classification
Single image-level annotations only correctly describe an often small subset of an image's content.
In this paper, we take a closer look at the implications in the context of few-shot learning.
We build on recent advances in unsupervised training of networks via masked image modelling to overcome the lack of fine-grained labels.
arXiv Detail & Related papers (2022-06-15T03:05:21Z)
- Sparse Spatial Transformers for Few-Shot Learning
Learning from limited data is challenging because data scarcity leads to a poor generalization of the trained model.
We propose a novel transformer-based neural network architecture called sparse spatial transformers.
Our method finds task-relevant features and suppresses task-irrelevant features.
arXiv Detail & Related papers (2021-09-27T10:36:32Z)
- Revisiting Deep Local Descriptor for Improved Few-Shot Classification
We show how one can improve the quality of embeddings by leveraging Dense Classification and Attentive Pooling.
We suggest pooling feature maps with attentive pooling instead of the widely used global average pooling (GAP) to prepare embeddings for few-shot classification.
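The contrast with GAP can be illustrated with a minimal attentive-pooling sketch. The norm-based attention weights used here are an assumption for illustration, not that paper's exact formulation:

```python
import numpy as np

def attentive_pool(feat):
    """Pool a feature map (H, W, C) into a single (C,) embedding by
    weighting each spatial location with a softmax over its L2 norm,
    instead of the uniform 1/(H*W) weights of global average pooling."""
    h, w, c = feat.shape
    flat = feat.reshape(-1, c)              # (H*W, C)
    scores = np.linalg.norm(flat, axis=1)   # attention logits per location
    scores = scores - scores.max()          # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return (weights[:, None] * flat).sum(axis=0)
```

When every location has the same norm, the weights are uniform and the result coincides with GAP; locations with stronger activations otherwise dominate the embedding.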
arXiv Detail & Related papers (2021-03-30T00:48:28Z)
- Semantically Meaningful Class Prototype Learning for One-Shot Image Semantic Segmentation
One-shot semantic image segmentation aims to segment the object regions for the novel class with only one annotated image.
Recent works adopt the episodic training strategy to mimic the expected situation at testing time.
We propose to leverage the multi-class label information during episodic training, which encourages the network to generate more semantically meaningful features for each category.
arXiv Detail & Related papers (2021-02-22T12:07:35Z)
- A Few Guidelines for Incremental Few-Shot Segmentation
Given a pretrained segmentation model and a few images containing novel classes, our goal is to learn to segment the novel classes while retaining the ability to segment previously seen ones.
We show that the main problems of end-to-end training in this scenario are:
i) the drift of the batch-normalization statistics toward the novel classes, which we can fix with batch renormalization, and
ii) the forgetting of old classes, which we can fix with regularization strategies.
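A minimal sketch of batch renormalization (Ioffe, 2017), the fix named in point i): the batch is normalized with its own statistics, then corrected toward the running statistics via clipped factors r and d, so small novel-class batches do not drift the normalization. This illustrates the general technique on a 2-D activation matrix, not that paper's training code:

```python
import numpy as np

def batch_renorm(x, running_mean, running_var,
                 r_max=3.0, d_max=5.0, eps=1e-5):
    """Batch renormalization of x with shape (batch, features).
    r rescales and d shifts the batch-normalized activations toward
    what normalization with the running statistics would produce,
    with both corrections clipped to keep training stable."""
    mu, var = x.mean(axis=0), x.var(axis=0)
    sigma = np.sqrt(var + eps)
    run_sigma = np.sqrt(running_var + eps)
    r = np.clip(sigma / run_sigma, 1.0 / r_max, r_max)
    d = np.clip((mu - running_mean) / run_sigma, -d_max, d_max)
    x_hat = (x - mu) / sigma
    return x_hat * r + d
```

When the running statistics match the batch statistics, r is 1 and d is 0, and the result reduces to standard batch normalization.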
arXiv Detail & Related papers (2020-11-30T20:45:56Z)
- One-Shot Image Classification by Learning to Restore Prototypes
One-shot image classification aims to train image classifiers over the dataset with only one image per category.
For one-shot learning, existing metric-learning approaches suffer from poor performance because the single training image may not be representative of the class.
We propose a simple yet effective regression model, denoted by RestoreNet, which learns a class transformation on the image feature to move the image closer to the class center in the feature space.
arXiv Detail & Related papers (2020-05-04T02:11:30Z)
- CRNet: Cross-Reference Networks for Few-Shot Segmentation
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.