Related papers: Insight Any Instance: Promptable Instance Segmentation for Remote Sensing Images

URL: http://arxiv.org/abs/2409.07022v1
Date: Wed, 11 Sep 2024 05:31:50 GMT
Title: Insight Any Instance: Promptable Instance Segmentation for Remote Sensing Images
Authors: Xuexue Li,
Abstract summary: Instance segmentation of remote sensing images (RSIs) is an essential task for a wide range of applications such as land planning and intelligent transport. Most of the instance segmentation models are based on deep feature learning and contain operations such as multiple downsampling. Inspired by the recent superior performance of prompt learning in visual tasks, we propose a new prompt paradigm to address the above issues.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Instance segmentation of remote sensing images (RSIs) is an essential task for a wide range of applications such as land planning and intelligent transport. It is constantly plagued by an unbalanced ratio of foreground to background and by limited instance size. Moreover, most instance segmentation models are based on deep feature learning and contain operations such as repeated downsampling, which harms instance segmentation of RSIs, so performance remains limited. Inspired by the recent superior performance of prompt learning in visual tasks, we propose a new prompt paradigm to address the above issues. Building on an existing instance segmentation model, first, a local prompt module is designed to mine local prompt information from the original local tokens for specific instances; second, a global-to-local prompt module is designed to model the contextual information from the global tokens to the local tokens where those instances are located. Finally, a proposal's area loss function is designed to add a decoupling dimension for proposals on the scale to better exploit the potential of the two prompt modules. Notably, the proposed approach can extend an instance segmentation model to a promptable instance segmentation model, i.e., one that segments the instances specified by box prompts. Each promptable instance segmentation process takes only 40 ms. We evaluate the approach on several existing models across four RSI instance segmentation datasets, and thorough experiments show that it is effective in addressing the above issues and is competitive for instance segmentation of RSIs.

Related papers

From Semantic To Instance: A Semi-Self-Supervised Learning Approach [6.092973123903838]
We propose a semi-self-supervised learning approach that requires minimal manual annotation to develop a high-performing instance segmentation model. We use GLMask, an image-mask representation for the model to focus on shape, texture, and pattern while minimizing its dependence on color features. The proposed approach substantially outperforms the conventional instance segmentation models, establishing a state-of-the-art wheat head instance segmentation model with mAP@50 of 98.5%.
arXiv Detail & Related papers (2025-06-19T19:38:01Z)
ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification [24.709695178222862]
We propose ISAR, a benchmark and baseline method for single- and few-shot object identification. We provide a semi-synthetic dataset of video sequences with ground-truth semantic annotations. Our benchmark aligns with the emerging research trend of unifying Multi-Object Tracking, Video Object, and Re-identification.
arXiv Detail & Related papers (2023-11-05T18:51:33Z)
Lidar Panoptic Segmentation and Tracking without Bells and Whistles [48.078270195629415]
We propose a detection-centric network for lidar segmentation and tracking. One of the core components of our network is the object instance detection branch. We evaluate our method on several 3D/4D LPS benchmarks and observe that our model establishes a new state-of-the-art among open-sourced models.
arXiv Detail & Related papers (2023-10-19T04:44:43Z)
PANet: LiDAR Panoptic Segmentation with Sparse Instance Proposal and Aggregation [15.664835767712775]
This work proposes a new LPS framework named PANet to eliminate the dependency on the offset branch. PANet achieves state-of-the-art performance among published works on the SemanticKITTI validation and nuScenes validation for the panoptic segmentation task.
arXiv Detail & Related papers (2023-06-27T10:02:28Z)
Active Pointly-Supervised Instance Segmentation [106.38955769817747]
We present an economic active learning setting, named active pointly-supervised instance segmentation (APIS). APIS starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object. The model developed with these strategies yields consistent performance gain on the challenging MS-COCO dataset.
arXiv Detail & Related papers (2022-07-23T11:25:24Z)
Sparse Instance Activation for Real-Time Instance Segmentation [72.23597664935684]
We propose a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation. SparseInst has extremely fast inference speed and achieves 40 FPS and 37.9 AP on the COCO benchmark.
arXiv Detail & Related papers (2022-03-24T03:15:39Z)
Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process. The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB). The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of interest instances on the feature map. SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z)
SOLO: A Simple Framework for Instance Segmentation [84.00519148562606]
The notion of "instance categories" assigns categories to each pixel within an instance according to the instance's location. SOLO is a simple, direct, and fast framework for instance segmentation with strong performance. Our approach achieves state-of-the-art results for instance segmentation in terms of both speed and accuracy.
arXiv Detail & Related papers (2021-06-30T09:56:54Z)
Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation. We introduce a novel approach for more accurate and efficient unseen-temporal segmentation. We evaluate the proposed approach on DAVIS$_{17}$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation. Our key idea is to decompose the holistic class representation into a set of part-aware prototypes. We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
