Interactive segmentation in aerial images: a new benchmark and an open access web-based tool
- URL: http://arxiv.org/abs/2308.13174v2
- Date: Thu, 7 Mar 2024 06:10:02 GMT
- Title: Interactive segmentation in aerial images: a new benchmark and an open access web-based tool
- Authors: Zhe Wang, Shoukun Sun, Xiang Que, Xiaogang Ma
- Abstract summary: In recent years, interactive semantic segmentation proposed in computer vision has achieved an ideal state of human-computer interaction segmentation.
This study aims to bridge the gap between interactive segmentation and remote sensing analysis by conducting a benchmark study on various interactive segmentation models.
- Score: 2.729446374377189
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Deep learning has gradually become powerful in segmenting and classifying
aerial images. However, in remote sensing applications, the lack of training
datasets and the difficulty of accuracy assessment have always been challenges
for deep-learning-based classification. In recent years, interactive
semantic segmentation proposed in computer vision has achieved an ideal state
of human-computer interaction segmentation. It can provide expert experience
and utilize deep learning for efficient segmentation. However, few papers
have discussed its application in remote sensing imagery. This study aims to bridge
the gap between interactive segmentation and remote sensing analysis by
conducting a benchmark study on various interactive segmentation models. We
assessed the performance of five state-of-the-art interactive segmentation
methods (Reviving Iterative Training with Mask Guidance for Interactive
Segmentation (RITM), FocalClick, SimpleClick, Iterative Click Loss (ICL), and
Segment Anything (SAM)) on two high-resolution aerial imagery datasets. The
Cascade-Forward Refinement approach, an innovative inference strategy for
interactive segmentation, was also introduced to enhance the segmentation
results. We evaluated these methods on various land cover types, object sizes,
and band combinations in the datasets. The SimpleClick model consistently
outperformed the other methods in our experiments. Conversely, SAM
performed less effectively than the other models. Building upon these findings, we
developed an online tool called RSISeg for interactive segmentation of remote
sensing data. RSISeg incorporates a well-performing interactive model that is
fine-tuned with remote sensing data. Compared to existing interactive
segmentation tools, RSISeg offers robust interactivity, modifiability, and
adaptability to remote sensing data.
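The abstract does not state which metrics the benchmark reports. Interactive segmentation work commonly scores a model by intersection over union (IoU) and by the number of clicks needed to reach a target IoU, so the sketch below illustrates that convention only. The function names, the nested-list mask representation, the 0.85 target, and the 20-click cap are all assumptions chosen for illustration and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical representation: a binary mask as nested lists of 0/1.
Mask = List[List[int]]


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def clicks_to_target(
    masks_per_click: Sequence[Mask],
    truth: Mask,
    target_iou: float = 0.85,  # assumed target, not from the paper
    max_clicks: int = 20,      # assumed click budget, not from the paper
) -> int:
    """Number of clicks after which IoU first reaches target_iou.

    masks_per_click[i] is the prediction after i + 1 clicks.
    Returns max_clicks if the target is never reached within the budget.
    """
    for n, mask in enumerate(masks_per_click[:max_clicks], start=1):
        if iou(mask, truth) >= target_iou:
            return n
    return max_clicks


if __name__ == "__main__":
    truth = [[1, 1], [1, 1]]
    # Predictions after 1, 2, and 3 simulated clicks.
    preds = [
        [[1, 0], [0, 0]],
        [[1, 1], [1, 0]],
        [[1, 1], [1, 1]],
    ]
    print([iou(p, truth) for p in preds])
    print(clicks_to_target(preds, truth, target_iou=0.85))
```

Averaging `clicks_to_target` over every object in a dataset gives one number per model and target IoU, which is one way methods such as those listed above could be compared.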
Related papers
- WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation [13.38174941551702]
We introduce a new dataset containing instance segmentation masks for ten different categories of winter sports equipment.
We carry out interactive segmentation experiments on said dataset to explore possibilities for efficient further labeling.
arXiv Detail & Related papers (2024-07-12T14:20:12Z)
- Learning from Exemplars for Interactive Image Segmentation [15.37506525730218]
We introduce novel interactive segmentation frameworks for both a single object and multiple objects in the same category.
Our model reduces users' labor by around 15%, requiring two fewer clicks to achieve target IoUs of 85% and 90%.
arXiv Detail & Related papers (2024-06-17T12:38:01Z)
- Training-Free Robust Interactive Video Object Segmentation [82.05906654403684]
We propose a training-free prompt tracking framework for interactive video object segmentation (I-PT).
We jointly adopt sparse points and boxes tracking, filtering out unstable points and capturing object-wise information.
Our framework has demonstrated robust zero-shot video segmentation results on popular VOS datasets.
arXiv Detail & Related papers (2024-06-08T14:25:57Z)
- TETRIS: Towards Exploring the Robustness of Interactive Segmentation [39.1981941213761]
We propose a methodology for finding extreme user inputs by a direct optimization in a white-box adversarial attack on the interactive segmentation model.
We report the results of an extensive evaluation of dozens of models.
arXiv Detail & Related papers (2024-02-09T01:36:21Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- RAIS: Robust and Accurate Interactive Segmentation via Continual Learning [16.382862088005087]
We propose RAIS, a robust and accurate architecture for interactive segmentation with continuous learning.
For efficient learning on the test set, we propose a novel optimization strategy to update global and local parameters.
Our method also shows its robustness in the datasets of remote sensing and medical imaging.
arXiv Detail & Related papers (2022-10-20T03:05:44Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- Reviving Iterative Training with Mask Guidance for Interactive Segmentation [8.271859911016719]
Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes.
We propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps.
We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations show performance superior to all existing models.
arXiv Detail & Related papers (2021-02-12T15:44:31Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)
- FAIRS -- Soft Focus Generator and Attention for Robust Object Segmentation from Extreme Points [70.65563691392987]
We present a new approach to generate object segmentation from user inputs in the form of extreme points and corrective clicks.
We demonstrate our method's ability to generate high-quality training data as well as its scalability in incorporating extreme points, guiding clicks, and corrective clicks in a principled manner.
arXiv Detail & Related papers (2020-04-04T22:25:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.