Interactive segmentation in aerial images: a new benchmark and an open
access web-based tool
- URL: http://arxiv.org/abs/2308.13174v2
- Date: Thu, 7 Mar 2024 06:10:02 GMT
- Authors: Zhe Wang, Shoukun Sun, Xiang Que, Xiaogang Ma
- Abstract summary: In recent years, interactive semantic segmentation, proposed in computer vision, has matured into an effective form of human-computer collaborative segmentation.
This study aims to bridge the gap between interactive segmentation and remote sensing analysis by conducting a benchmark study on various interactive segmentation models.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Deep learning has become increasingly powerful in segmenting and
classifying aerial images. However, in remote sensing applications, the
scarcity of training datasets and the difficulty of accuracy assessment have
long been challenges for deep learning based classification. In recent years,
interactive semantic segmentation proposed in computer vision has matured into
an effective form of human-computer collaborative segmentation: it can
incorporate expert knowledge while utilizing deep learning for efficient
segmentation. However, few papers have discussed its application to remote
sensing imagery. This study aims to bridge
the gap between interactive segmentation and remote sensing analysis by
conducting a benchmark study on various interactive segmentation models. We
assessed the performance of five state-of-the-art interactive segmentation
methods (Reviving Iterative Training with Mask Guidance for Interactive
Segmentation (RITM), FocalClick, SimpleClick, Iterative Click Loss (ICL), and
Segment Anything (SAM)) on two high-resolution aerial imagery datasets. The
Cascade-Forward Refinement approach, an innovative inference strategy for
interactive segmentation, was also introduced to enhance the segmentation
results. We evaluated these methods on various land cover types, object sizes,
and band combinations in the datasets. The SimpleClick model consistently
outperformed the other methods in our experiments, whereas SAM performed less
effectively than the other models. Building upon these findings, we
developed an online tool called RSISeg for interactive segmentation of remote
sensing data. RSISeg incorporates a well-performing interactive model
fine-tuned on remote sensing data. Compared to existing interactive
segmentation tools, RSISeg offers robust interactivity, modifiability, and
adaptability to remote sensing data.
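Benchmarks of this kind typically report the NoC (Number of Clicks) metric: a simulated user clicks on the current prediction's error regions until a target IoU is reached, and the click count is recorded. The sketch below is an illustrative assumption, not the paper's actual protocol: the `model` interface and the median-error-pixel click placement are simplifications (real benchmarks usually click the centre of the largest error region, found with a distance transform).

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

def simulate_click(pred, gt):
    """Place the next simulated click in the larger error region:
    a positive click on missed foreground, or a negative click on
    false positives. Picking the median error pixel keeps this
    sketch dependency-free."""
    fn = np.logical_and(gt, np.logical_not(pred))  # missed foreground
    fp = np.logical_and(pred, np.logical_not(gt))  # spurious foreground
    region, positive = (fn, True) if fn.sum() >= fp.sum() else (fp, False)
    ys, xs = np.nonzero(region)
    i = len(ys) // 2
    return (int(ys[i]), int(xs[i])), positive

def noc(model, image, gt, iou_thresh=0.85, max_clicks=20):
    """Number of Clicks: simulated clicks needed to reach the target
    IoU, capped at the click budget (failures count as max_clicks)."""
    clicks = []
    pred = np.zeros_like(gt, dtype=bool)
    for n in range(1, max_clicks + 1):
        clicks.append(simulate_click(pred, gt))
        pred = model(image, clicks, pred)
        if iou(pred, gt) >= iou_thresh:
            return n
    return max_clicks
```

Any click-based model (RITM, FocalClick, SimpleClick, or a SAM wrapper) can be plugged in as `model(image, clicks, previous_mask) -> mask`, which is what makes the metric comparable across the methods benchmarked above.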
Related papers
- RClicks: Realistic Click Simulation for Benchmarking Interactive Segmentation
We conduct a large crowdsourcing study of click patterns in an interactive segmentation scenario and collect 475K real-user clicks.
Using our model and dataset, we propose RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks.
According to our benchmark, in real-world usage interactive segmentation models may perform worse than has been reported in baseline benchmarks, and most methods are not robust.
arXiv Detail & Related papers (2024-10-15T15:55:00Z)
- Visual-Geometric Collaborative Guidance for Affordance Learning
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation
We introduce a new dataset containing instance segmentation masks for ten different categories of winter sports equipment.
We carry out interactive segmentation experiments on said dataset to explore possibilities for efficient further labeling.
arXiv Detail & Related papers (2024-07-12T14:20:12Z)
- TETRIS: Towards Exploring the Robustness of Interactive Segmentation
We propose a methodology for finding extreme user inputs by a direct optimization in a white-box adversarial attack on the interactive segmentation model.
We report the results of an extensive evaluation of dozens of models.
arXiv Detail & Related papers (2024-02-09T01:36:21Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z)
- RAIS: Robust and Accurate Interactive Segmentation via Continual Learning
We propose RAIS, a robust and accurate architecture for interactive segmentation with continual learning.
For efficient learning on the test set, we propose a novel optimization strategy to update global and local parameters.
Our method also shows its robustness in the datasets of remote sensing and medical imaging.
arXiv Detail & Related papers (2022-10-20T03:05:44Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Deep Relational Metric Learning
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- Reviving Iterative Training with Mask Guidance for Interactive Segmentation
Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes.
We propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps.
We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations show performance superior to all existing models.
arXiv Detail & Related papers (2021-02-12T15:44:31Z)
- FAIRS -- Soft Focus Generator and Attention for Robust Object Segmentation from Extreme Points
We present a new approach to generate object segmentation from user inputs in the form of extreme points and corrective clicks.
We demonstrate our method's ability to generate high-quality training data as well as its scalability in incorporating extreme points, guiding clicks, and corrective clicks in a principled manner.
arXiv Detail & Related papers (2020-04-04T22:25:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.