TETRIS: Towards Exploring the Robustness of Interactive Segmentation
- URL: http://arxiv.org/abs/2402.06132v1
- Date: Fri, 9 Feb 2024 01:36:21 GMT
- Title: TETRIS: Towards Exploring the Robustness of Interactive Segmentation
- Authors: Andrey Moskalenko, Vlad Shakhuro, Anna Vorontsova, Anton Konushin,
Anton Antonov, Alexander Krapukhin, Denis Shepelev, Konstantin Soshin
- Abstract summary: We propose a methodology for finding extreme user inputs by a direct optimization in a white-box adversarial attack on the interactive segmentation model.
We report the results of an extensive evaluation of dozens of models.
- Score: 39.1981941213761
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Interactive segmentation methods rely on user inputs to iteratively update
the selection mask. A click specifying the object of interest is arguably the
simplest and most intuitive interaction type, and thereby the most common choice
for interactive segmentation. However, user clicking patterns in the
interactive segmentation context remain unexplored. Accordingly, interactive
segmentation evaluation strategies rely more on intuition and common sense
rather than empirical studies (e.g., assuming that users tend to click in the
center of the area with the largest error). In this work, we conduct a user
study to investigate real clicking patterns. This study reveals that
the intuitive assumption made in the common evaluation strategy may not hold.
As a result, interactive segmentation models may show high scores in the
standard benchmarks, but this does not imply that they would perform well in a
real-world scenario. To assess the applicability of interactive segmentation
methods, we propose a novel evaluation strategy providing a more comprehensive
analysis of a model's performance. To this end, we propose a methodology for
finding extreme user inputs by a direct optimization in a white-box adversarial
attack on the interactive segmentation model. Based on the performance with
such adversarial user inputs, we assess the robustness of interactive
segmentation models w.r.t. click positions. In addition, we introduce a novel
benchmark for measuring the robustness of interactive segmentation, and report
the results of an extensive evaluation of dozens of models.
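The adversarial evaluation described above can be sketched as a projected gradient attack on the click position. The snippet below is a minimal illustration, not the paper's implementation: the differentiable IoU surrogate `iou_proxy`, its analytic gradient, and the valid-click region are all toy assumptions standing in for backpropagation through a real interactive segmentation network.

```python
import numpy as np

OBJECT_CENTER = np.array([0.5, 0.5])  # toy object in normalized image coords

def iou_proxy(click, scale=0.05):
    # Smooth stand-in for the model's IoU as a function of the click
    # position; a real white-box attack would backpropagate through the
    # segmentation network instead of using this toy function.
    d2 = float(np.sum((np.asarray(click) - OBJECT_CENTER) ** 2))
    return np.exp(-d2 / scale)

def iou_grad(click, scale=0.05):
    # Analytic gradient of the toy IoU w.r.t. the click coordinates.
    click = np.asarray(click, dtype=float)
    return iou_proxy(click, scale) * (-2.0 / scale) * (click - OBJECT_CENTER)

def adversarial_click(start, lo=0.2, hi=0.8, steps=100, lr=0.05):
    # Projected gradient descent on IoU: move the click towards the
    # position, inside the valid region [lo, hi]^2, where the model
    # performs worst (an "extreme" user input).
    click = np.asarray(start, dtype=float)
    for _ in range(steps):
        click = click - lr * iou_grad(click)  # descending IoU = ascending loss
        click = np.clip(click, lo, hi)        # keep the click on the object
    return click

worst = adversarial_click([0.55, 0.5])
```

Swapping `iou_proxy`/`iou_grad` for a forward pass through the model plus autograd gives the white-box version; the robustness score is then the model's performance under such worst-case clicks rather than under simulated "center of largest error" clicks.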
Related papers
- RClicks: Realistic Click Simulation for Benchmarking Interactive Segmentation [37.44155289954746]
We conduct a large crowdsourcing study of click patterns in an interactive segmentation scenario and collect 475K real-user clicks.
Using our model and dataset, we propose RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks.
According to our benchmark, in real-world usage interactive segmentation models may perform worse than reported in the baseline benchmark, and most of the methods are not robust.
arXiv Detail & Related papers (2024-10-15T15:55:00Z)
- Scale Disparity of Instances in Interactive Point Cloud Segmentation [15.865365305312174]
We propose ClickFormer, an innovative interactive point cloud segmentation model that accurately segments instances of both thing and stuff categories.
We employ global attention in the query-voxel transformer to mitigate the risk of generating false positives.
Experiments demonstrate that ClickFormer outperforms existing interactive point cloud segmentation methods across both indoor and outdoor datasets.
arXiv Detail & Related papers (2024-07-19T03:45:48Z)
- Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation [9.832150567595718]
We present the first interactive framework for point cloud semantic segmentation, named InterPCSeg.
We develop an interaction simulation scheme tailored for the interactive point cloud semantic segmentation task.
We evaluate our framework on the S3DIS and ScanNet datasets with off-the-shelf segmentation networks.
arXiv Detail & Related papers (2024-03-11T03:24:58Z)
- Interactive segmentation in aerial images: a new benchmark and an open access web-based tool [2.729446374377189]
In recent years, interactive semantic segmentation methods in computer vision have achieved highly effective human-computer interaction for segmentation.
This study aims to bridge the gap between interactive segmentation and remote sensing analysis by conducting a benchmark study of various interactive segmentation models.
arXiv Detail & Related papers (2023-08-25T04:49:49Z)
- Contour-based Interactive Segmentation [4.164728134421114]
We consider a natural form of user interaction as a loose contour, and introduce a contour-based interactive segmentation method.
We demonstrate that a single contour provides the same accuracy as multiple clicks, thus reducing the required amount of user interactions.
arXiv Detail & Related papers (2023-02-13T13:35:26Z)
- Interactiveness Field in Human-Object Interactions [89.13149887013905]
We introduce a previously overlooked interactiveness bimodal prior: given an object in an image, after pairing it with the humans, the generated pairs are either mostly non-interactive, or mostly interactive.
We propose new energy constraints based on the cardinality and difference in the inherent "interactiveness field" underlying interactive versus non-interactive pairs.
Our method can detect more precise pairs and thus significantly boost HOI detection performance.
arXiv Detail & Related papers (2022-04-16T05:09:25Z)
- Masked Transformer for Neighbourhood-aware Click-Through Rate Prediction [74.52904110197004]
We propose Neighbor-Interaction based CTR prediction, which puts this task into a Heterogeneous Information Network (HIN) setting.
In order to enhance the representation of the local neighbourhood, we consider four types of topological interaction among the nodes.
We conduct comprehensive experiments on two real-world datasets, and the experimental results show that our proposed method significantly outperforms state-of-the-art CTR models.
arXiv Detail & Related papers (2022-01-25T12:44:23Z)
- Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient unseen-temporal segmentation.
We evaluate the proposed approach on DAVIS$_17$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)
- FAIRS -- Soft Focus Generator and Attention for Robust Object Segmentation from Extreme Points [70.65563691392987]
We present a new approach to generate object segmentation from user inputs in the form of extreme points and corrective clicks.
We demonstrate our method's ability to generate high-quality training data as well as its scalability in incorporating extreme points, guiding clicks, and corrective clicks in a principled manner.
arXiv Detail & Related papers (2020-04-04T22:25:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.