Interactive Object Segmentation with Dynamic Click Transform
- URL: http://arxiv.org/abs/2106.10465v1
- Date: Sat, 19 Jun 2021 10:13:37 GMT
- Title: Interactive Object Segmentation with Dynamic Click Transform
- Authors: Chun-Tse Lin, Wei-Chih Tu, Chih-Ting Liu, Shao-Yi Chien
- Abstract summary: We propose a Dynamic Click Transform Network (DCT-Net), consisting of Spatial-DCT and Feature-DCT, to better represent user interactions.
We demonstrate the effectiveness of our proposed method and achieve favorable performance compared to the state-of-the-art on three standard benchmark datasets.
- Score: 27.709779682559883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In interactive segmentation, users initially click on the target object
to segment the main body and then provide corrections on mislabeled regions to
iteratively refine the segmentation masks. Most existing methods transform
these user-provided clicks into interaction maps and concatenate them with
image as the input tensor. Typically, the interaction maps are determined by
measuring the distance of each pixel to the clicked points, ignoring the
relation between clicks and mislabeled regions. We propose a Dynamic Click
Transform Network (DCT-Net), consisting of Spatial-DCT and Feature-DCT, to
better represent user interactions. Spatial-DCT transforms each user-provided
click with individual diffusion distance according to the target scale, and
Feature-DCT normalizes the extracted feature map to a specific distribution
predicted from the clicked points. We demonstrate the effectiveness of our
proposed method and achieve favorable performance compared to the
state-of-the-art on three standard benchmark datasets.
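The conventional input encoding the abstract describes (distance-based interaction maps concatenated with the image) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `click_distance_map` and `build_input`, the linear falloff, and the caller-supplied `radii` are all assumptions. Passing the same radius for every click gives the fixed-distance baseline; passing a different radius per click only mimics the idea of an individual diffusion distance, which in the paper is determined according to the target scale rather than chosen by hand.

```python
import numpy as np


def click_distance_map(shape, clicks, radii):
    """Build one interaction map from a set of clicks.

    shape  : (H, W) of the image.
    clicks : list of (row, col) click positions.
    radii  : one truncation radius in pixels per click (hypothetical
             parameter; a fixed value for all clicks is the usual baseline).
    Returns a float32 map in [0, 1]: 1 at a click, 0 beyond its radius.
    """
    h, w = shape
    rows, cols = np.mgrid[0:h, 0:w]
    out = np.zeros((h, w), dtype=np.float32)
    for (r, c), radius in zip(clicks, radii):
        # Euclidean distance of every pixel to this click.
        dist = np.sqrt((rows - r) ** 2 + (cols - c) ** 2)
        # Linear falloff, truncated at the click's own radius (an assumption).
        response = np.clip(1.0 - dist / radius, 0.0, 1.0).astype(np.float32)
        # Keep the strongest response when clicks overlap.
        out = np.maximum(out, response)
    return out


def build_input(image, pos_clicks, neg_clicks, pos_radii, neg_radii):
    """Concatenate the image with positive and negative interaction maps."""
    h, w = image.shape[:2]
    pos = click_distance_map((h, w), pos_clicks, pos_radii)
    neg = click_distance_map((h, w), neg_clicks, neg_radii)
    return np.concatenate(
        [image.astype(np.float32), pos[..., None], neg[..., None]], axis=-1
    )


# Usage: one positive click with a wide radius, one negative with a narrow one.
image = np.zeros((8, 8, 3), dtype=np.uint8)
x = build_input(image, [(2, 2)], [(6, 6)], [4.0], [2.0])
```

The resulting tensor has the image channels followed by one positive and one negative interaction channel, which is the form a segmentation network would take as input under this encoding.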
Related papers
- KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation [87.23575166061413]
KP-RED is a unified KeyPoint-driven REtrieval and Deformation framework.
It takes object scans as input and jointly retrieves and deforms the most geometrically similar CAD models.
arXiv Detail & Related papers (2024-03-15T08:44:56Z)
- CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation [73.89509052503222]
This paper presents a simple but performant semi-supervised semantic segmentation approach, called CorrMatch.
We observe that the correlation maps not only enable clustering pixels of the same category easily but also contain good shape information.
We propose to conduct pixel propagation by modeling the pairwise similarities of pixels to spread the high-confidence pixels and dig out more.
Then, we perform region propagation to enhance the pseudo labels with accurate class-agnostic masks extracted from the correlation maps.
arXiv Detail & Related papers (2023-06-07T10:02:29Z)
- AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation [32.63772366307106]
We introduce AGILE3D, an efficient, attention-based model that supports simultaneous segmentation of multiple 3D objects.
Our core idea is to encode user clicks as spatial-temporal queries and enable explicit interactions between click queries and the 3D scene.
In experiments with four different 3D point cloud datasets, AGILE3D sets a new state-of-the-art.
arXiv Detail & Related papers (2023-06-01T17:59:10Z)
- Contour-based Interactive Segmentation [4.164728134421114]
We consider a natural form of user interaction as a loose contour, and introduce a contour-based interactive segmentation method.
We demonstrate that a single contour provides the same accuracy as multiple clicks, thus reducing the required amount of user interactions.
arXiv Detail & Related papers (2023-02-13T13:35:26Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Masked Transformer for Neighbourhood-aware Click-Through Rate Prediction [74.52904110197004]
We propose Neighbor-Interaction based CTR prediction, which puts this task into a Heterogeneous Information Network (HIN) setting.
In order to enhance the representation of the local neighbourhood, we consider four types of topological interaction among the nodes.
We conduct comprehensive experiments on two real world datasets and the experimental results show that our proposed method outperforms state-of-the-art CTR models significantly.
arXiv Detail & Related papers (2022-01-25T12:44:23Z)
- Interactive segmentation using U-Net with weight map and dynamic user interactions [0.0]
We propose a novel interactive segmentation framework, where user clicks are dynamically adapted in size based on the current segmentation mask.
The clicked regions form a weight map and are fed to a deep neural network as a novel weighted loss function.
Applying dynamic user click sizes increases the overall accuracy by 5.60% and 10.39% respectively by utilizing only a single user interaction.
arXiv Detail & Related papers (2021-11-18T15:08:11Z)
- Clicking Matters: Towards Interactive Human Parsing [60.35351491254932]
This work is the first attempt to tackle the human parsing task under the interactive setting.
Our IHP solution achieves 85% mIoU on the benchmark LIP, 80% mIoU on PASCAL-Person-Part and CIHP, 75% mIoU on Helen with only 1.95, 3.02, 2.84 and 1.09 clicks per class respectively.
arXiv Detail & Related papers (2021-11-11T11:47:53Z)
- Localized Interactive Instance Segmentation [24.55415554455844]
We propose a clicking scheme wherein user interactions are restricted to the proximity of the object.
We demonstrate the effectiveness of our proposed clicking scheme and localization strategy through detailed experimentation.
arXiv Detail & Related papers (2020-10-18T23:24:09Z)
- FAIRS -- Soft Focus Generator and Attention for Robust Object Segmentation from Extreme Points [70.65563691392987]
We present a new approach to generate object segmentation from user inputs in the form of extreme points and corrective clicks.
We demonstrate our method's ability to generate high-quality training data as well as its scalability in incorporating extreme points, guiding clicks, and corrective clicks in a principled manner.
arXiv Detail & Related papers (2020-04-04T22:25:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.