Multi-granularity Interaction Simulation for Unsupervised Interactive
Segmentation
- URL: http://arxiv.org/abs/2303.13399v1
- Date: Thu, 23 Mar 2023 16:19:43 GMT
- Title: Multi-granularity Interaction Simulation for Unsupervised Interactive
Segmentation
- Authors: Kehan Li, Yian Zhao, Zhennan Wang, Zesen Cheng, Peng Jin, Xiangyang
Ji, Li Yuan, Chang Liu, Jie Chen
- Abstract summary: We introduce a Multi-granularity Interaction Simulation (MIS) approach to open up a promising direction for unsupervised interactive segmentation.
Our MIS significantly outperforms non-deep learning unsupervised methods and is even comparable with some previous deep-supervised methods without any annotation.
- Score: 38.08152990071453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactive segmentation enables users to segment as needed by providing cues
of objects, which introduces human-computer interaction for many fields, such
as image editing and medical image analysis. Typically, massive and expensive
pixel-level annotations are required to train deep models through object-oriented
interactions with manually labeled object masks. In this work, we reveal that
informative interactions can be made by simulation with semantic-consistent yet
diverse region exploration in an unsupervised paradigm. Concretely, we
introduce a Multi-granularity Interaction Simulation (MIS) approach to open up
a promising direction for unsupervised interactive segmentation. Drawing on the
high-quality dense features produced by recent self-supervised models, we
propose to gradually merge patches or regions with similar features to form
more extensive regions; thus, every merged region serves as a
semantically meaningful multi-granularity proposal. By randomly sampling these
proposals and simulating possible interactions based on them, we provide
meaningful interaction at multiple granularities to teach the model to
understand interactions. Our MIS significantly outperforms non-deep learning
unsupervised methods and is even comparable with some previous deep-supervised
methods without any annotation.
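The merging procedure described in the abstract can be illustrated with a rough sketch. The function names, the greedy pairwise scheme, and the size-weighted feature averaging below are our own assumptions for illustration; the paper itself builds on dense self-supervised ViT features and does not specify this exact O(n²) implementation.

```python
import numpy as np

def merge_proposals(features, n_merges):
    """Greedily merge the most feature-similar pair of regions at each step.

    features: (N, D) array of per-patch embeddings (e.g. from a
    self-supervised backbone). Every merged region is recorded as a
    multi-granularity proposal (a frozenset of patch indices)."""
    regions = [{i} for i in range(features.shape[0])]
    feats = [features[i].astype(float) for i in range(features.shape[0])]
    proposals = []
    for _ in range(n_merges):
        if len(regions) < 2:
            break
        # Find the pair of regions with the highest cosine similarity.
        best, best_sim = None, -np.inf
        for a in range(len(regions)):
            for b in range(a + 1, len(regions)):
                fa, fb = feats[a], feats[b]
                sim = fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-8)
                if sim > best_sim:
                    best, best_sim = (a, b), sim
        a, b = best
        merged = regions[a] | regions[b]
        # Size-weighted mean feature represents the merged region.
        new_feat = (len(regions[a]) * feats[a]
                    + len(regions[b]) * feats[b]) / len(merged)
        for idx in (b, a):  # delete the larger index first
            del regions[idx]
            del feats[idx]
        regions.append(merged)
        feats.append(new_feat)
        proposals.append(frozenset(merged))  # one proposal per granularity
    return proposals

def simulate_click(proposal, rng):
    """Simulate one positive interaction: sample a patch inside the proposal."""
    return rng.choice(sorted(proposal))
```

Sampling a proposal at random and clicking inside it then yields a training interaction at that proposal's granularity, which is the essence of the simulation step.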
Related papers
- MMOE: Mixture of Multimodal Interaction Experts [115.20477067767399]
MMOE stands for a mixture of multimodal interaction experts.
Our method automatically classifies data points from unlabeled multimodal datasets by their interaction type and employs specialized models for each specific interaction.
Based on our experiments, this approach improves performance on these challenging interactions by more than 10%, leading to an overall increase of 2% for tasks like sarcasm prediction.
arXiv Detail & Related papers (2023-11-16T05:31:21Z) - Interactive segmentation in aerial images: a new benchmark and an open
access web-based tool [2.729446374377189]
In recent years, interactive semantic segmentation in computer vision has made human-computer collaborative segmentation highly effective.
This study aims to bridge the gap between interactive segmentation and remote sensing analysis by conducting a benchmark study of various interactive segmentation models.
arXiv Detail & Related papers (2023-08-25T04:49:49Z) - Improving Anomaly Segmentation with Multi-Granularity Cross-Domain
Alignment [17.086123737443714]
Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems.
While existing methods demonstrate noteworthy results on synthetic data, they often fail to consider the disparity between synthetic and real-world data domains.
We introduce the Multi-Granularity Cross-Domain Alignment framework, tailored to harmonize features across domains at both the scene and individual sample levels.
arXiv Detail & Related papers (2023-08-16T22:54:49Z) - Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
The multimodal entity linking (MEL) task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z) - Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth
Estimation in Dynamic Scenes [51.20150148066458]
We propose a novel method to learn to fuse the multi-view and monocular cues encoded as volumes without needing heuristically crafted masks.
Experiments on real-world datasets demonstrate the significant effectiveness and strong generalization ability of the proposed method.
arXiv Detail & Related papers (2023-04-18T13:55:24Z) - Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image
Segmentation with Multi-agent Reinforcement Learning [33.181732857907384]
We propose to model interactive image segmentation with a Markov decision process (MDP) and solve it with reinforcement learning (RL).
Considering the large exploration space for voxel-wise prediction, multi-agent reinforcement learning is adopted, where the voxel-level policy is shared among agents.
Experimental results on four benchmark datasets show that the proposed method significantly outperforms the state of the art.
arXiv Detail & Related papers (2023-03-19T15:52:56Z) - A Variational Information Bottleneck Approach to Multi-Omics Data
Integration [98.6475134630792]
We propose a deep variational information bottleneck (IB) approach for incomplete multi-view observations.
Our method applies the IB framework on marginal and joint representations of the observed views to focus on intra-view and inter-view interactions that are relevant for the target.
Experiments on real-world datasets show that our method consistently achieves gain from data integration and outperforms state-of-the-art benchmarks.
arXiv Detail & Related papers (2021-02-05T06:05:39Z) - Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.