Getting to 99% Accuracy in Interactive Segmentation
- URL: http://arxiv.org/abs/2003.07932v1
- Date: Tue, 17 Mar 2020 20:50:22 GMT
- Title: Getting to 99% Accuracy in Interactive Segmentation
- Authors: Marco Forte, Brian Price, Scott Cohen, Ning Xu, François Pitié
- Abstract summary: Recent deep-learning based interactive segmentation algorithms have made significant progress in handling complex images, and rough binary selections can typically be obtained with just a few clicks.
Yet, deep learning techniques tend to plateau once this rough selection has been reached.
We propose a novel interactive architecture and a novel training scheme that are both tailored to better exploit the user workflow.
- Score: 18.207714624149595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactive object cutout tools are the cornerstone of the image editing
workflow. Recent deep-learning based interactive segmentation algorithms have
made significant progress in handling complex images and rough binary
selections can typically be obtained with just a few clicks. Yet, deep learning
techniques tend to plateau once this rough selection has been reached. In this
work, we interpret this plateau as the inability of current algorithms to
sufficiently leverage each user interaction and also as the limitations of
current training/testing datasets.
We propose a novel interactive architecture and a novel training scheme that
are both tailored to better exploit the user workflow. We also show that
significant improvements can be further gained by introducing a synthetic
training dataset that is specifically designed for complex object boundaries.
Comprehensive experiments support our approach, and our network achieves
state-of-the-art performance.
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Deep ContourFlow: Advancing Active Contours with Deep Learning [3.9948520633731026]
We present a framework for both unsupervised and one-shot approaches for image segmentation.
It is capable of capturing complex object boundaries without the need for extensive labeled training data.
This is particularly required in histology, a field facing a significant shortage of annotations.
arXiv Detail & Related papers (2024-07-15T13:12:34Z)
- Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling [14.88236554564287]
In this work, we build upon advances in unsupervised learning by incorporating information about the structure of a scene into the training process.
We achieve this by (1) learning depth-feature correlation by spatially correlating the feature maps with the depth maps to induce knowledge about the structure of the scene.
We then implement farthest-point sampling to more effectively select relevant features by utilizing 3D sampling techniques on depth information of the scene.
arXiv Detail & Related papers (2023-09-21T11:47:01Z)
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- RAIS: Robust and Accurate Interactive Segmentation via Continual Learning [16.382862088005087]
We propose RAIS, a robust and accurate architecture for interactive segmentation with continuous learning.
For efficient learning on the test set, we propose a novel optimization strategy to update global and local parameters.
Our method also shows its robustness in the datasets of remote sensing and medical imaging.
arXiv Detail & Related papers (2022-10-20T03:05:44Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Context-based Deep Learning Architecture with Optimal Integration Layer for Image Parsing [0.0]
The proposed three-layer context-based deep architecture is capable of integrating context explicitly with visual information.
The experimental outcomes when evaluated on benchmark datasets are promising.
arXiv Detail & Related papers (2022-04-13T07:35:39Z)
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z)
- EdgeFlow: Achieving Practical Interactive Segmentation with Edge-Guided Flow [5.696221390328458]
We propose EdgeFlow, a novel architecture that fully utilizes interactive information of user clicks with edge-guided flow.
Our method achieves state-of-the-art performance without any post-processing or iterative optimization scheme.
With the proposed method, we develop an efficient interactive segmentation tool for practical data annotation tasks.
arXiv Detail & Related papers (2021-09-20T10:07:07Z)
- Learning to Segment Human Body Parts with Synthetically Trained Deep Convolutional Networks [58.0240970093372]
This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data.
The proposed approach achieves cutting-edge results without the need to train the models on real annotated data of human body parts.
arXiv Detail & Related papers (2021-02-02T12:26:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.