HAISTA-NET: Human Assisted Instance Segmentation Through Attention
- URL: http://arxiv.org/abs/2305.03105v3
- Date: Fri, 8 Mar 2024 13:30:58 GMT
- Title: HAISTA-NET: Human Assisted Instance Segmentation Through Attention
- Authors: Muhammed Korkmaz, T. Metin Sezgin
- Abstract summary: We propose a novel approach to enable more precise predictions and generate higher-quality segmentation masks.
Our human-assisted segmentation model, HAISTA-NET, augments the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries.
We show that HAISTA-NET outperforms state-of-the art methods such as Mask R-CNN, Strong Mask R-CNN, and Mask2Former.
- Score: 3.073046540587735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instance segmentation is a form of image detection which has a range of
applications, such as object refinement, medical image analysis, and
image/video editing, all of which demand a high degree of accuracy. However,
this precision is often beyond the reach of what even state-of-the-art, fully
automated instance segmentation algorithms can deliver. The performance gap
becomes particularly prohibitive for small and complex objects. Practitioners
typically resort to fully manual annotation, which can be a laborious process.
In order to overcome this problem, we propose a novel approach to enable more
precise predictions and generate higher-quality segmentation masks for
high-curvature, complex and small-scale objects. Our human-assisted
segmentation model, HAISTA-NET, augments the existing Strong Mask R-CNN network
to incorporate human-specified partial boundaries. We also present a dataset of
hand-drawn partial object boundaries, which we refer to as human attention
maps. In addition, the Partial Sketch Object Boundaries (PSOB) dataset contains
hand-drawn partial object boundaries which represent curvatures of an object's
ground truth mask with several pixels. Through extensive evaluation using the
PSOB dataset, we show that HAISTA-NET outperforms state-of-the art methods such
as Mask R-CNN, Strong Mask R-CNN, and Mask2Former, achieving respective
increases of +36.7, +29.6, and +26.5 points in AP-Mask metrics for these three
models. We hope that our novel approach will set a baseline for future
human-aided deep learning models by combining fully automated and interactive
instance segmentation architectures.
Related papers
- Bridge the Points: Graph-based Few-shot Segment Anything Semantically [79.1519244940518]
Recent advancements in pre-training techniques have enhanced the capabilities of vision foundation models.
Recent studies extend the SAM to Few-shot Semantic segmentation (FSS)
We propose a simple yet effective approach based on graph analysis.
arXiv Detail & Related papers (2024-10-09T15:02:28Z) - LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net)
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z) - MaskUno: Switch-Split Block For Enhancing Instance Segmentation [0.0]
We propose replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors.
An increase in the mean Average Precision (mAP) of 2.03% was observed for the high-performing DetectoRS when trained on 80 classes.
arXiv Detail & Related papers (2024-07-31T10:12:14Z) - Exploiting Shape Cues for Weakly Supervised Semantic Segmentation [15.791415215216029]
Weakly supervised semantic segmentation (WSSS) aims to produce pixel-wise class predictions with only image-level labels for training.
We propose to exploit shape information to supplement the texture-biased property of convolutional neural networks (CNNs)
We further refine the predictions in an online fashion with a novel refinement method that takes into account both the class and the color affinities.
arXiv Detail & Related papers (2022-08-08T17:25:31Z) - Discovering Object Masks with Transformers for Unsupervised Semantic
Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation.
Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z) - Neural Volumetric Object Selection [126.04480613166194]
We introduce an approach for selecting objects in neural volumetric 3D representations, such as multi-plane images (MPI) and neural radiance fields (NeRF)
Our approach takes a set of foreground and background 2D user scribbles in one view and automatically estimates a 3D segmentation of the desired object, which can be rendered into novel views.
arXiv Detail & Related papers (2022-05-30T08:55:20Z) - SODAR: Segmenting Objects by DynamicallyAggregating Neighboring Mask
Representations [90.8752454643737]
Recent state-of-the-art one-stage instance segmentation model SOLO divides the input image into a grid and directly predicts per grid cell object masks with fully-convolutional networks.
We observe SOLO generates similar masks for an object at nearby grid cells, and these neighboring predictions can complement each other as some may better segment certain object part.
Motivated by the observed gap, we develop a novel learning-based aggregation method that improves upon SOLO by leveraging the rich neighboring information.
arXiv Detail & Related papers (2022-02-15T13:53:03Z) - Joint Object Contour Points and Semantics for Instance Segmentation [1.2117737635879038]
We propose Mask Point R-CNN aiming at promoting the neural network's attention to the object boundary.
Specifically, we innovatively extend the original human keypoint detection task to the contour point detection of any object.
As a consequence, the model will be more sensitive to the edges of the object and can capture more geometric features.
arXiv Detail & Related papers (2020-08-02T11:11:28Z) - LevelSet R-CNN: A Deep Variational Method for Instance Segmentation [79.20048372891935]
Currently, many state of the art models are based on the Mask R-CNN framework.
We propose LevelSet R-CNN, which combines the best of both worlds by obtaining powerful feature representations.
We demonstrate the effectiveness of our approach on COCO and Cityscapes datasets.
arXiv Detail & Related papers (2020-07-30T17:52:18Z) - Weakly Supervised Instance Segmentation by Deep Community Learning [39.18749732409763]
We present a weakly supervised instance segmentation algorithm based on deep community learning with multiple tasks.
We address this problem by designing a unified deep neural network architecture.
The proposed algorithm achieves state-of-the-art performance in the weakly supervised setting.
arXiv Detail & Related papers (2020-01-30T08:35:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.