Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation
- URL: http://arxiv.org/abs/2309.04573v2
- Date: Tue, 12 Sep 2023 14:36:02 GMT
- Title: Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation
- Authors: Shyam Nandan Rai, Fabio Cermelli, Barbara Caputo, Carlo Masone
- Abstract summary: We propose a paradigm change by shifting from a per-pixel classification to a mask classification.
Our mask-based method, Mask2Anomaly, demonstrates the feasibility of integrating a mask-classification architecture.
Through comprehensive quantitative and qualitative evaluation, we show Mask2Anomaly achieves new state-of-the-art results.
- Score: 29.43462426812185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Segmenting unknown or anomalous object instances is a critical task in
autonomous driving applications, and it is traditionally approached as a
per-pixel classification problem. However, reasoning individually about each
pixel without considering their contextual semantics results in high
uncertainty around the objects' boundaries and numerous false positives. We
propose a paradigm change by shifting from a per-pixel classification to a mask
classification. Our mask-based method, Mask2Anomaly, demonstrates the
feasibility of integrating a mask-classification architecture to jointly
address anomaly segmentation, open-set semantic segmentation, and open-set
panoptic segmentation. Mask2Anomaly includes several technical novelties that
are designed to improve the detection of anomalies/unknown objects: i) a global
masked attention module to focus individually on the foreground and background
regions; ii) a mask contrastive learning that maximizes the margin between an
anomaly and known classes; iii) a mask refinement solution to reduce false
positives; and iv) a novel approach to mine unknown instances based on the
mask-architecture properties. Through comprehensive quantitative and qualitative
evaluation, we show Mask2Anomaly achieves new state-of-the-art results across
the benchmarks of anomaly segmentation, open-set semantic segmentation, and
open-set panoptic segmentation.
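The mask-classification view suggests a natural anomaly score: each predicted mask carries a class distribution, so a pixel is suspicious when no known-class mask claims it confidently. The following is a minimal illustrative sketch of that idea, not the authors' implementation; the function name, shapes, and the sum-over-queries aggregation are assumptions for illustration.

```python
import numpy as np

def anomaly_score(mask_logits, class_logits):
    """Hypothetical sketch: score pixels by how weakly any known-class
    mask claims them, in the spirit of mask classification.

    mask_logits:  (N, H, W) per-query mask logits
    class_logits: (N, K) per-query logits over K known classes
    """
    masks = 1.0 / (1.0 + np.exp(-mask_logits))            # sigmoid per mask
    shifted = class_logits - class_logits.max(axis=1, keepdims=True)
    cls = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    # Per-pixel known-class confidence: aggregate (class prob * mask prob)
    # over queries, then take the best class at each pixel.
    conf = np.einsum("nk,nhw->khw", cls, masks).max(axis=0)
    return 1.0 - conf                                     # high = likely anomaly
```

A pixel covered confidently by some known-class mask gets a low score; pixels left unclaimed by every mask score high, which is the intuition behind scoring anomalies at the mask level rather than per pixel.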
Related papers
- Pluralistic Salient Object Detection [108.74650817891984]
We introduce pluralistic salient object detection (PSOD), a novel task aimed at generating multiple plausible salient segmentation results for a given input image.
We present two new SOD datasets "DUTS-MM" and "DUS-MQ", along with newly designed evaluation metrics.
arXiv Detail & Related papers (2024-09-04T01:38:37Z)
- Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation [68.16510297109872]
Point-based interactive image segmentation can ease the burden of mask annotation in applications such as semantic segmentation and image editing.
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement to enhance segmentation quality with fewer user inputs.
Experiments on GrabCut, Berkeley, SBD, and DAVIS datasets demonstrate our method's state-of-the-art performance in interactive image segmentation.
arXiv Detail & Related papers (2023-12-22T02:31:31Z)
- Unmasking Anomalies in Road-Scene Segmentation [18.253109627901566]
Anomaly segmentation is a critical task for driving applications.
We propose a paradigm change by shifting from a per-pixel classification to a mask classification.
Mask2Anomaly demonstrates the feasibility of integrating an anomaly detection method in a mask-classification architecture.
arXiv Detail & Related papers (2023-07-25T08:23:10Z)
- DFormer: Diffusion-guided Transformer for Universal Image Segmentation [86.73405604947459]
The proposed DFormer views universal image segmentation task as a denoising process using a diffusion model.
At inference, our DFormer directly predicts the masks and corresponding categories from a set of randomly-generated masks.
Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on MS COCO val 2017 set.
arXiv Detail & Related papers (2023-06-06T06:33:32Z)
- Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation.
Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z)
- Pseudo-mask Matters in Weakly-supervised Semantic Segmentation [24.73662587701187]
We find some matters related to the pseudo-masks, including high quality pseudo-masks generation from class activation maps (CAMs) and training with noisy pseudo-mask supervision.
We propose the following designs to push the performance to a new state of the art: (i) Coefficient of Variation Smoothing to smooth the CAMs adaptively; (ii) Proportional Pseudo-mask Generation to project the expanded CAMs to a pseudo-mask based on a new metric indicating the importance of each class at each location; (iii) a Pretended Under-Fitting strategy to suppress the influence of noise in pseudo-masks.
arXiv Detail & Related papers (2021-08-30T05:35:28Z)
- Per-Pixel Classification is Not All You Need for Semantic Segmentation [184.2905747595058]
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks.
We propose MaskFormer, a simple mask classification model which predicts a set of binary masks.
Our method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
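MaskFormer's semantic prediction can be obtained by combining each query's class distribution with its binary mask and taking a per-pixel argmax over classes. The sketch below, with illustrative shapes and a hypothetical function name, shows that combination under the assumption that `class_probs` rows are already normalized:

```python
import numpy as np

def semantic_inference(mask_logits, class_probs):
    """Sketch of mask-classification semantic inference: weight each
    query's sigmoid mask by its class distribution, then take the
    per-pixel argmax over classes.

    mask_logits: (N, H, W) per-query mask logits
    class_probs: (N, K) per-query class probabilities (rows sum to 1)
    """
    masks = 1.0 / (1.0 + np.exp(-mask_logits))    # per-query sigmoid masks
    seg = np.einsum("nk,nhw->khw", class_probs, masks)
    return seg.argmax(axis=0)                     # (H, W) class-index map
```

Because the class decision is shared across a whole mask, label noise is averaged out over the mask's support instead of being decided pixel by pixel.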
arXiv Detail & Related papers (2021-07-13T17:59:50Z)
- Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z)
- Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability [5.387323728379395]
Saliency maps that identify the most informative regions of an image are valuable for model interpretability.
A common approach to creating saliency maps involves generating input masks that mask out portions of an image.
We show that a masking model can be trained with as few as 10 examples per class and still generate saliency maps with only a 0.7-point increase in localization error.
arXiv Detail & Related papers (2020-10-19T18:00:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.