Trapped in texture bias? A large scale comparison of deep instance segmentation
- URL: http://arxiv.org/abs/2401.09109v1
- Date: Wed, 17 Jan 2024 10:21:08 GMT
- Title: Trapped in texture bias? A large scale comparison of deep instance segmentation
- Authors: Johannes Theodoridis, Jessica Hofmann, Johannes Maucher, Andreas Schilling
- Abstract summary: We evaluate 68 models on 61 versions of MS COCO for a total of 4148 evaluations.
We find that YOLACT++, SOTR and SOLOv2 are significantly more robust to out-of-distribution texture than other frameworks.
- Score: 4.2603120588176635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Do deep learning models for instance segmentation generalize to novel objects
in a systematic way? For classification, such behavior has been questioned. In
this study, we aim to understand if certain design decisions such as framework,
architecture or pre-training contribute to the semantic understanding of
instance segmentation. To answer this question, we consider a special case of
robustness and compare pre-trained models on a challenging benchmark for
object-centric, out-of-distribution texture. We do not introduce another method
in this work. Instead, we take a step back and evaluate a broad range of
existing literature. This includes Cascade and Mask R-CNN, Swin Transformer,
BMask, YOLACT(++), DETR, BCNet, SOTR and SOLOv2. We find that YOLACT++, SOTR
and SOLOv2 are significantly more robust to out-of-distribution texture than
other frameworks. In addition, we show that deeper and dynamic architectures
improve robustness whereas training schedules, data augmentation and
pre-training have only a minor impact. In summary, we evaluate 68 models on 61
versions of MS COCO for a total of 4148 evaluations.
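The evaluation protocol described above (every model run against every texture-shifted benchmark variant) can be sketched as a simple grid loop. This is a hypothetical illustration, not the authors' code: `load_model`, `evaluate_map`, and the dataset names are placeholder assumptions.

```python
# Hypothetical sketch of the paper's evaluation grid: run every
# pre-trained model against every dataset variant and collect one
# score per (model, dataset version) pair. `evaluate_map` stands in
# for a real COCO-style evaluator and always returns a dummy score.

def evaluate_map(model, dataset_version):
    """Placeholder: return a mAP-style score for `model` on `dataset_version`."""
    # A real run would call the framework's COCO evaluator here.
    return 0.0

def run_grid(model_names, dataset_versions, load_model):
    """Evaluate every model on every dataset version; return a results dict."""
    results = {}
    for name in model_names:
        model = load_model(name)  # load pre-trained weights once per model
        for version in dataset_versions:
            results[(name, version)] = evaluate_map(model, version)
    return results
```

With 68 models and 61 dataset versions, the grid yields the 4148 evaluations reported in the abstract.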
Related papers
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Learning What Not to Segment: A New Perspective on Few-Shot Segmentation [63.910211095033596]
Recently few-shot segmentation (FSS) has been extensively developed.
This paper proposes a fresh and straightforward insight to alleviate the problem.
In light of the unique nature of the proposed approach, we also extend it to a more realistic but challenging setting.
arXiv Detail & Related papers (2022-03-15T03:08:27Z) - Conterfactual Generative Zero-Shot Semantic Segmentation [16.684570608930983]
One of the popular zero-shot semantic segmentation methods is based on the generative model.
In this work, we consider counterfactual methods to avoid the confounder in the original model.
Our model is compared with baseline models on two real-world datasets.
arXiv Detail & Related papers (2021-06-11T13:01:03Z) - Revisiting Contrastive Methods for Unsupervised Learning of Visual
Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
- RethinkCWS: Is Chinese Word Segmentation a Solved Task? [81.11161697133095]
The performance of Chinese Word Segmentation (CWS) systems has gradually reached a plateau with the rapid development of deep neural networks.
In this paper, we take stock of what we have achieved and rethink what's left in the CWS task.
arXiv Detail & Related papers (2020-11-13T11:07:08Z)
- An Analysis of Dataset Overlap on Winograd-Style Tasks [40.27778524078]
We analyze the effects of varying degrees of overlap between training corpora and test instances in WSC-style tasks.
KnowRef-60K is the largest corpus to date for WSC-style common-sense reasoning.
arXiv Detail & Related papers (2020-11-09T21:11:17Z)
- Objectness-Aware Few-Shot Semantic Segmentation [31.13009111054977]
We show how to increase overall model capacity to achieve improved performance.
We introduce objectness, which is class-agnostic and so not prone to overfitting.
Experiments show that, given only one annotated example of an unseen category, our method outperforms state-of-the-art methods with respect to mIoU.
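The mIoU metric referenced above (mean Intersection-over-Union) can be sketched in a few lines. This is the generic textbook definition over flat label masks, not the paper's implementation:

```python
def iou(pred, target, cls):
    """IoU for one class, given flat sequences of integer labels."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else float("nan")

def mean_iou(pred, target, classes):
    """Mean of per-class IoU, skipping classes absent from both masks."""
    scores = [iou(pred, target, c) for c in classes]
    valid = [s for s in scores if s == s]  # drop NaN entries (absent classes)
    return sum(valid) / len(valid) if valid else 0.0
```

For example, with prediction `[0, 0, 1, 1]` against ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, giving an mIoU of 7/12.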
arXiv Detail & Related papers (2020-04-06T19:12:08Z)
- Learning What to Learn for Video Object Segmentation [157.4154825304324]
We introduce an end-to-end trainable VOS architecture that integrates a differentiable few-shot learning module.
This internal learner is designed to predict a powerful parametric model of the target.
We set a new state-of-the-art on the large-scale YouTube-VOS 2018 dataset by achieving an overall score of 81.5.
arXiv Detail & Related papers (2020-03-25T17:58:43Z)
- Learning Fast and Robust Target Models for Video Object Segmentation [83.3382606349118]
Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time.
Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting.
We propose a novel VOS architecture consisting of two network components.
arXiv Detail & Related papers (2020-02-27T21:58:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.