Benchmarking the Robustness of Instance Segmentation Models
- URL: http://arxiv.org/abs/2109.01123v1
- Date: Thu, 2 Sep 2021 17:50:07 GMT
- Title: Benchmarking the Robustness of Instance Segmentation Models
- Authors: Said Fahri Altindis, Yusuf Dalva, and Aysegul Dundar
- Abstract summary: This paper presents a comprehensive evaluation of instance segmentation models with respect to real-world image corruptions and out-of-domain image collections.
The out-of-domain image evaluation shows the generalization capability of models, an essential aspect of real-world applications.
Specifically, this benchmark study includes state-of-the-art network architectures, network backbones, normalization layers, and models trained from scratch or from ImageNet-pretrained networks.
- Score: 3.1287804585804073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a comprehensive evaluation of instance
segmentation models with respect to real-world image corruptions and
out-of-domain image collections, i.e. datasets collected with different
set-ups from the training datasets the models learned from. The out-of-domain
evaluation shows the generalization capability of models, an essential aspect
of real-world applications and an extensively studied topic in domain
adaptation. These robustness and generalization evaluations are important when
designing instance segmentation models for real-world applications and when
picking an off-the-shelf pretrained model to use directly for the task at
hand. Specifically, this benchmark study covers state-of-the-art network
architectures, network backbones, normalization layers, models trained from
scratch versus from ImageNet-pretrained networks, and the effect of multi-task
training on robustness and generalization. Through this study, we gain several
insights; for example, we find that normalization layers play an essential
role in robustness, ImageNet pretraining does not help the robustness or
generalization of models (except under JPEG corruption), and network backbones
and copy-paste augmentations affect robustness significantly.
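The kind of corruption benchmark described in the abstract is commonly scored by corrupting each test image at several severity levels and comparing model performance to the clean baseline. The sketch below illustrates that idea with a Gaussian-noise corruption and a relative-robustness metric; the sigma values and function names are hypothetical, not taken from the paper.

```python
import random

# Severity-indexed noise strengths (hypothetical values for illustration;
# common-corruption benchmarks define five severity levels per corruption).
SIGMAS = {1: 0.04, 2: 0.08, 3: 0.12, 4: 0.18, 5: 0.26}

def gaussian_noise(pixels, severity, seed=0):
    """Corrupt a flat list of [0, 1] pixel values with Gaussian noise
    at the given severity, clipping back into the valid range."""
    rng = random.Random(seed)
    sigma = SIGMAS[severity]
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def mean_performance_under_corruption(clean_score, corrupted_scores):
    """Relative robustness: mean score over all corrupted evaluations
    divided by the clean-data score (1.0 means no degradation)."""
    return sum(corrupted_scores) / len(corrupted_scores) / clean_score
```

For instance, a model with a clean AP of 0.4 that scores 0.3, 0.2, and 0.1 at three severities has a relative robustness of 0.5, i.e. it retains half its clean performance on average under corruption.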
Related papers
- In Search of Forgotten Domain Generalization [20.26519807919284]
Out-of-Domain (OOD) generalization is the ability of a model trained on one or more domains to generalize to unseen domains.
In the ImageNet era of computer vision, evaluation sets for measuring a model's OOD performance were designed to be strictly OOD with respect to style.
The emergence of foundation models and expansive web-scale datasets has obfuscated this evaluation process.
arXiv Detail & Related papers (2024-10-10T17:50:45Z)
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
However, transferring the pretrained models to downstream tasks may encounter task discrepancy, since pretraining is formulated as image classification or object discrimination.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
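Multi-task supervised pretraining of the kind MTP describes typically optimizes a combined objective over several task heads. The sketch below shows the common weighted-sum formulation; the abstract does not specify MTP's exact loss weighting, so the weights and names here are illustrative assumptions.

```python
# Generic multi-task training loss: a weighted sum of per-task losses.
# This is a standard formulation, not necessarily MTP's exact scheme.
def multitask_loss(task_losses, weights=None):
    """Combine per-task losses (e.g. semantic segmentation, instance
    segmentation, rotated object detection) into one scalar objective."""
    if weights is None:
        weights = {task: 1.0 for task in task_losses}  # equal weighting
    return sum(weights[task] * loss for task, loss in task_losses.items())
```

With equal weights, losses of 1.0 (segmentation) and 2.0 (detection) combine to 3.0; task-specific weights let pretraining emphasize some tasks over others.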
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
- Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study [61.65123150513683]
Multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results.
It is reported that these models close the robustness gap by matching the performance of supervised models trained on ImageNet.
We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark.
arXiv Detail & Related papers (2024-03-15T17:33:49Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling [8.329870357145927]
Coherence evaluation of machine-generated text is one of the principal applications of coherence models and needs to be investigated.
We explore training data and self-supervision objectives that result in a model that generalizes well across tasks.
We show empirically that increasing the density of negative samples improves the basic model, and using a global negative queue further improves and stabilizes the model while training with hard negative samples.
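A "global negative queue" of the kind mentioned above is usually a fixed-capacity FIFO buffer of negative samples shared across training batches, from which hard negatives can be drawn. The sketch below is a generic illustration of that idea under those assumptions, not the paper's implementation; the class and method names are hypothetical.

```python
import random
from collections import deque

class GlobalNegativeQueue:
    """Fixed-capacity FIFO buffer of negative samples shared across
    batches (a generic sketch, not the paper's exact implementation)."""

    def __init__(self, capacity):
        # deque with maxlen silently evicts the oldest entries when full
        self._items = deque(maxlen=capacity)

    def push(self, negatives):
        """Add a batch of negative samples, evicting the oldest if full."""
        self._items.extend(negatives)

    def sample(self, k, seed=None):
        """Draw up to k negatives uniformly from the current buffer."""
        rng = random.Random(seed)
        pool = list(self._items)
        return rng.sample(pool, min(k, len(pool)))
```

Because eviction is FIFO, the queue always holds the most recently produced negatives, which keeps them roughly in sync with the current model state during training.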
arXiv Detail & Related papers (2021-10-14T07:44:14Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Generative Interventions for Causal Learning [27.371436971655303]
We introduce a framework for learning robust visual representations that generalize to new viewpoints, backgrounds, and scene contexts.
We show that we can steer generative models to manufacture interventions on features caused by confounding factors.
arXiv Detail & Related papers (2020-12-22T16:01:55Z)
- Multi-task pre-training of deep neural networks for digital pathology [8.74883469030132]
We first assemble and transform many digital pathology datasets into a pool of 22 classification tasks and almost 900k images.
We show that our models used as feature extractors either improve significantly over ImageNet pre-trained models or provide comparable performance.
arXiv Detail & Related papers (2020-05-05T08:50:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.