Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions?
- URL: http://arxiv.org/abs/2505.04835v1
- Date: Wed, 07 May 2025 22:19:55 GMT
- Title: Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions?
- Authors: Shashank Agnihotri, David Schader, Nico Sharei, Mehmet Ege Kaçar, Margret Keuper
- Abstract summary: We conduct the largest benchmarking study on semantic segmentation models. We compare performance on real-world corruptions and synthetic corruptions datasets. We analyze corruption-specific correlations, providing key insights to understand when synthetic corruptions succeed in representing real-world corruptions.
- Score: 11.35081321966394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning (DL) models are widely used in real-world applications but remain vulnerable to distribution shifts, especially due to weather and lighting changes. Collecting diverse real-world data for testing the robustness of DL models is resource-intensive, making synthetic corruptions an attractive alternative for robustness testing. However, are synthetic corruptions a reliable proxy for real-world corruptions? To answer this, we conduct the largest benchmarking study on semantic segmentation models, comparing performance on real-world corruptions and synthetic corruptions datasets. Our results reveal a strong correlation in mean performance, supporting the use of synthetic corruptions for robustness evaluation. We further analyze corruption-specific correlations, providing key insights to understand when synthetic corruptions succeed in representing real-world corruptions. Open-source Code: https://github.com/shashankskagnihotri/benchmarking_robustness/tree/segmentation_david/semantic_segmentation
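The abstract's central claim is a correlation between model performance on synthetic and on real-world corruptions. As a minimal sketch of how such a correlation could be measured, the snippet below computes a Pearson coefficient over per-model scores. The `pearson` helper, the variable names, and the mIoU numbers are illustrative assumptions: the numbers are placeholders, not results from the paper, and the paper's own evaluation code may use a different statistic or procedure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Placeholder mIoU values (percent) for four hypothetical models.
# These are made-up numbers for illustration, NOT results from the paper.
synthetic_miou = [40.0, 45.0, 50.0, 55.0]
real_miou = [30.0, 36.0, 39.0, 47.0]

r = pearson(synthetic_miou, real_miou)
print(f"Pearson r = {r:.3f}")
```

Each list position corresponds to one model, so a value of `r` close to 1 would mean that models ranking well under synthetic corruptions also tend to rank well under real ones.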
Related papers
- DispBench: Benchmarking Disparity Estimation to Synthetic Corruptions [11.35081321966394]
Deep learning (DL) has surpassed human performance on standard benchmarks, driving its widespread adoption in computer vision tasks. DispBench is a comprehensive benchmarking tool for systematically assessing the reliability of disparity estimation methods. We conduct the most extensive performance and robustness analysis of disparity estimation methods to date, uncovering key correlations between accuracy, reliability, and generalization.
arXiv Detail & Related papers (2025-05-08T09:40:17Z)
- Benchmarking the Spatial Robustness of DNNs via Natural and Adversarial Localized Corruptions [49.546479320670464]
This paper introduces specialized metrics for benchmarking the spatial robustness of segmentation models. We propose region-aware multi-attack adversarial analysis, a method that enables a deeper understanding of model robustness. The results reveal that models respond to these two types of threats differently.
arXiv Detail & Related papers (2025-04-02T11:37:39Z)
- Reliability in Semantic Segmentation: Can We Use Synthetic Data? [69.28268603137546]
We show for the first time how synthetic data can be specifically generated to assess comprehensively the real-world reliability of semantic segmentation models.
This synthetic data is employed to evaluate the robustness of pretrained segmenters.
We demonstrate how our approach can be utilized to enhance the calibration and OOD detection capabilities of segmenters.
arXiv Detail & Related papers (2023-12-14T18:56:07Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Frequency-Based Vulnerability Analysis of Deep Learning Models against Image Corruptions [48.34142457385199]
We present MUFIA, an algorithm designed to identify the specific types of corruptions that can cause models to fail.
We find that even state-of-the-art models trained to be robust against known common corruptions struggle against the low visibility-based corruptions crafted by MUFIA.
arXiv Detail & Related papers (2023-06-12T15:19:13Z)
- Investigating the Corruption Robustness of Image Classifiers with Random Lp-norm Corruptions [3.1337872355726084]
This study investigates the use of random p-norm corruptions to augment the training and test data of image classifiers.
We find that training data augmentation with a combination of p-norm corruptions significantly improves corruption robustness, even on top of state-of-the-art data augmentation schemes.
arXiv Detail & Related papers (2023-05-09T12:45:43Z)
- Robo3D: Towards Robust and Reliable 3D Perception against Corruptions [58.306694836881235]
We present Robo3D, the first comprehensive benchmark heading toward probing the robustness of 3D detectors and segmentors under out-of-distribution scenarios.
We consider eight corruption types stemming from severe weather conditions, external disturbances, and internal sensor failure.
We propose a density-insensitive training framework along with a simple flexible voxelization strategy to enhance the model resiliency.
arXiv Detail & Related papers (2023-03-30T17:59:17Z)
- Using Synthetic Corruptions to Measure Robustness to Natural Distribution Shifts [6.445605125467574]
We propose a methodology to build synthetic corruption benchmarks that make robustness estimations more correlated with robustness to real-world distribution shifts.
Applying the proposed methodology, we build a new benchmark called ImageNet-Syn2Nat to predict image classifier robustness.
arXiv Detail & Related papers (2021-07-26T09:20:49Z)
- Using the Overlapping Score to Improve Corruption Benchmarks [6.445605125467574]
We propose a metric called corruption overlapping score, which can be used to reveal flaws in corruption benchmarks.
We argue that taking into account overlappings between corruptions can help to improve existing benchmarks or build better ones.
arXiv Detail & Related papers (2021-05-26T06:42:54Z)
- On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness [78.6626755563546]
Several new data augmentations have been proposed that significantly improve performance on ImageNet-C.
We develop a new measure in this space between augmentations and corruptions called the Minimal Sample Distance to demonstrate there is a strong correlation between similarity and performance.
We observe a significant degradation in corruption robustness when the test-time corruptions are sampled to be perceptually dissimilar from ImageNet-C.
Our results suggest that test error can be improved by training on perceptually similar augmentations, and data augmentations may not generalize well beyond the existing benchmark.
arXiv Detail & Related papers (2021-02-22T18:58:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.