OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution
Shifts of Individual Nuisances in Natural Images
- URL: http://arxiv.org/abs/2304.10266v2
- Date: Wed, 26 Jul 2023 18:01:25 GMT
- Title: OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution
Shifts of Individual Nuisances in Natural Images
- Authors: Bingchen Zhao, Jiahao Wang, Wufei Ma, Artur Jesslen, Siwei Yang,
Shaozuo Yu, Oliver Zendel, Christian Theobalt, Alan Yuille, Adam Kortylewski
- Abstract summary: OOD-CV-v2 is a benchmark dataset that includes out-of-distribution examples of 10 object categories in terms of pose, shape, texture, context and the weather conditions.
In addition to this novel dataset, we contribute extensive experiments using popular baseline methods.
- Score: 59.51657161097337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Enhancing the robustness of vision algorithms in real-world scenarios is
challenging. One reason is that existing robustness benchmarks are limited, as
they either rely on synthetic data or ignore the effects of individual nuisance
factors. We introduce OOD-CV-v2, a benchmark dataset that includes
out-of-distribution examples of 10 object categories in terms of pose, shape,
texture, context and the weather conditions, and enables benchmarking of models
for image classification, object detection, and 3D pose estimation. In addition
to this novel dataset, we contribute extensive experiments using popular
baseline methods, which reveal that: 1) Some nuisance factors have a much
stronger negative effect on the performance compared to others, also depending
on the vision task. 2) Current approaches to enhance robustness have only
marginal effects, and can even reduce robustness. 3) We do not observe
significant differences between convolutional and transformer architectures. We
believe our dataset provides a rich test bed to study robustness and will help
push forward research in this area.
Our dataset can be accessed from https://bzhao.me/OOD-CV/
Related papers
- PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions [57.871692507044344]
Pose estimation aims to accurately identify anatomical keypoints in humans and animals using monocular images.
Current models are typically trained and tested on clean data, potentially overlooking the corruption during real-world deployment.
We introduce PoseBench, a benchmark designed to evaluate the robustness of pose estimation models against real-world corruption.
arXiv Detail & Related papers (2024-06-20T14:40:17Z) - Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving [55.93813178692077]
We present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms.
We assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction.
Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data.
arXiv Detail & Related papers (2024-05-27T17:59:39Z) - When hard negative sampling meets supervised contrastive learning [17.173114048398947]
We introduce a new supervised contrastive learning objective, SCHaNe, which incorporates hard negative sampling during the fine-tuning phase.
SchaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various benchmarks.
Our proposed objective sets a new state-of-the-art for base models on ImageNet-1k, achieving an 86.14% accuracy.
arXiv Detail & Related papers (2023-08-28T20:30:10Z) - Mind the Backbone: Minimizing Backbone Distortion for Robust Object
Detection [52.355018626115346]
Building object detectors that are robust to domain shifts is critical for real-world applications.
We propose to use Relative Gradient Norm as a way to measure the vulnerability of a backbone to feature distortion.
We present recipes to boost OOD robustness for both types of backbones.
arXiv Detail & Related papers (2023-03-26T14:50:43Z) - Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond
Algorithms [31.2529724533643]
This work presents the first comprehensive benchmarking study from three under-explored perspectives beyond algorithms.
An analysis on 31 datasets reveals the distinct impacts of data samples.
We achieve a PA-MPJPE of 47.3 mm on the 3DPW test set with a relatively simple model.
arXiv Detail & Related papers (2022-09-21T17:39:53Z) - ROBIN : A Benchmark for Robustness to Individual Nuisances in Real-World
Out-of-Distribution Shifts [12.825391710803894]
ROBIN is a benchmark dataset for diagnosing the robustness of vision algorithms to individual nuisances in real-world images.
ROBIN builds on 10 rigid categories from the PASCAL VOC 2012 and ImageNet datasets.
We provide results for a number of popular baselines and make several interesting observations.
arXiv Detail & Related papers (2021-11-29T06:18:46Z) - Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z) - Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
But, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z) - Point Transformer for Shape Classification and Retrieval of 3D and ALS
Roof PointClouds [3.3744638598036123]
This paper proposes a fully attentional model - em Point Transformer, for deriving a rich point cloud representation.
The model's shape classification and retrieval performance are evaluated on a large-scale urban dataset - RoofN3D and a standard benchmark dataset ModelNet40.
The proposed method outperforms other state-of-the-art models in the RoofN3D dataset, gives competitive results in the ModelNet40 benchmark, and showcases high robustness to various unseen point corruptions.
arXiv Detail & Related papers (2020-11-08T08:11:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.