Related papers: Identifying Systematic Errors in Object Detectors with the SCROD Pipeline

URL: http://arxiv.org/abs/2309.13489v1
Date: Sat, 23 Sep 2023 22:41:08 GMT
Title: Identifying Systematic Errors in Object Detectors with the SCROD Pipeline
Authors: Valentyn Boreiko, Matthias Hein, Jan Hendrik Metzen
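Abstract summary: The identification and removal of systematic errors in object detectors can be a prerequisite for their deployment in safety-critical applications. Real images alone are unlikely to cover all relevant combinations of object pose, appearance, and background; we overcome this limitation by generating synthetic images with fine-grained control. We propose a novel framework that combines the strengths of physical simulators (fine-grained control) and generative models (scalability).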
Score: 46.52729366461028
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The identification and removal of systematic errors in object detectors can be a prerequisite for their deployment in safety-critical applications like automated driving and robotics. Such systematic errors can for instance occur under very specific object poses (location, scale, orientation), object colors/textures, and backgrounds. Real images alone are unlikely to cover all relevant combinations. We overcome this limitation by generating synthetic images with fine-granular control. While generating synthetic images with physical simulators and hand-designed 3D assets allows fine-grained control over generated images, this approach is resource-intensive and has limited scalability. In contrast, using generative models is more scalable but less reliable in terms of fine-grained control. In this paper, we propose a novel framework that combines the strengths of both approaches. Our meticulously designed pipeline along with custom models enables us to generate street scenes with fine-grained control in a fully automated and scalable manner. Moreover, our framework introduces an evaluation setting that can serve as a benchmark for similar pipelines. This evaluation setting will contribute to advancing the field and promoting standardized testing procedures.

Related papers

ARC-Calib: Autonomous Markerless Camera-to-Robot Calibration via Exploratory Robot Motions [15.004750210002152]
ARC-Calib is a model-based markerless camera-to-robot calibration framework. It is fully autonomous and generalizable across diverse robots.
arXiv Detail & Related papers (2025-03-18T20:03:32Z)
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [56.66677293607114]
We propose Code-as-Monitor (CaM) for both open-set reactive and proactive failure detection. To enhance the accuracy and efficiency of monitoring, we introduce constraint elements that abstract constraint-related entities. Experiments show that CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% under severe disturbances.
arXiv Detail & Related papers (2024-12-05T18:58:27Z)
OminiControl: Minimal and Universal Control for Diffusion Transformer [68.3243031301164]
OminiControl is a framework that integrates image conditions into pre-trained Diffusion Transformer (DiT) models. At its core, OminiControl leverages a parameter reuse mechanism, enabling the DiT to encode image conditions using itself as a powerful backbone. OminiControl addresses a wide range of image conditioning tasks in a unified manner, including subject-driven generation and spatially-aligned conditions.
arXiv Detail & Related papers (2024-11-22T17:55:15Z)
Generating Compositional Scenes via Text-to-image RGBA Instance Generation [82.63805151691024]
Text-to-image diffusion generative models can generate high quality images at the cost of tedious prompt engineering. We propose a novel multi-stage generation paradigm that is designed for fine-grained control, flexibility and interactivity. Our experiments show that our RGBA diffusion model is capable of generating diverse and high quality instances with precise control over object attributes.
arXiv Detail & Related papers (2024-11-16T23:44:14Z)
Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense [5.150608040339816]
We introduce PADL, a new solution able to generate image-specific perturbations using a symmetric scheme of encoding and decoding based on cross-attention. Our method generalizes to a range of unseen models with diverse architectural designs, such as StarGANv2, BlendGAN, DiffAE, StableDiffusion and StableDiffusionXL.
arXiv Detail & Related papers (2024-09-26T15:16:32Z)
Identification of Fine-grained Systematic Errors via Controlled Scene Generation [41.398080398462994]
We propose a pipeline for generating realistic synthetic scenes with fine-grained control. Our approach, BEV2EGO, allows for a realistic generation of the complete scene with road-contingent control. In addition, we propose a benchmark for controlled scene generation to select the most appropriate generative outpainting model for BEV2EGO.
arXiv Detail & Related papers (2024-04-10T14:35:22Z)
Training-Free Location-Aware Text-to-Image Synthesis [8.503001932363704]
We analyze the generative mechanism of the stable diffusion model and propose a new interactive generation paradigm. Our method outperforms state-of-the-art methods on both control capacity and image quality.
arXiv Detail & Related papers (2023-04-26T10:25:15Z)
RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation [110.4255414234771]
Existing solutions require massive training data or lack generalizability to unknown rendering configurations. We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem. Our approach achieves significantly lower reconstruction errors and has better generalizability among unknown rendering configurations.
arXiv Detail & Related papers (2022-05-11T17:59:51Z)
Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection [84.52197307286681]
We propose a novel multitask auto-encoding transformation (MAET) model to enhance object detection in dark environments. In a self-supervised manner, MAET learns the intrinsic visual structure by encoding and decoding the realistic illumination-degrading transformation. We achieve state-of-the-art performance on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-05-06T16:27:14Z)
Self-Supervised Object Detection via Generative Image Synthesis [106.65384648377349]
We present the first end-to-end analysis-by-synthesis framework with controllable GANs for the task of self-supervised object detection. We use collections of real-world images without bounding box annotations to learn to synthesize and detect objects. Our work advances the field of self-supervised object detection by introducing a successful new paradigm of using controllable GAN-based image synthesis for it.
arXiv Detail & Related papers (2021-10-19T11:04:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.