3DB: A Framework for Debugging Computer Vision Models
- URL: http://arxiv.org/abs/2106.03805v1
- Date: Mon, 7 Jun 2021 17:16:12 GMT
- Title: 3DB: A Framework for Debugging Computer Vision Models
- Authors: Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan
Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg
Yang, Ashish Kapoor, Aleksander Madry
- Abstract summary: 3DB allows users to discover vulnerabilities in computer vision systems.
3DB captures and generalizes many analyses from prior work.
We find that the insights generated by the system transfer to the physical world.
- Score: 105.45042148499323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce 3DB: an extendable, unified framework for testing and debugging
vision models using photorealistic simulation. We demonstrate, through a wide
range of use cases, that 3DB allows users to discover vulnerabilities in
computer vision systems and gain insights into how models make decisions. 3DB
captures and generalizes many robustness analyses from prior work, and enables
one to study their interplay. Finally, we find that the insights generated by
the system transfer to the physical world.
We are releasing 3DB as a library (https://github.com/3db/3db) alongside a
set of example analyses, guides, and documentation: https://3db.github.io/3db/ .
Related papers
- Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds [45.87961177297602]
This work aims to integrate recent methods into a comprehensive framework for robotic interaction and manipulation in human-centric environments.
Specifically, we leverage 3D reconstructions from a commodity 3D scanner for open-vocabulary instance segmentation.
We show the performance and robustness of our model in two sets of real-world experiments including dynamic object retrieval and drawer opening.
arXiv Detail & Related papers (2024-04-18T18:01:15Z)
- Probing the 3D Awareness of Visual Foundation Models [56.68380136809413]
We analyze the 3D awareness of visual foundation models.
We conduct experiments using task-specific probes and zero-shot inference procedures on frozen features.
arXiv Detail & Related papers (2024-04-12T17:58:04Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model [59.04877271899894]
This paper explores adapting the zero-shot ability of SAM to 3D object detection.
We propose a SAM-powered BEV processing pipeline to detect objects, achieving promising results on a large-scale open dataset.
arXiv Detail & Related papers (2023-06-04T03:09:21Z)
- 3D-LatentMapper: View Agnostic Single-View Reconstruction of 3D Shapes [0.0]
We propose a novel framework that leverages the intermediate latent spaces of Vision Transformer (ViT) and a joint image-text representational model, CLIP, for fast and efficient Single View Reconstruction (SVR).
We use the ShapeNetV2 dataset and perform extensive experiments with comparisons to SOTA methods to demonstrate our method's effectiveness.
arXiv Detail & Related papers (2022-12-05T11:45:26Z)
- Survey and Systematization of 3D Object Detection Models and Methods [3.472931603805115]
We provide a comprehensive survey of recent developments from 2012-2021 in 3D object detection.
We introduce fundamental concepts and cover a broad range of different approaches that have emerged over the past decade.
We propose a systematization that provides a practical framework for comparing these approaches with the goal of guiding future development, evaluation and application activities.
arXiv Detail & Related papers (2022-01-23T20:06:07Z)
- BANMo: Building Animatable 3D Neural Models from Many Casual Videos [135.64291166057373]
We present BANMo, a method that requires neither a specialized sensor nor a pre-defined template shape.
BANMo builds high-fidelity, articulated 3D models from many monocular casual videos in a differentiable rendering framework.
On real and synthetic datasets, BANMo shows higher-fidelity 3D reconstructions than prior works for humans and animals.
arXiv Detail & Related papers (2021-12-23T18:30:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.