Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth
Simulation
- URL: http://arxiv.org/abs/2311.01117v1
- Date: Thu, 2 Nov 2023 09:44:21 GMT
- Title: Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth
Simulation
- Authors: Vitjan Zavrtanik, Matej Kristan, Danijel Skočaj
- Abstract summary: RGB-based surface anomaly detection methods have advanced significantly.
Certain surface anomalies remain practically invisible in RGB alone, necessitating the incorporation of 3D information.
Re-training RGB backbones on industrial depth datasets is hindered by the limited availability of sufficiently large datasets.
We propose a new surface anomaly detection method, 3DSR, which outperforms all existing state-of-the-art methods on the challenging MVTec3D anomaly detection benchmark.
- Score: 12.843938169660404
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: RGB-based surface anomaly detection methods have advanced significantly.
However, certain surface anomalies remain practically invisible in RGB alone,
necessitating the incorporation of 3D information. Existing approaches that
employ point-cloud backbones suffer from suboptimal representations and reduced
applicability due to slow processing. Re-training RGB backbones, designed for
faster dense input processing, on industrial depth datasets is hindered by the
limited availability of sufficiently large datasets. We make several
contributions to address these challenges. (i) We propose a novel Depth-Aware
Discrete Autoencoder (DADA) architecture that enables learning a general
discrete latent space that jointly models RGB and 3D data for 3D surface
anomaly detection. (ii) We tackle the lack of diverse industrial depth datasets
by introducing a simulation process for learning informative depth features in
the depth encoder. (iii) We propose a new surface anomaly detection method,
3DSR, which outperforms all existing state-of-the-art methods on the challenging
MVTec3D anomaly detection benchmark, both in terms of accuracy and processing
speed. The experimental results validate the effectiveness and efficiency of
our approach, highlighting the potential of utilizing depth information for
improved surface anomaly detection.
Related papers
- 3D Harmonic Loss: Towards Task-consistent and Time-friendly 3D Object
Detection on Edge for Intelligent Transportation System [28.55894241049706]
We propose a 3D harmonic loss function to alleviate inconsistent point-cloud-based predictions.
Our proposed method considerably improves performance over benchmark models.
Our code is open-source and publicly available.
arXiv Detail & Related papers (2022-11-07T10:11:48Z)
- Depth Estimation Matters Most: Improving Per-Object Depth Estimation for
Monocular 3D Detection and Tracking [47.59619420444781]
Approaches to monocular 3D perception including detection and tracking often yield inferior performance when compared to LiDAR-based techniques.
We propose a multi-level fusion method that combines different representations (RGB and pseudo-LiDAR) and temporal information across multiple frames for objects (tracklets) to enhance per-object depth estimation.
arXiv Detail & Related papers (2022-06-08T03:37:59Z)
- HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model [3.5290359800552946]
HiMODE is a novel monocular omnidirectional depth estimation model based on a CNN+Transformer architecture.
We show that HiMODE can achieve state-of-the-art performance for 360° monocular depth estimation.
arXiv Detail & Related papers (2022-04-11T11:11:43Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour
Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Consistent Depth Prediction under Various Illuminations using Dilated
Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
We evaluate our method against current state-of-the-art methods on the Vari dataset and observe a significant improvement.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Sparse Depth Completion with Semantic Mesh Deformation Optimization [4.03103540543081]
We propose a neural network with post-optimization, which takes an RGB image and sparse depth samples as input and predicts the complete depth map.
Our method consistently outperforms existing work on both indoor and outdoor datasets.
arXiv Detail & Related papers (2021-12-10T13:01:06Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust
Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth is estimated and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- 3D Dense Geometry-Guided Facial Expression Synthesis by Adversarial
Learning [54.24887282693925]
We propose a novel framework to exploit 3D dense (depth and surface normals) information for expression manipulation.
We use an off-the-shelf state-of-the-art 3D reconstruction model to estimate the depth and create a large-scale RGB-Depth dataset.
Our experiments demonstrate that the proposed method outperforms the competitive baseline and prior art by a large margin.
arXiv Detail & Related papers (2020-09-30T17:12:35Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)