Related papers: DMP-3DAD: Cross-Category 3D Anomaly Detection via Realistic Depth Map Projection with Few Normal Samples

URL: http://arxiv.org/abs/2602.10806v1
Date: Wed, 11 Feb 2026 12:47:38 GMT
Title: DMP-3DAD: Cross-Category 3D Anomaly Detection via Realistic Depth Map Projection with Few Normal Samples
Authors: Zi Wang, Katsuya Hotta, Koichiro Kamide, Yawen Zou, Jianjian Qin, Chao Zhang, Jun Yu
Abstract summary: Cross-category anomaly detection for 3D point clouds aims to determine whether an unseen object belongs to a target category. Most existing methods rely on category-specific training, which limits their flexibility in few-shot scenarios. DMP-3DAD is a training-free framework for cross-category 3D anomaly detection based on multi-view realistic depth map projection.
Score: 15.21047221062711
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cross-category anomaly detection for 3D point clouds aims to determine whether an unseen object belongs to a target category using only a few normal examples. Most existing methods rely on category-specific training, which limits their flexibility in few-shot scenarios. In this paper, we propose DMP-3DAD, a training-free framework for cross-category 3D anomaly detection based on multi-view realistic depth map projection. Specifically, by converting point clouds into a fixed set of realistic depth images, our method leverages a frozen CLIP visual encoder to extract multi-view representations and performs anomaly detection via weighted feature similarity, which does not require any fine-tuning or category-dependent adaptation. Extensive experiments on the ShapeNetPart dataset demonstrate that DMP-3DAD achieves state-of-the-art performance under the few-shot setting. The results show that the proposed approach provides a simple yet effective solution for practical cross-category 3D anomaly detection.
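
Illustrative sketch (not the authors' code): the abstract describes scoring by weighted feature similarity over multi-view embeddings, and one plausible reading of that step can be written in a few lines. Everything here is an assumption made for illustration: the function names `l2_normalize` and `anomaly_score`, the use of cosine similarity, the uniform default view weights, and the choice to compare against the closest normal sample. Depth rendering and CLIP encoding are assumed to have already produced one embedding per view.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def anomaly_score(test_feats, ref_feats, view_weights=None):
    """Hypothetical weighted multi-view similarity score.

    test_feats:   (V, D) array, one embedding per rendered view of the test object.
    ref_feats:    (K, V, D) array, embeddings of K normal reference objects.
    view_weights: (V,) non-negative weights; uniform if None (an assumption,
                  the abstract does not say how the weights are chosen).
    Returns a float; higher means more anomalous.
    """
    test = l2_normalize(np.asarray(test_feats, dtype=float))
    refs = l2_normalize(np.asarray(ref_feats, dtype=float))
    num_views = test.shape[0]

    if view_weights is None:
        w = np.full(num_views, 1.0 / num_views)
    else:
        w = np.asarray(view_weights, dtype=float)
        w = w / w.sum()

    # Cosine similarity between the test object and each reference, per view: (K, V).
    sims = np.einsum("kvd,vd->kv", refs, test)
    # Weighted average over views gives one similarity per reference: (K,).
    weighted = sims @ w
    # Compare against the closest normal sample.
    return 1.0 - float(weighted.max())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    refs = rng.normal(size=(3, 4, 8))   # 3 normal samples, 4 views, 8-dim features
    print(anomaly_score(refs[0], refs))   # test object identical to a reference
    print(anomaly_score(-refs[0], refs))  # test object far from that reference
```

Because no parameters are learned, a sketch like this needs only the few normal embeddings at test time, which is consistent with the training-free setting the abstract describes.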

Related papers

CatFree3D: Category-agnostic 3D Object Detection with Diffusion [63.75470913278591]
We introduce a novel pipeline that decouples 3D detection from 2D detection and depth prediction. We also introduce the Normalised Hungarian Distance (NHD) metric for an accurate evaluation of 3D detection results.
arXiv Detail & Related papers (2024-08-22T22:05:57Z)
CLIP3D-AD: Extending CLIP for 3D Few-Shot Anomaly Detection with Multi-View Images Generation [22.850815902535988]
We propose CLIP3D-AD, an efficient 3D-FSAD method that extends CLIP. We synthesize anomalous images from given normal images as sample pairs to adapt CLIP for 3D anomaly classification and segmentation. Our method achieves competitive performance in 3D few-shot anomaly classification and segmentation on the MVTec-3D AD dataset.
arXiv Detail & Related papers (2024-06-27T07:13:09Z)
SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection [18.796625355398252]
State-of-the-art algorithms are able to detect defects in increasingly difficult settings and data modalities. We propose the novel 3D Gaussian splatting-based framework SplatPose which accurately estimates the pose of unseen views in a differentiable manner. We achieve state-of-the-art results in both training and inference speed, and detection performance, even when using less training data than competing methods.
arXiv Detail & Related papers (2024-04-10T08:48:09Z)
Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version. We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection [21.96072831561483]
This paper proposes a novel "Supervised Shape&Scale-perceptive Deformable Attention" (S$^3$-DA) module for monocular 3D object detection. Benefiting from this, S$^3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features. Experiments on KITTI and Open datasets demonstrate that S$^3$-DA significantly improves the detection accuracy.
arXiv Detail & Related papers (2023-09-02T12:36:38Z)
MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in Adverse Scenes [49.21187418886508]
This paper proposes a monocular 3D detection model designed to perceive twin depth in adverse scenes, termed MonoTDP. We first introduce an adaptive learning strategy to aid the model in handling uncontrollable weather conditions, significantly resisting degradation caused by various degrading factors. Then, to address the depth/content loss in adverse regions, we propose a novel twin depth perception module that simultaneously estimates scene and object depth.
arXiv Detail & Related papers (2023-05-18T13:42:02Z)
Scatter Points in Space: 3D Detection from Multi-view Monocular Images [8.71944437852952]
3D object detection from monocular image(s) is a challenging and long-standing problem in computer vision. Recent methods tend to aggregate multi-view features by densely sampling a regular 3D grid in space. We propose a learnable keypoint sampling method, which scatters pseudo surface points in 3D space in order to preserve data sparsity.
arXiv Detail & Related papers (2022-08-31T09:38:05Z)
Improving 3D Object Detection with Channel-wise Transformer [58.668922561622466]
We propose a two-stage 3D object detection framework (CT3D) with minimal hand-crafted design. CT3D simultaneously performs proposal-aware embedding and channel-wise context aggregation. It achieves an AP of 81.77% in the moderate car category on the KITTI test 3D detection benchmark.
arXiv Detail & Related papers (2021-08-23T02:03:40Z)
Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image [37.83574424518901]
3D object detection from a single image is an important task in Autonomous Driving. We propose an instance-aware approach to aggregate useful information for improving the accuracy of 3D object detection.
arXiv Detail & Related papers (2021-03-05T05:47:52Z)
DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data. The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes. We find that our proposed method achieves state-of-the-art results by 5% on object detection in ScanNet scenes, and it gets top results by 3.4% in the Open dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.