MC3D-AD: A Unified Geometry-aware Reconstruction Model for Multi-category 3D Anomaly Detection
- URL: http://arxiv.org/abs/2505.01969v2
- Date: Thu, 24 Jul 2025 16:03:56 GMT
- Title: MC3D-AD: A Unified Geometry-aware Reconstruction Model for Multi-category 3D Anomaly Detection
- Authors: Jiayi Cheng, Can Gao, Jie Zhou, Jiajun Wen, Tao Dai, Jinbao Wang,
- Abstract summary: This paper presents a novel unified model for Multi-Category 3D Anomaly Detection (MC3D-AD)<n>It aims to utilize both local and global geometry-aware information to reconstruct normal representations of all categories.<n>MC3D-AD is evaluated on two publicly available Real3D-AD and Anomaly-ShapeNet datasets.
- Score: 33.46875410103838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Anomaly Detection (AD) is a promising means of controlling the quality of manufactured products. However, existing methods typically require carefully training a task-specific model for each category independently, leading to high cost, low efficiency, and weak generalization. Therefore, this paper presents a novel unified model for Multi-Category 3D Anomaly Detection (MC3D-AD) that aims to utilize both local and global geometry-aware information to reconstruct normal representations of all categories. First, to learn robust and generalized features of different categories, we propose an adaptive geometry-aware masked attention module that extracts geometry variation information to guide mask attention. Then, we introduce a local geometry-aware encoder reinforced by the improved mask attention to encode group-level feature tokens. Finally, we design a global query decoder that utilizes point cloud position embeddings to improve the decoding process and reconstruction ability. This leads to local and global geometry-aware reconstructed feature tokens for the AD task. MC3D-AD is evaluated on two publicly available Real3D-AD and Anomaly-ShapeNet datasets, and exhibits significant superiority over current state-of-the-art single-category methods, achieving 3.1\% and 9.3\% improvement in object-level AUROC over Real3D-AD and Anomaly-ShapeNet, respectively. The code is available at https://github.com/iCAN-SZU/MC3D-AD.
Related papers
- C3D-AD: Toward Continual 3D Anomaly Detection via Kernel Attention with Learnable Advisor [9.917394249928092]
3D Anomaly Detection (AD) has shown great potential in detecting anomalies or defects of high-precision industrial products.<n>Existing methods are typically trained in a class-specific manner and also lack the capability of learning from emerging classes.<n>We propose Continual 3D Anomaly Detection (C3D-AD), which can not only learn generalized representations for multi-class point clouds but also handle new classes emerging over time.
arXiv Detail & Related papers (2025-08-02T10:54:55Z) - TrackAny3D: Transferring Pretrained 3D Models for Category-unified 3D Point Cloud Tracking [25.788917457593673]
TrackAny3D is the first framework to transfer large-scale pretrained 3D models for category-agnostic 3D SOT.<n>MoGE architecture adaptively activates specialized 3works based on distinct geometric characteristics.<n>Experiments show that TrackAny3D establishes new state-of-the-art performance on category-agnostic 3D SOT.
arXiv Detail & Related papers (2025-07-26T10:41:55Z) - Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z) - Segment Any 3D Object with Language [58.471327490684295]
We introduce Segment any 3D Object with LanguagE (SOLE), a semantic geometric and-aware visual-language learning framework with strong generalizability.
Specifically, we propose a multimodal fusion network to incorporate multimodal semantics in both backbone and decoder.
Our SOLE outperforms previous methods by a large margin on ScanNetv2, ScanNet200, and Replica benchmarks.
arXiv Detail & Related papers (2024-04-02T17:59:10Z) - SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z) - S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection [21.96072831561483]
This paper proposes a novel Supervised Shape&Scale-perceptive Deformable Attention'' (S$3$-DA) module for monocular 3D object detection.
Benefiting from this, S$3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features.
Experiments on KITTI and Open datasets demonstrate that S$3$-DA significantly improves the detection accuracy.
arXiv Detail & Related papers (2023-09-02T12:36:38Z) - Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors.
First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning.
Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals.
arXiv Detail & Related papers (2023-01-06T18:52:12Z) - CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z) - 3D Unsupervised Region-Aware Registration Transformer [13.137287695912633]
Learning robust point cloud registration models with deep neural networks has emerged as a powerful paradigm.
We propose a new design of 3D region partition module that is able to divide the input shape to different regions with a self-supervised 3D shape reconstruction loss.
Our experiments show that our 3D-URRT achieves superior registration performance over various benchmark datasets.
arXiv Detail & Related papers (2021-10-07T15:06:52Z) - I3DOL: Incremental 3D Object Learning without Catastrophic Forgetting [38.7610646073842]
I3DOL is first exploration to learn new classes of 3D object continually.
An adaptive-geometric centroid module is designed to construct discriminative local geometric structures.
A geometric-aware attention mechanism is developed to quantify the contributions of local geometric structures.
arXiv Detail & Related papers (2020-12-16T15:17:51Z) - Fine-Grained 3D Shape Classification with Hierarchical Part-View
Attentions [70.0171362989609]
We propose a novel fine-grained 3D shape classification method named FG3D-Net to capture the fine-grained local details of 3D shapes from multiple rendered views.
Our results under the fine-grained 3D shape dataset show that our method outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-05-26T06:53:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.