Simplified Learning of CAD Features Leveraging a Deep Residual
Autoencoder
- URL: http://arxiv.org/abs/2202.10099v1
- Date: Mon, 21 Feb 2022 10:27:55 GMT
- Title: Simplified Learning of CAD Features Leveraging a Deep Residual
Autoencoder
- Authors: Raoul Schönhof and Jannes Elstner and Radu Manea and Steffen Tauber
and Ramez Awad and Marco F. Huber
- Abstract summary: In computer vision, deep residual neural networks like EfficientNet have set new standards in terms of robustness and accuracy.
One key problem underlying the training of deep neural networks is the inherent lack of sufficient training data.
We present a deep residual 3D autoencoder based on the EfficientNet architecture, intended for transfer learning tasks related to 3D CAD model assessment.
- Score: 3.567248644184455
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the domain of computer vision, deep residual neural networks like
EfficientNet have set new standards in terms of robustness and accuracy. One
key problem underlying the training of deep neural networks is the inherent
lack of sufficient training data. The problem is exacerbated when labels
cannot be generated automatically but have to be annotated manually.
This challenge occurs for instance if expert knowledge related to 3D parts
should be externalized based on example models. One way to reduce the necessary
amount of labeled data may be the use of autoencoders, which can be learned in
an unsupervised fashion without labeled data. In this work, we present a deep
residual 3D autoencoder based on the EfficientNet architecture, intended for
transfer learning tasks related to 3D CAD model assessment. For this purpose,
we adapted EfficientNet to 3D problems such as voxel models derived from a STEP
file. Striving to reduce the amount of labeled 3D data required, the network's
encoder can be utilized for transfer training.
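The transfer-learning recipe described above can be sketched in miniature. The following is a toy NumPy illustration, not the paper's residual 3D EfficientNet: a linear autoencoder trained on flattened toy voxel grids, showing that the reconstruction objective needs no labels and that the trained encoder can afterwards be frozen and reused as a feature extractor (the data, dimensions, and learning rate are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for voxel grids derived from STEP files:
# 16 parts, each a flattened 4x4x4 binary occupancy grid.
X = (rng.random((16, 64)) < 0.3).astype(float)

d_in, d_code = 64, 8                        # 8-dimensional bottleneck
W_enc = rng.normal(0.0, 0.1, (d_in, d_code))
W_dec = rng.normal(0.0, 0.1, (d_code, d_in))

def mse(A, B):
    return float(np.mean((A - B) ** 2))

mse0 = mse(X @ W_enc @ W_dec, X)            # reconstruction error before training

lr = 0.05
for _ in range(300):                        # unsupervised: the inputs are the targets
    Z = X @ W_enc                           # encode
    X_hat = Z @ W_dec                       # decode
    err = X_hat - X                         # reconstruction error
    g_dec = Z.T @ err / len(X)              # gradient of squared error w.r.t. W_dec
    g_enc = X.T @ (err @ W_dec.T) / len(X)  # gradient w.r.t. W_enc
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

mse1 = mse(X @ W_enc @ W_dec, X)            # error drops without a single label

# Transfer step: freeze the trained encoder and hand its compact codes
# to a small supervised head, which now needs far fewer labeled parts.
features = X @ W_enc
```

In the paper's setting the encoder is a deep residual 3D convolutional network rather than a single matrix, but the division of labor is the same: unlabeled CAD models pretrain the encoder, and only the downstream assessment head consumes labels.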
Related papers
- ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding [51.509115746992165]
We introduce ARKit LabelMaker, the first large-scale, real-world 3D dataset with dense semantic annotations.
We also push forward the state-of-the-art performance on ScanNet and ScanNet200 dataset with prevalent 3D semantic segmentation models.
arXiv Detail & Related papers (2024-10-17T14:44:35Z)
- P3P: Pseudo-3D Pre-training for Scaling 3D Masked Autoencoders [32.85484320025852]
We propose a novel self-supervised pre-training framework utilizing the real 3D data and the pseudo-3D data lifted from images by a large depth estimation model.
Our method achieves state-of-the-art performance in 3D classification and few-shot learning while maintaining high pre-training and downstream fine-tuning efficiency.
arXiv Detail & Related papers (2024-08-19T13:59:53Z)
- Let Me DeCode You: Decoder Conditioning with Tabular Data [0.15487122608774898]
We introduce a novel approach, DeCode, that utilizes label-derived features for model conditioning to support the decoder in the reconstruction process dynamically.
DeCode focuses on improving 3D segmentation performance through the incorporation of conditioning embedding with learned numerical representation of 3D-label shape features.
Our results show that DeCode significantly outperforms traditional, unconditioned models in terms of generalization to unseen data, achieving higher accuracy at a reduced computational cost.
arXiv Detail & Related papers (2024-07-12T17:14:33Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration [69.21282992341007]
AutoSynth automatically generates 3D training data for point cloud registration.
We replace the point cloud registration network with a much smaller surrogate network, leading to a $4056.43\times$ speedup.
Our results on TUD-L, LINEMOD and Occluded-LINEMOD evidence that a neural network trained on our searched dataset yields consistently better performance than the same one trained on the widely used ModelNet40 dataset.
arXiv Detail & Related papers (2023-09-20T09:29:44Z)
- Self-Supervised Learning with Multi-View Rendering for 3D Point Cloud Analysis [33.31864436614945]
We propose a novel pre-training method for 3D point cloud models.
Our pre-training is self-supervised by a local pixel/point level correspondence loss and a global image/point cloud level loss.
These improved models outperform existing state-of-the-art methods on various datasets and downstream tasks.
arXiv Detail & Related papers (2022-10-28T05:23:03Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
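The idea of a field "parameterized by a fully-connected deep neural network" can be made concrete with a toy NumPy sketch. This is not the NAF architecture: the weights are random and untrained, and fixed sinusoidal features stand in for the learned hash encoding; it only shows that such a field is simply a small MLP mapping each 3D coordinate to a positive attenuation value:

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(p, n_freqs=4):
    """Sinusoidal features of 3D points; a stand-in for the hash
    encoding that helps the network capture high-frequency detail."""
    freqs = 2.0 ** np.arange(n_freqs) * np.pi
    ang = p[:, :, None] * freqs                       # (N, 3, n_freqs)
    feats = np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)
    return feats.reshape(len(p), -1)                  # (N, 3 * 2 * n_freqs)

# Tiny fully-connected network with random (untrained) weights:
# the field mu(x, y, z) is a continuous function of the coordinates.
W1 = rng.normal(0.0, 0.5, (24, 32))
W2 = rng.normal(0.0, 0.5, (32, 1))

def attenuation(points):
    h = np.maximum(encode(points) @ W1, 0.0)          # ReLU hidden layer
    return np.log1p(np.exp(h @ W2))                   # softplus: mu > 0

mu = attenuation(rng.random((5, 3)))                  # one coefficient per query point
```

In the actual method these weights are fit self-supervised by comparing synthesized projections against the sparse CBCT views; no ground-truth volume is needed.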
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- A Novel Neural Network Training Method for Autonomous Driving Using Semi-Pseudo-Labels and 3D Data Augmentations [0.0]
Training neural networks to perform 3D object detection for autonomous driving requires a large amount of diverse annotated data.
We have designed a convolutional neural network for 3D object detection which can significantly increase the detection range.
arXiv Detail & Related papers (2022-07-20T13:04:08Z)
- AttDLNet: Attention-based DL Network for 3D LiDAR Place Recognition [0.6352264764099531]
This paper proposes a novel 3D LiDAR-based deep learning network named AttDLNet.
It exploits an attention mechanism to selectively focus on long-range context and interfeature relationships.
Results show that the encoder network features are already very descriptive, but adding attention to the network further improves performance.
arXiv Detail & Related papers (2021-06-17T16:34:37Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.