Simplified Learning of CAD Features Leveraging a Deep Residual
Autoencoder
- URL: http://arxiv.org/abs/2202.10099v1
- Date: Mon, 21 Feb 2022 10:27:55 GMT
- Title: Simplified Learning of CAD Features Leveraging a Deep Residual
Autoencoder
- Authors: Raoul Schönhof and Jannes Elstner and Radu Manea and Steffen Tauber
and Ramez Awad and Marco F. Huber
- Abstract summary: In computer vision, deep residual neural networks like EfficientNet have set new standards in terms of robustness and accuracy.
One key problem underlying the training of deep neural networks is the inherent lack of sufficient training data.
We present a deep residual 3D autoencoder based on the EfficientNet architecture, intended for transfer learning tasks related to 3D CAD model assessment.
- Score: 3.567248644184455
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the domain of computer vision, deep residual neural networks like
EfficientNet have set new standards in terms of robustness and accuracy. One
key problem underlying the training of deep neural networks is the inherent
lack of sufficient training data. The problem is exacerbated when labels
cannot be generated automatically but have to be annotated manually.
This challenge occurs for instance if expert knowledge related to 3D parts
should be externalized based on example models. One way to reduce the necessary
amount of labeled data may be the use of autoencoders, which can be learned in
an unsupervised fashion without labeled data. In this work, we present a deep
residual 3D autoencoder based on the EfficientNet architecture, intended for
transfer learning tasks related to 3D CAD model assessment. For this purpose,
we adapted EfficientNet to 3D problems such as voxel models derived from a STEP
file. Striving to reduce the amount of labeled 3D data required, the network's
encoder can be utilized for transfer training.
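The transfer-learning recipe described above can be sketched in miniature. The following is a toy NumPy illustration, not the paper's residual 3D EfficientNet: a linear autoencoder trained on flattened toy voxel grids, showing that the reconstruction objective needs no labels and that the trained encoder can afterwards be frozen and reused as a feature extractor (the data, dimensions, and learning rate are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for voxel grids derived from STEP files:
# 16 parts, each a flattened 4x4x4 binary occupancy grid.
X = (rng.random((16, 64)) < 0.3).astype(float)

d_in, d_code = 64, 8                        # 8-dimensional bottleneck
W_enc = rng.normal(0.0, 0.1, (d_in, d_code))
W_dec = rng.normal(0.0, 0.1, (d_code, d_in))

def mse(A, B):
    return float(np.mean((A - B) ** 2))

mse0 = mse(X @ W_enc @ W_dec, X)            # reconstruction error before training

lr = 0.05
for _ in range(300):                        # unsupervised: the inputs are the targets
    Z = X @ W_enc                           # encode
    X_hat = Z @ W_dec                       # decode
    err = X_hat - X                         # reconstruction error
    g_dec = Z.T @ err / len(X)              # gradient of squared error w.r.t. W_dec
    g_enc = X.T @ (err @ W_dec.T) / len(X)  # gradient w.r.t. W_enc
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

mse1 = mse(X @ W_enc @ W_dec, X)            # error drops without a single label

# Transfer step: freeze the trained encoder and hand its compact codes
# to a small supervised head, which now needs far fewer labeled parts.
features = X @ W_enc
```

In the paper's setting the encoder is a deep residual 3D convolutional network rather than a single matrix, but the division of labor is the same: unlabeled CAD models pretrain the encoder, and only the downstream assessment head consumes labels.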
Related papers
- ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding [51.509115746992165]
We introduce ARKit LabelMaker, the first large-scale, real-world 3D dataset with dense semantic annotations.
We also push forward the state-of-the-art performance on ScanNet and ScanNet200 dataset with prevalent 3D semantic segmentation models.
arXiv Detail & Related papers (2024-10-17T14:44:35Z)
- P3P: Pseudo-3D Pre-training for Scaling 3D Masked Autoencoders [32.85484320025852]
We propose a novel self-supervised pre-training framework utilizing the real 3D data and the pseudo-3D data lifted from images by a large depth estimation model.
Our method achieves state-of-the-art performance in 3D classification and few-shot learning while maintaining high pre-training and downstream fine-tuning efficiency.
arXiv Detail & Related papers (2024-08-19T13:59:53Z)
- Let Me DeCode You: Decoder Conditioning with Tabular Data [0.15487122608774898]
We introduce a novel approach, DeCode, that utilizes label-derived features for model conditioning to support the decoder in the reconstruction process dynamically.
DeCode focuses on improving 3D segmentation performance through the incorporation of conditioning embedding with learned numerical representation of 3D-label shape features.
Our results show that DeCode significantly outperforms traditional, unconditioned models in terms of generalization to unseen data, achieving higher accuracy at a reduced computational cost.
arXiv Detail & Related papers (2024-07-12T17:14:33Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration [69.21282992341007]
AutoSynth automatically generates 3D training data for point cloud registration.
We replace the point cloud registration network with a much smaller surrogate network, leading to a $4056.43\times$ speedup.
Our results on TUD-L, LINEMOD and Occluded-LINEMOD evidence that a neural network trained on our searched dataset yields consistently better performance than the same one trained on the widely used ModelNet40 dataset.
arXiv Detail & Related papers (2023-09-20T09:29:44Z)
- Self-Supervised Learning with Multi-View Rendering for 3D Point Cloud Analysis [33.31864436614945]
We propose a novel pre-training method for 3D point cloud models.
Our pre-training is self-supervised by a local pixel/point level correspondence loss and a global image/point cloud level loss.
These improved models outperform existing state-of-the-art methods on various datasets and downstream tasks.
arXiv Detail & Related papers (2022-10-28T05:23:03Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
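The idea of a field "parameterized by a fully-connected deep neural network" can be made concrete with a toy NumPy sketch. This is not the NAF architecture: the weights are random and untrained, and fixed sinusoidal features stand in for the learned hash encoding; it only shows that such a field is simply a small MLP mapping each 3D coordinate to a positive attenuation value:

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(p, n_freqs=4):
    """Sinusoidal features of 3D points; a stand-in for the hash
    encoding that helps the network capture high-frequency detail."""
    freqs = 2.0 ** np.arange(n_freqs) * np.pi
    ang = p[:, :, None] * freqs                       # (N, 3, n_freqs)
    feats = np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)
    return feats.reshape(len(p), -1)                  # (N, 3 * 2 * n_freqs)

# Tiny fully-connected network with random (untrained) weights:
# the field mu(x, y, z) is a continuous function of the coordinates.
W1 = rng.normal(0.0, 0.5, (24, 32))
W2 = rng.normal(0.0, 0.5, (32, 1))

def attenuation(points):
    h = np.maximum(encode(points) @ W1, 0.0)          # ReLU hidden layer
    return np.log1p(np.exp(h @ W2))                   # softplus: mu > 0

mu = attenuation(rng.random((5, 3)))                  # one coefficient per query point
```

In the actual method these weights are fit self-supervised by comparing synthesized projections against the sparse CBCT views; no ground-truth volume is needed.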
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- A Novel Neural Network Training Method for Autonomous Driving Using Semi-Pseudo-Labels and 3D Data Augmentations [0.0]
Training neural networks to perform 3D object detection for autonomous driving requires a large amount of diverse annotated data.
We have designed a convolutional neural network for 3D object detection which can significantly increase the detection range.
arXiv Detail & Related papers (2022-07-20T13:04:08Z)
- AttDLNet: Attention-based DL Network for 3D LiDAR Place Recognition [0.6352264764099531]
This paper proposes a novel 3D LiDAR-based deep learning network named AttDLNet.
It exploits an attention mechanism to selectively focus on long-range context and interfeature relationships.
Results show that the encoder network features are already very descriptive, but adding attention to the network further improves performance.
arXiv Detail & Related papers (2021-06-17T16:34:37Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.