Semi-Supervised Adversarial Recognition of Refined Window Structures for
Inverse Procedural Façade Modeling
- URL: http://arxiv.org/abs/2201.08977v1
- Date: Sat, 22 Jan 2022 06:34:48 GMT
- Title: Semi-Supervised Adversarial Recognition of Refined Window Structures for
Inverse Procedural Façade Modeling
- Authors: Han Hu, Xinrong Liang, Yulin Ding, Qisen Shang, Bo Xu, Xuming Ge, Min
Chen, Ruofei Zhong, Qing Zhu
- Abstract summary: This paper proposes a semi-supervised adversarial recognition strategy embedded in inverse procedural modeling.
A simple procedural engine is built inside an existing 3D modeling software, producing fine-grained window geometries.
Experiments using publicly available façade image datasets reveal that the proposed training strategy can obtain about a 10% improvement in classification accuracy.
- Score: 17.62526990262815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning methods are notoriously data-hungry, requiring a
large number of labeled samples. Unfortunately, the large amount of
interactive sample-labeling effort has dramatically hindered the application
of deep learning methods, especially for 3D modeling tasks, which require
heterogeneous samples. To alleviate the data-annotation work for learned 3D
modeling of façades, this paper proposes a semi-supervised adversarial
recognition strategy embedded in inverse procedural modeling. Beginning with
textured LOD-2 (Level-of-Detail) models, we use classical convolutional
neural networks to recognize the types and estimate the parameters of windows
from image patches. The window types and parameters are then assembled into a
procedural grammar. A simple procedural engine built inside existing 3D
modeling software produces the fine-grained window geometries. To obtain a
useful model from only a few labeled samples, we leverage a generative
adversarial network to train the feature extractor in a semi-supervised
manner. The adversarial training strategy also exploits unlabeled data to
make the training phase more stable. Experiments using publicly available
façade image datasets reveal that the proposed training strategy obtains
about a 10% improvement in classification accuracy and a 50% improvement in
parameter estimation under the same network structure. In addition, the
performance gains are more pronounced when testing against unseen data
featuring different façade styles.
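To make the training strategy concrete, the sketch below shows one common way to realize semi-supervised adversarial training of a feature extractor that carries both a window-type classification head and a window-parameter regression head, using a Salimans-style real/fake GAN objective with feature matching. This is an illustrative assumption, not the authors' released code: the backbone architecture, head sizes, 32x32 patch size, class/parameter counts, and hyper-parameters are all hypothetical stand-ins.

```python
# Hypothetical sketch of semi-supervised adversarial training for a CNN
# feature extractor (not the paper's implementation). Labeled patches drive
# window-type classification and parameter regression; unlabeled patches only
# need to look "real" to the discriminator, which is how they stabilize and
# regularize the shared features.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_TYPES = 10    # assumed number of window-grammar types
NUM_PARAMS = 4    # assumed per-window parameters (e.g., rows, cols, margins)

class FeatureExtractor(nn.Module):
    """Small CNN backbone standing in for the paper's classical CNN."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, 2, 1), nn.LeakyReLU(0.2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, 3, 2, 1), nn.LeakyReLU(0.2),  # 16x16 -> 8x8
            nn.Conv2d(64, 128, 3, 2, 1), nn.LeakyReLU(0.2), # 8x8 -> 4x4
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, dim))
    def forward(self, x):
        return self.net(x)

backbone = FeatureExtractor()
cls_head = nn.Linear(128, NUM_TYPES)   # window-type logits
reg_head = nn.Linear(128, NUM_PARAMS)  # window-parameter regression
generator = nn.Sequential(             # toy generator: noise -> 32x32 patch
    nn.Linear(64, 128 * 8 * 8), nn.ReLU(),
    nn.Unflatten(1, (128, 8, 8)),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

d_params = (list(backbone.parameters()) + list(cls_head.parameters())
            + list(reg_head.parameters()))
d_opt = torch.optim.Adam(d_params, lr=2e-4)
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)

def d_step(x_lab, y_type, y_param, x_unl):
    """One discriminator/extractor update on a labeled + unlabeled batch."""
    x_fake = generator(torch.randn(x_unl.size(0), 64)).detach()
    feat_lab = backbone(x_lab)
    # log sum exp of class logits acts as the "realness" score (Salimans 2016)
    lse_unl = torch.logsumexp(cls_head(backbone(x_unl)), dim=1)
    lse_fake = torch.logsumexp(cls_head(backbone(x_fake)), dim=1)
    loss = (F.cross_entropy(cls_head(feat_lab), y_type)   # supervised: type
            + F.mse_loss(reg_head(feat_lab), y_param)     # supervised: params
            + F.softplus(-lse_unl).mean()                 # unlabeled -> real
            + F.softplus(lse_fake).mean())                # generated -> fake
    d_opt.zero_grad(); loss.backward(); d_opt.step()
    return loss.item()

def g_step(x_unl):
    """Generator update by feature matching against real patch statistics."""
    f_fake = backbone(generator(torch.randn(x_unl.size(0), 64))).mean(dim=0)
    f_real = backbone(x_unl).mean(dim=0).detach()
    loss = F.mse_loss(f_fake, f_real)
    g_opt.zero_grad(); loss.backward(); g_opt.step()
    # stray backbone grads from this step are cleared by d_opt.zero_grad()
    return loss.item()

# Toy usage with random tensors in place of façade image patches:
x_lab = torch.randn(8, 3, 32, 32)
y_type = torch.randint(0, NUM_TYPES, (8,))
y_param = torch.randn(8, NUM_PARAMS)
x_unl = torch.randn(16, 3, 32, 32)
print(d_step(x_lab, y_type, y_param, x_unl), g_step(x_unl))
```

Feature matching, rather than a plain adversarial loss for the generator, is a standard stabilizer in semi-supervised GANs; the unlabeled term asks patches only to score as "real", which is the mechanism by which unannotated façade patches can improve the shared feature extractor.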
Related papers
- Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds [6.69660410213287]
We propose an innovative framework called Point-MGE to explore the benefits of deeply integrating 3D representation learning and generative learning.
In shape classification, Point-MGE achieved an accuracy of 94.2% (+1.0%) on the ModelNet40 dataset and 92.9% (+5.5%) on the ScanObjectNN dataset.
Experimental results also confirmed that Point-MGE can generate high-quality 3D shapes in both unconditional and conditional settings.
arXiv Detail & Related papers (2024-06-25T07:57:03Z) - Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for pre-trained model application and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z) - MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used for mitigating the data-hungry needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z) - Randomized 3D Scene Generation for Generalizable Self-Supervised
Pre-Training [0.0]
We propose a new method to generate 3D scenes with spherical harmonics.
It surpasses the previous formula-driven method with a clear margin and achieves on-par results with methods using real-world scans and CAD models.
arXiv Detail & Related papers (2023-06-07T08:28:38Z) - Boosting Low-Data Instance Segmentation by Unsupervised Pre-training
with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z) - ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z) - Semi-Supervised Single-View 3D Reconstruction via Prototype Shape Priors [79.80916315953374]
We propose SSP3D, a semi-supervised framework for 3D reconstruction.
We introduce an attention-guided prototype shape prior module for guiding realistic object reconstruction.
Our approach also performs well when transferring to real-world Pix3D datasets under labeling ratios of 10%.
arXiv Detail & Related papers (2022-09-30T11:19:25Z) - Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z) - Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z) - Learning Compositional Shape Priors for Few-Shot 3D Reconstruction [36.40776735291117]
We show that complex encoder-decoder architectures exploit large amounts of per-category data.
We propose three ways to learn a class-specific global shape prior, directly from data.
Experiments on the popular ShapeNet dataset show that our method outperforms a zero-shot baseline by over 40%.
arXiv Detail & Related papers (2021-06-11T14:55:49Z) - Point Transformer for Shape Classification and Retrieval of 3D and ALS
Roof PointClouds [3.3744638598036123]
This paper proposes a fully attentional model, Point Transformer, for deriving a rich point cloud representation.
The model's shape classification and retrieval performance are evaluated on a large-scale urban dataset - RoofN3D and a standard benchmark dataset ModelNet40.
The proposed method outperforms other state-of-the-art models in the RoofN3D dataset, gives competitive results in the ModelNet40 benchmark, and showcases high robustness to various unseen point corruptions.
arXiv Detail & Related papers (2020-11-08T08:11:02Z)