HEAT: Holistic Edge Attention Transformer for Structured Reconstruction
- URL: http://arxiv.org/abs/2111.15143v1
- Date: Tue, 30 Nov 2021 06:01:11 GMT
- Title: HEAT: Holistic Edge Attention Transformer for Structured Reconstruction
- Authors: Jiacheng Chen, Yiming Qian, Yasutaka Furukawa
- Abstract summary: This paper presents a novel attention-based neural network for structured reconstruction.
It takes a 2D image as an input and reconstructs a planar graph depicting an underlying geometric structure.
The approach detects corners and classifies edge candidates between corners in an end-to-end manner.
- Score: 36.910604284201355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel attention-based neural network for structured
reconstruction, which takes a 2D raster image as an input and reconstructs a
planar graph depicting an underlying geometric structure. The approach detects
corners and classifies edge candidates between corners in an end-to-end manner.
Our contribution is a holistic edge classification architecture, which 1)
initializes the feature of an edge candidate by a trigonometric positional
encoding of its end-points; 2) fuses image feature to each edge candidate by
deformable attention; 3) employs two weight-sharing Transformer decoders to
learn holistic structural patterns over the graph edge candidates; and 4) is
trained with a masked learning strategy. The corner detector is a variant of
the edge classification architecture, adapted to operate on pixels as corner
candidates. We conduct experiments on two structured reconstruction tasks:
outdoor building architecture and indoor floorplan planar graph reconstruction.
Extensive qualitative and quantitative evaluations demonstrate the superiority
of our approach over the state of the art. We will share code and models.
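Step 1 of the contribution list above (initializing each edge candidate from a trigonometric positional encoding of its end-points) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `sinusoidal_encoding`, `edge_candidates`, and `init_edge_features`, the number of frequencies, the temperature of 10000, and the choice to treat every pair of corners as a candidate are all assumptions made for illustration.

```python
import itertools
import numpy as np

def sinusoidal_encoding(values, num_freqs=4, temperature=10000.0):
    """Encode each scalar with sin/cos at num_freqs frequencies (assumed scheme)."""
    values = np.asarray(values, dtype=np.float64)
    # Geometrically spaced frequencies, as in common Transformer encodings.
    freqs = temperature ** (-np.arange(num_freqs) / num_freqs)
    angles = values[..., None] * freqs            # shape (..., num_freqs)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def edge_candidates(corners):
    """Assumption: every unordered pair of corners is an edge candidate."""
    return list(itertools.combinations(range(len(corners)), 2))

def init_edge_features(corners, num_freqs=4):
    """Build one feature vector per candidate from its two end-points.

    Expects at least two corners, otherwise there are no candidates to stack.
    """
    corners = np.asarray(corners, dtype=np.float64)   # shape (N, 2)
    pairs = edge_candidates(corners)
    feats = []
    for i, j in pairs:
        coords = np.concatenate([corners[i], corners[j]])  # (x1, y1, x2, y2)
        feats.append(sinusoidal_encoding(coords, num_freqs).reshape(-1))
    return pairs, np.stack(feats)

# Hypothetical corner coordinates in pixels.
corners = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)]
pairs, feats = init_edge_features(corners)
print(pairs)
print(feats.shape)
```

In a full model these vectors would only be the starting point: per the abstract, image features are then fused into each candidate by deformable attention before the Transformer decoders classify the edges.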
Related papers
- Enhancing Polygonal Building Segmentation via Oriented Corners [0.3749861135832072]
This paper introduces a novel deep convolutional neural network named OriCornerNet, which directly extracts delineated building polygons from input images.
Our approach involves a deep model that predicts building footprint masks, corners, and orientation vectors that indicate directions toward adjacent corners.
Performance evaluations conducted on SpaceNet Vegas and CrowdAI-small datasets demonstrate the competitive efficacy of our approach.
arXiv Detail & Related papers (2024-07-17T01:59:06Z)
- DeepBranchTracer: A Generally-Applicable Approach to Curvilinear Structure Reconstruction Using Multi-Feature Learning [12.047523258256088]
We introduce DeepBranchTracer, a novel method that learns both external image features and internal geometric characteristics to reconstruct curvilinear structures.
We extensively evaluated our model on both 2D and 3D datasets, demonstrating its superior performance over existing segmentation and reconstruction methods.
arXiv Detail & Related papers (2024-02-02T07:13:07Z) - CornerFormer: Boosting Corner Representation for Fine-Grained Structured
Reconstruction [20.04081992616026]
We present an enhanced corner representation method for structured reconstruction.
It better reconstructs fine-grained structures, such as adjacent corners and tiny edges.
It outperforms the state-of-the-art model by +1.9%@F-1 on Corner and +3.0%@F-1 on Edge.
arXiv Detail & Related papers (2023-04-14T11:51:26Z) - Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud
Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z) - Hierarchical Graph Networks for 3D Human Pose Estimation [50.600944798627786]
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton.
We argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem.
We propose a novel graph convolution network architecture, Hierarchical Graph Networks, to overcome these weaknesses.
arXiv Detail & Related papers (2021-11-23T15:09:03Z) - Building-GAN: Graph-Conditioned Architectural Volumetric Design
Generation [10.024367148266721]
This paper focuses on volumetric design generation conditioned on an input program graph.
Instead of outputting dense 3D voxels, we propose a new 3D representation named voxel graph that is both compact and expressive for building geometries.
Our generator is a cross-modal graph neural network that uses a pointer mechanism to connect the input program graph and the output voxel graph, and the whole pipeline is trained using the adversarial framework.
arXiv Detail & Related papers (2021-04-27T16:49:34Z) - Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on par with or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z) - Learning Geometry-Disentangled Representation for Complementary
Understanding of 3D Object Point Cloud [50.56461318879761]
We propose Geometry-Disentangled Attention Network (GDANet) for 3D image processing.
GDANet disentangles point clouds into contour and flat part of 3D objects, respectively denoted by sharp and gentle variation components.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z) - Roof-GAN: Learning to Generate Roof Geometry and Relations for
Residential Houses [37.6686237027665]
Roof-GAN is a novel generative adversarial network that generates structured geometry of residential roof structures as a set of roof primitives and their relationships.
The generator produces a structured roof model as a graph, which consists of 1) primitive geometry as images at each node, 2) inter-primitive colinear/coplanar relationships at each edge, and 3) primitive geometry in a vector format at each node.
The discriminator is trained to assess the primitive geometry, the primitive relationships, and the primitive vector geometry in a fully end-to-end architecture.
arXiv Detail & Related papers (2020-12-17T00:47:57Z)
- Neural Subdivision [58.97214948753937]
This paper introduces Neural Subdivision, a novel framework for data-driven coarse-to-fine geometry modeling.
We optimize for the same set of network weights across all local mesh patches, thus providing an architecture that is not constrained to a specific input mesh, fixed genus, or category.
We demonstrate that even when trained on a single high-resolution mesh our method generates reasonable subdivisions for novel shapes.
arXiv Detail & Related papers (2020-05-04T20:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.