Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks
- URL: http://arxiv.org/abs/2408.14806v1
- Date: Tue, 27 Aug 2024 06:28:35 GMT
- Title: Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks
- Authors: Maria Despoina Siampou, Jialiang Li, John Krumm, Cyrus Shahabi, Hua Lu
- Abstract summary: Poly2Vec is an encoding framework that unifies the modeling of different geospatial objects.
We leverage the power of the 2D Fourier transform to encode useful spatial properties, such as shape and location.
This unified approach eliminates the need to develop and train separate models for each distinct spatial type.
- Score: 6.1981153537308336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Encoding geospatial data is crucial for enabling machine learning (ML) models to perform tasks that require spatial reasoning, such as identifying the topological relationships between two different geospatial objects. However, existing encoding methods are limited as they are typically customized to handle only specific types of spatial data, which impedes their applicability across different downstream tasks where multiple data types coexist. To address this, we introduce Poly2Vec, an encoding framework that unifies the modeling of different geospatial objects, including 2D points, polylines, and polygons, irrespective of the downstream task. We leverage the power of the 2D Fourier transform to encode useful spatial properties, such as shape and location, from geospatial objects into fixed-length vectors. These vectors are then fed into neural network models for spatial reasoning tasks. This unified approach eliminates the need to develop and train separate models for each distinct spatial type. We evaluate Poly2Vec on both synthetic and real datasets of mixed geometry types and verify its consistent performance across several downstream spatial reasoning tasks.
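The abstract's core idea admits a compact illustration. The Python sketch below is my own approximation, not the authors' implementation: a point is treated as a Dirac impulse, and a polyline or polygon boundary is approximated by uniformly spaced samples, so all geometry types share one fixed-length Fourier feature vector. The frequency grid, sample count, and function names are assumptions.

```python
import numpy as np

def fourier_encode(points, freqs):
    """Encode 2D sample points as a fixed-length Fourier feature vector.

    points: (N, 2) array of (x, y) samples representing the geometry
            (a single point, polyline samples, or polygon boundary samples).
    freqs:  (F, 2) array of 2D frequencies (u, v).
    Returns a vector of length 2*F (real and imaginary parts).
    """
    # F(u, v) = mean over samples of exp(-i * 2*pi * (u*x + v*y))
    phase = -2.0 * np.pi * points @ freqs.T          # (N, F)
    coeffs = np.exp(1j * phase).mean(axis=0)         # (F,)
    return np.concatenate([coeffs.real, coeffs.imag])

def sample_polyline(vertices, n=64):
    """Uniformly resample a polyline (or closed polygon ring) by arc length."""
    vertices = np.asarray(vertices, dtype=float)
    seglen = np.linalg.norm(np.diff(vertices, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seglen)])
    s = np.linspace(0.0, t[-1], n)
    x = np.interp(s, t, vertices[:, 0])
    y = np.interp(s, t, vertices[:, 1])
    return np.stack([x, y], axis=1)

# A shared frequency grid: every geometry maps to a vector of the same length.
uv = np.stack(np.meshgrid(np.arange(1, 5), np.arange(1, 5)), -1).reshape(-1, 2)

point = np.array([[0.3, 0.7]])                        # a 2D point
line = sample_polyline([[0, 0], [0.5, 0.2], [1, 1]])  # a polyline
ring = sample_polyline([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]])  # polygon ring

for geom in (point, line, ring):
    print(fourier_encode(geom, uv.astype(float)).shape)  # (32,) for all types
```

The fixed-length output is the point: downstream models never need to know which geometry type produced the vector.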
Related papers
- GeoFormer: A Multi-Polygon Segmentation Transformer [10.097953939411868]
In remote sensing, there is a common need to learn scale-invariant shapes of objects such as buildings.
We introduce GeoFormer, a novel architecture that addresses these challenges by learning to generate multipolygons end-to-end.
By modeling keypoints as spatially dependent tokens in an auto-regressive manner, the GeoFormer outperforms existing works in delineating building objects from satellite imagery.
arXiv Detail & Related papers (2024-11-25T17:54:44Z)
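To make the "keypoints as spatially dependent tokens" idea concrete, here is a hedged sketch (not the GeoFormer code) of how polygon vertices can be quantized to a coordinate grid and serialized into a single token sequence that an autoregressive decoder could emit one token at a time; the vocabulary layout and special tokens are assumptions.

```python
import numpy as np

GRID = 256                                 # assumed quantization resolution
SEP, EOS = GRID * GRID, GRID * GRID + 1    # assumed separator / end tokens

def polygons_to_tokens(polygons):
    """Serialize multipolygons into one autoregressive token sequence.

    polygons: list of (N_i, 2) float arrays with coordinates in [0, 1).
    Each vertex becomes a single token id = y * GRID + x.
    """
    tokens = []
    for poly in polygons:
        q = np.clip((np.asarray(poly) * GRID).astype(int), 0, GRID - 1)
        tokens.extend(q[:, 1] * GRID + q[:, 0])
        tokens.append(SEP)
    tokens[-1] = EOS                       # last separator becomes end-of-sequence
    return np.array(tokens)

def tokens_to_polygons(tokens):
    """Invert the serialization back to (approximate) vertex coordinates."""
    polys, cur = [], []
    for t in tokens:
        if t in (SEP, EOS):
            polys.append(np.array(cur))
            cur = []
        else:
            cur.append(((t % GRID) + 0.5, (t // GRID) + 0.5))
    return [p / GRID for p in polys]

seq = polygons_to_tokens([np.array([[0.1, 0.1], [0.6, 0.1], [0.6, 0.5]])])
print(seq, tokens_to_polygons(seq))
```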
- Geometry Distributions [51.4061133324376]
We propose a novel geometric data representation that models geometry as distributions.
Our approach uses diffusion models with a novel network architecture to learn surface point distributions.
We evaluate our representation qualitatively and quantitatively across various object types, demonstrating its effectiveness in achieving high geometric fidelity.
arXiv Detail & Related papers (2024-11-25T04:06:48Z)
- Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions [73.60788465154572]
Current 2D and 3D location encoders are designed to model point distances in Euclidean space.
We propose a multi-scale location encoder called Sphere2Vec which can preserve spherical distances when encoding point coordinates on a spherical surface.
arXiv Detail & Related papers (2023-06-30T12:55:02Z)
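A minimal sketch of the underlying idea (an illustrative approximation, not the paper's exact encoder): mapping longitude and latitude onto the 3D unit sphere before any multi-scale expansion keeps the features a function of spherical geometry rather than of raw planar coordinates. The scale schedule and dimensions below are assumptions.

```python
import numpy as np

def sphere_encode(lon_deg, lat_deg, n_scales=4, s_min=1.0, s_max=360.0):
    """Multi-scale location encoding on the unit sphere (illustrative sketch).

    Maps (lon, lat) to the unit vector (cos lat cos lon, cos lat sin lon,
    sin lat), then encodes each component with sin/cos at geometric scales,
    loosely mirroring Sphere2Vec-style multi-scale spherical encoders.
    """
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    xyz = np.array([np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat)])                      # point on the unit sphere
    scales = s_min * (s_max / s_min) ** (np.arange(n_scales) / max(n_scales - 1, 1))
    feats = [f(xyz * 2 * np.pi / s) for s in scales for f in (np.sin, np.cos)]
    return np.concatenate(feats)                       # length 3 * 2 * n_scales

# Nearby points on the sphere get nearby encodings, unlike raw (lon, lat)
# near the antimeridian, where -179.9 and +179.9 degrees look far apart.
a, b = sphere_encode(-179.9, 10.0), sphere_encode(179.9, 10.0)
print(np.linalg.norm(a - b))   # small: the encoding respects spherical adjacency
```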
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator means that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Geometry-Aware Network for Domain Adaptive Semantic Segmentation [64.00345743710653]
We propose a novel Geometry-Aware Network for Domain Adaptation (GANDA) to shrink the domain gaps.
We exploit 3D topology on the point clouds generated from RGB-D images for coordinate-color disentanglement and pseudo-labels refinement in the target domain.
Our model outperforms state-of-the-art methods on GTA5->Cityscapes and SYNTHIA->Cityscapes.
arXiv Detail & Related papers (2022-12-02T00:48:44Z)
- Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z)
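One intuition for why a 1D CNN such as ResNet1D suits closed polygon boundaries can be shown in a few lines (a hedged numpy sketch, not the paper's model): circular padding plus global pooling makes the features invariant to where the vertex loop "starts". The filter bank and pooling choice here are illustrative assumptions.

```python
import numpy as np

def circular_conv_features(vertices, kernels):
    """1D convolution over a closed vertex loop with circular padding.

    vertices: (N, 2) polygon boundary vertices (a closed loop).
    kernels:  (K, W, 2) filter bank over W consecutive vertices.
    Returns (K,) globally average-pooled features.
    """
    n, w = len(vertices), kernels.shape[1]
    # Circular padding: windows wrap around the loop, so no vertex is special.
    idx = (np.arange(n)[:, None] + np.arange(w)) % n        # (N, W)
    windows = vertices[idx]                                 # (N, W, 2)
    responses = np.einsum('nwc,kwc->nk', windows, kernels)  # (N, K)
    return np.maximum(responses, 0).mean(axis=0)            # ReLU + global pool

rng = np.random.default_rng(0)
kernels = rng.normal(size=(8, 3, 2))
square = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])

f1 = circular_conv_features(square, kernels)
f2 = circular_conv_features(np.roll(square, 2, axis=0), kernels)  # rotate start
print(np.allclose(f1, f2))  # True: features are loop-origin invariant
```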
- Sphere2Vec: Multi-Scale Representation Learning over a Spherical Surface for Geospatial Predictions [4.754823920235069]
We propose a multi-scale location encoding model called Sphere2Vec.
It directly encodes point coordinates on a spherical surface while avoiding the map projection distortion problem.
We provide theoretical proof that the Sphere2Vec encoding preserves the spherical surface distance between any two points.
arXiv Detail & Related papers (2022-01-25T17:34:29Z)
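The distance-preservation property can be sanity-checked numerically with a toy construction of my own (not the paper's proof or encoder): for the plain unit-sphere embedding, the dot product of two encoded points recovers the great-circle distance exactly.

```python
import numpy as np

def unit_sphere(lon_deg, lat_deg):
    """Embed (lon, lat) as a 3D unit vector."""
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def great_circle_from_encoding(p, q, radius=6371.0):
    """Recover the spherical surface distance (km) from the embeddings alone."""
    return radius * np.arccos(np.clip(p @ q, -1.0, 1.0))

# Paris vs. New York: distance computed purely from the encoded vectors.
paris, nyc = unit_sphere(2.35, 48.86), unit_sphere(-74.01, 40.71)
print(round(great_circle_from_encoding(paris, nyc)))  # ~5840 km
```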
- Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks [23.729853358582506]
We propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model.
We jointly learn multiple-object representation and segmentation in 3D via variational autoencoders (VAEs).
Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information.
arXiv Detail & Related papers (2021-06-10T09:20:16Z)
- Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), that exploit cylindrical representation of a convolutional kernel defined in the 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)
- Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells [11.071527762096053]
We propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places.
Results show that because of its multi-scale representations, Space2Vec outperforms well-established ML approaches.
arXiv Detail & Related papers (2020-02-16T04:22:18Z)
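For a feel of the multi-scale design (an illustrative sketch, not the released Space2Vec code): grid-cell-inspired encoders project a 2D location onto a few fixed directions and apply sin/cos at geometrically spaced wavelengths, so a single vector carries both fine and coarse position information. The direction count and scale range here are assumptions.

```python
import numpy as np

# Three unit direction vectors at 120-degree spacing, as in grid-cell models.
ANGLES = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
DIRS = np.stack([np.cos(ANGLES), np.sin(ANGLES)], axis=1)   # (3, 2)

def space_encode(xy, n_scales=8, lambda_min=1.0, lambda_max=10000.0):
    """Multi-scale, grid-cell-inspired 2D location encoding (sketch).

    Projects the location onto three directions, then applies sin/cos at
    geometrically spaced wavelengths from lambda_min to lambda_max.
    Returns a vector of length 3 * 2 * n_scales.
    """
    xy = np.asarray(xy, dtype=float)
    proj = DIRS @ xy                                        # (3,)
    g = (lambda_max / lambda_min) ** (np.arange(n_scales) / (n_scales - 1))
    phases = proj[None, :] * (2 * np.pi / (lambda_min * g))[:, None]  # (S, 3)
    return np.concatenate([np.sin(phases).ravel(), np.cos(phases).ravel()])

enc = space_encode([1250.0, -300.0])
print(enc.shape)   # (48,): fine scales separate nearby points, while coarse
                   # scales keep far-apart points distinguishable
```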