Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks
- URL: http://arxiv.org/abs/2408.14806v1
- Date: Tue, 27 Aug 2024 06:28:35 GMT
- Title: Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks
- Authors: Maria Despoina Siampou, Jialiang Li, John Krumm, Cyrus Shahabi, Hua Lu
- Abstract summary: Poly2Vec is an encoding framework that unifies the modeling of different geospatial objects.
We leverage the power of the 2D Fourier transform to encode useful spatial properties, such as shape and location.
This unified approach eliminates the need to develop and train separate models for each distinct spatial type.
- Score: 6.1981153537308336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Encoding geospatial data is crucial for enabling machine learning (ML) models to perform tasks that require spatial reasoning, such as identifying the topological relationships between two different geospatial objects. However, existing encoding methods are limited as they are typically customized to handle only specific types of spatial data, which impedes their applicability across different downstream tasks where multiple data types coexist. To address this, we introduce Poly2Vec, an encoding framework that unifies the modeling of different geospatial objects, including 2D points, polylines, and polygons, irrespective of the downstream task. We leverage the power of the 2D Fourier transform to encode useful spatial properties, such as shape and location, from geospatial objects into fixed-length vectors. These vectors are then fed into neural network models for spatial reasoning tasks. This unified approach eliminates the need to develop and train separate models for each distinct spatial type. We evaluate Poly2Vec on both synthetic and real datasets of mixed geometry types and verify its consistent performance across several downstream spatial reasoning tasks.
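The abstract's core idea admits a compact illustration. The Python sketch below is my own approximation, not the authors' implementation: a point is treated as a Dirac impulse, and a polyline or polygon boundary is approximated by uniformly spaced samples, so all geometry types share one fixed-length Fourier feature vector. The frequency grid, sample count, and function names are assumptions.

```python
import numpy as np

def fourier_encode(points, freqs):
    """Encode 2D sample points as a fixed-length Fourier feature vector.

    points: (N, 2) array of (x, y) samples representing the geometry
            (a single point, polyline samples, or polygon boundary samples).
    freqs:  (F, 2) array of 2D frequencies (u, v).
    Returns a vector of length 2*F (real and imaginary parts).
    """
    # F(u, v) = mean over samples of exp(-i * 2*pi * (u*x + v*y))
    phase = -2.0 * np.pi * points @ freqs.T          # (N, F)
    coeffs = np.exp(1j * phase).mean(axis=0)         # (F,)
    return np.concatenate([coeffs.real, coeffs.imag])

def sample_polyline(vertices, n=64):
    """Uniformly resample a polyline (or closed polygon ring) by arc length."""
    vertices = np.asarray(vertices, dtype=float)
    seglen = np.linalg.norm(np.diff(vertices, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seglen)])
    s = np.linspace(0.0, t[-1], n)
    x = np.interp(s, t, vertices[:, 0])
    y = np.interp(s, t, vertices[:, 1])
    return np.stack([x, y], axis=1)

# A shared frequency grid: every geometry maps to a vector of the same length.
uv = np.stack(np.meshgrid(np.arange(1, 5), np.arange(1, 5)), -1).reshape(-1, 2)

point = np.array([[0.3, 0.7]])                        # a 2D point
line = sample_polyline([[0, 0], [0.5, 0.2], [1, 1]])  # a polyline
ring = sample_polyline([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]])  # polygon ring

for geom in (point, line, ring):
    print(fourier_encode(geom, uv.astype(float)).shape)  # (32,) for all types
```

The fixed-length output is the point: downstream models never need to know which geometry type produced the vector.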
Related papers
- GeoFormer: A Multi-Polygon Segmentation Transformer [10.097953939411868]
In remote sensing, there is a common need to learn scale-invariant shapes of objects such as buildings.
We introduce GeoFormer, a novel architecture that addresses these challenges by learning to generate multipolygons end-to-end.
By modeling keypoints as spatially dependent tokens in an auto-regressive manner, the GeoFormer outperforms existing works in delineating building objects from satellite imagery.
arXiv Detail & Related papers (2024-11-25T17:54:44Z)
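To make the "keypoints as spatially dependent tokens" idea concrete, here is a hedged sketch (not the GeoFormer code) of how polygon vertices can be quantized to a coordinate grid and serialized into a single token sequence that an autoregressive decoder could emit one token at a time; the vocabulary layout and special tokens are assumptions.

```python
import numpy as np

GRID = 256                                 # assumed quantization resolution
SEP, EOS = GRID * GRID, GRID * GRID + 1    # assumed separator / end tokens

def polygons_to_tokens(polygons):
    """Serialize multipolygons into one autoregressive token sequence.

    polygons: list of (N_i, 2) float arrays with coordinates in [0, 1).
    Each vertex becomes a single token id = y * GRID + x.
    """
    tokens = []
    for poly in polygons:
        q = np.clip((np.asarray(poly) * GRID).astype(int), 0, GRID - 1)
        tokens.extend(q[:, 1] * GRID + q[:, 0])
        tokens.append(SEP)
    tokens[-1] = EOS                       # last separator becomes end-of-sequence
    return np.array(tokens)

def tokens_to_polygons(tokens):
    """Invert the serialization back to (approximate) vertex coordinates."""
    polys, cur = [], []
    for t in tokens:
        if t in (SEP, EOS):
            polys.append(np.array(cur))
            cur = []
        else:
            cur.append(((t % GRID) + 0.5, (t // GRID) + 0.5))
    return [p / GRID for p in polys]

seq = polygons_to_tokens([np.array([[0.1, 0.1], [0.6, 0.1], [0.6, 0.5]])])
print(seq, tokens_to_polygons(seq))
```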
- Geometry Distributions [51.4061133324376]
We propose a novel geometric data representation that models geometry as distributions.
Our approach uses diffusion models with a novel network architecture to learn surface point distributions.
We evaluate our representation qualitatively and quantitatively across various object types, demonstrating its effectiveness in achieving high geometric fidelity.
arXiv Detail & Related papers (2024-11-25T04:06:48Z)
- Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions [73.60788465154572]
Current 2D and 3D location encoders are designed to model point distances in Euclidean space.
We propose a multi-scale location encoder called Sphere2Vec which can preserve spherical distances when encoding point coordinates on a spherical surface.
arXiv Detail & Related papers (2023-06-30T12:55:02Z)
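A minimal sketch of the underlying idea (an illustrative approximation, not the paper's exact encoder): mapping longitude and latitude onto the 3D unit sphere before any multi-scale expansion keeps the features a function of spherical geometry rather than of raw planar coordinates. The scale schedule and dimensions below are assumptions.

```python
import numpy as np

def sphere_encode(lon_deg, lat_deg, n_scales=4, s_min=1.0, s_max=360.0):
    """Multi-scale location encoding on the unit sphere (illustrative sketch).

    Maps (lon, lat) to the unit vector (cos lat cos lon, cos lat sin lon,
    sin lat), then encodes each component with sin/cos at geometric scales,
    loosely mirroring Sphere2Vec-style multi-scale spherical encoders.
    """
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    xyz = np.array([np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat)])                      # point on the unit sphere
    scales = s_min * (s_max / s_min) ** (np.arange(n_scales) / max(n_scales - 1, 1))
    feats = [f(xyz * 2 * np.pi / s) for s in scales for f in (np.sin, np.cos)]
    return np.concatenate(feats)                       # length 3 * 2 * n_scales

# Nearby points on the sphere get nearby encodings, unlike raw (lon, lat)
# near the antimeridian, where -179.9 and +179.9 degrees look far apart.
a, b = sphere_encode(-179.9, 10.0), sphere_encode(179.9, 10.0)
print(np.linalg.norm(a - b))   # small: the encoding respects spherical adjacency
```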
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator means that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Geometry-Aware Network for Domain Adaptive Semantic Segmentation [64.00345743710653]
We propose a novel Geometry-Aware Network for Domain Adaptation (GANDA) to shrink the domain gaps.
We exploit 3D topology on the point clouds generated from RGB-D images for coordinate-color disentanglement and pseudo-labels refinement in the target domain.
Our model outperforms state-of-the-art methods on GTA5->Cityscapes and SYNTHIA->Cityscapes.
arXiv Detail & Related papers (2022-12-02T00:48:44Z)
- Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z)
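One intuition for why a 1D CNN such as ResNet1D suits closed polygon boundaries can be shown in a few lines (a hedged numpy sketch, not the paper's model): circular padding plus global pooling makes the features invariant to where the vertex loop "starts". The filter bank and pooling choice here are illustrative assumptions.

```python
import numpy as np

def circular_conv_features(vertices, kernels):
    """1D convolution over a closed vertex loop with circular padding.

    vertices: (N, 2) polygon boundary vertices (a closed loop).
    kernels:  (K, W, 2) filter bank over W consecutive vertices.
    Returns (K,) globally average-pooled features.
    """
    n, w = len(vertices), kernels.shape[1]
    # Circular padding: windows wrap around the loop, so no vertex is special.
    idx = (np.arange(n)[:, None] + np.arange(w)) % n        # (N, W)
    windows = vertices[idx]                                 # (N, W, 2)
    responses = np.einsum('nwc,kwc->nk', windows, kernels)  # (N, K)
    return np.maximum(responses, 0).mean(axis=0)            # ReLU + global pool

rng = np.random.default_rng(0)
kernels = rng.normal(size=(8, 3, 2))
square = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])

f1 = circular_conv_features(square, kernels)
f2 = circular_conv_features(np.roll(square, 2, axis=0), kernels)  # rotate start
print(np.allclose(f1, f2))  # True: features are loop-origin invariant
```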
- Sphere2Vec: Multi-Scale Representation Learning over a Spherical Surface for Geospatial Predictions [4.754823920235069]
We propose a multi-scale location encoding model called Sphere2Vec.
It directly encodes point coordinates on a spherical surface while avoiding the map projection distortion problem.
We provide theoretical proof that the Sphere2Vec encoding preserves the spherical surface distance between any two points.
arXiv Detail & Related papers (2022-01-25T17:34:29Z)
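The distance-preservation property can be sanity-checked numerically with a toy construction of my own (not the paper's proof or encoder): for the plain unit-sphere embedding, the dot product of two encoded points recovers the great-circle distance exactly.

```python
import numpy as np

def unit_sphere(lon_deg, lat_deg):
    """Embed (lon, lat) as a 3D unit vector."""
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def great_circle_from_encoding(p, q, radius=6371.0):
    """Recover the spherical surface distance (km) from the embeddings alone."""
    return radius * np.arccos(np.clip(p @ q, -1.0, 1.0))

# Paris vs. New York: distance computed purely from the encoded vectors.
paris, nyc = unit_sphere(2.35, 48.86), unit_sphere(-74.01, 40.71)
print(round(great_circle_from_encoding(paris, nyc)))  # ~5840 km
```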
- Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks [23.729853358582506]
We propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model.
We jointly learn multiple-object representation and segmentation in 3D via variational autoencoders (VAEs).
Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information.
arXiv Detail & Related papers (2021-06-10T09:20:16Z)
- Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), that exploit cylindrical representation of a convolutional kernel defined in the 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)
- Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells [11.071527762096053]
We propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places.
Results show that because of its multi-scale representations, Space2Vec outperforms well-established ML approaches.
arXiv Detail & Related papers (2020-02-16T04:22:18Z)
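For a feel of the multi-scale design (an illustrative sketch, not the released Space2Vec code): grid-cell-inspired encoders project a 2D location onto a few fixed directions and apply sin/cos at geometrically spaced wavelengths, so a single vector carries both fine and coarse position information. The direction count and scale range here are assumptions.

```python
import numpy as np

# Three unit direction vectors at 120-degree spacing, as in grid-cell models.
ANGLES = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
DIRS = np.stack([np.cos(ANGLES), np.sin(ANGLES)], axis=1)   # (3, 2)

def space_encode(xy, n_scales=8, lambda_min=1.0, lambda_max=10000.0):
    """Multi-scale, grid-cell-inspired 2D location encoding (sketch).

    Projects the location onto three directions, then applies sin/cos at
    geometrically spaced wavelengths from lambda_min to lambda_max.
    Returns a vector of length 3 * 2 * n_scales.
    """
    xy = np.asarray(xy, dtype=float)
    proj = DIRS @ xy                                        # (3,)
    g = (lambda_max / lambda_min) ** (np.arange(n_scales) / (n_scales - 1))
    phases = proj[None, :] * (2 * np.pi / (lambda_min * g))[:, None]  # (S, 3)
    return np.concatenate([np.sin(phases).ravel(), np.cos(phases).ravel()])

enc = space_encode([1250.0, -300.0])
print(enc.shape)   # (48,): fine scales separate nearby points, while coarse
                   # scales keep far-apart points distinguishable
```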