Poly2Vec: Polymorphic Fourier-Based Encoding of Geospatial Objects for GeoAI Applications
- URL: http://arxiv.org/abs/2408.14806v2
- Date: Sun, 11 May 2025 20:07:55 GMT
- Title: Poly2Vec: Polymorphic Fourier-Based Encoding of Geospatial Objects for GeoAI Applications
- Authors: Maria Despoina Siampou, Jialiang Li, John Krumm, Cyrus Shahabi, Hua Lu
- Abstract summary: Poly2Vec is a polymorphic Fourier-based encoding approach that unifies the representation of geospatial objects. We show that Poly2Vec consistently outperforms object-specific baselines in preserving three key spatial relationships.
- Score: 6.1981153537308336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Encoding geospatial objects is fundamental for geospatial artificial intelligence (GeoAI) applications, which leverage machine learning (ML) models to analyze spatial information. Common approaches transform each object into known formats, like image and text, for compatibility with ML models. However, this process often discards crucial spatial information, such as the object's position relative to the entire space, reducing downstream task effectiveness. Alternative encoding methods that preserve some spatial properties are often devised for specific data objects (e.g., point encoders), making them unsuitable for tasks that involve different data types (i.e., points, polylines, and polygons). To this end, we propose Poly2Vec, a polymorphic Fourier-based encoding approach that unifies the representation of geospatial objects, while preserving the essential spatial properties. Poly2Vec incorporates a learned fusion module that adaptively integrates the magnitude and phase of the Fourier transform for different tasks and geometries. We evaluate Poly2Vec on five diverse tasks, organized into two categories. The first empirically demonstrates that Poly2Vec consistently outperforms object-specific baselines in preserving three key spatial relationships: topology, direction, and distance. The second shows that integrating Poly2Vec into a state-of-the-art GeoAI workflow improves the performance in two popular tasks: population prediction and land use inference.
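To make the core idea concrete, below is a minimal NumPy sketch (an illustration under stated assumptions, not the authors' implementation): it approximates a geometry's 2D Fourier transform by averaging complex exponentials over points sampled from the geometry, yielding fixed-length magnitude and phase features for points, polylines, and polygons alike. The random frequency grid, the crude geometry sampling, and the plain concatenation standing in for the learned fusion module are all illustrative assumptions.
```python
import numpy as np

def fourier_encode(points, freqs):
    """Approximate a geometry's 2D Fourier transform by averaging complex
    exponentials over points sampled from it (a Monte Carlo sketch).

    points: (N, 2) samples on the geometry (a single point, vertices
            densified along a polyline, or interior samples of a polygon).
    freqs:  (F, 2) 2D frequencies (u, v) -- assumed here, not the paper's.
    Returns magnitude and phase features, each of shape (F,).
    """
    phase_arg = -2.0 * np.pi * (points @ freqs.T)   # (N, F)
    coeffs = np.exp(1j * phase_arg).mean(axis=0)    # (F,) complex coefficients
    return np.abs(coeffs), np.angle(coeffs)

rng = np.random.default_rng(0)
freqs = rng.normal(scale=4.0, size=(16, 2))         # hypothetical frequency set

point = np.array([[0.3, 0.7]])                              # a point
polyline = np.linspace([0.0, 0.0], [1.0, 1.0], num=50)      # a densified polyline
polygon = rng.uniform(0.2, 0.8, size=(500, 2))              # crude polygon interior samples

for geom in (point, polyline, polygon):
    mag, phase = fourier_encode(geom, freqs)
    embedding = np.concatenate([mag, phase])        # fusion is learned in the paper
    print(embedding.shape)                          # (32,) for all three object types
```
The point is the polymorphism: the same frequency evaluation yields a fixed-length vector regardless of whether the input is a point, a polyline, or a polygon.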
Related papers
- Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds.
Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures.
We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z) - Topo-VM-UNetV2: Encoding Topology into Vision Mamba UNet for Polyp Segmentation [4.856498016044607]
We propose Topo-VM-UNetV2, which encodes topological features into the Mamba-based polyp segmentation model VM-UNetV2.
Our method consists of two stages: VM-UNetV2 is used to generate probability maps (PMs) for the training and test images, which are then used to compute topology attention maps.
arXiv Detail & Related papers (2025-05-09T17:41:13Z) - AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities [5.767156832161819]
We propose AnySat, a multimodal model based on joint embedding predictive architecture (JEPA) and scale-adaptive spatial encoders.
To demonstrate the advantages of this unified approach, we compile GeoPlex, a collection of 5 multimodal datasets.
We then train a single powerful model on these diverse datasets simultaneously.
arXiv Detail & Related papers (2024-12-18T18:11:53Z) - GeoFormer: A Multi-Polygon Segmentation Transformer [10.097953939411868]
In remote sensing, there is a common need to learn scale-invariant shapes of objects such as buildings.
We introduce GeoFormer, a novel architecture that addresses these challenges by learning to generate multi-polygons end-to-end.
By modeling keypoints as spatially dependent tokens in an auto-regressive manner, GeoFormer outperforms existing works in delineating building objects from satellite imagery.
arXiv Detail & Related papers (2024-11-25T17:54:44Z) - Geometry Distributions [51.4061133324376]
We propose a novel geometric data representation that models geometry as distributions.
Our approach uses diffusion models with a novel network architecture to learn surface point distributions.
We evaluate our representation qualitatively and quantitatively across various object types, demonstrating its effectiveness in achieving high geometric fidelity.
arXiv Detail & Related papers (2024-11-25T04:06:48Z) - Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from well-trained transformers on massive images.
Experiments on the PointDA-10 and Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art UDA performance for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions [73.60788465154572]
Current 2D and 3D location encoders are designed to model point distances in Euclidean space.
We propose a multi-scale location encoder called Sphere2Vec which can preserve spherical distances when encoding point coordinates on a spherical surface.
arXiv Detail & Related papers (2023-06-30T12:55:02Z) - VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator means the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and their accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z) - Fusing Visual Appearance and Geometry for Multi-modality 6DoF Object Tracking [21.74515335906769]
We develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses.
The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance.
arXiv Detail & Related papers (2023-02-22T15:53:00Z) - Geometry-Aware Network for Domain Adaptive Semantic Segmentation [64.00345743710653]
We propose a novel Geometry-Aware Network for Domain Adaptation (GANDA) to shrink the domain gaps.
We exploit 3D topology on the point clouds generated from RGB-D images for coordinate-color disentanglement and pseudo-label refinement in the target domain.
Our model outperforms state-of-the-art methods on GTA5->Cityscapes and SYNTHIA->Cityscapes.
arXiv Detail & Related papers (2022-12-02T00:48:44Z) - Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z) - Sphere2Vec: Multi-Scale Representation Learning over a Spherical Surface for Geospatial Predictions [4.754823920235069]
We propose a multi-scale location encoding model called Sphere2Vec.
It directly encodes point coordinates on a spherical surface while avoiding the map projection distortion problem.
We provide theoretical proof that the Sphere2Vec encoding preserves the spherical surface distance between any two points.
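To see the distance-preservation property in its simplest form, here is a single-scale sketch (an illustration, not Sphere2Vec's multi-scale encoder): encoding each point as its 3D unit vector makes the dot product of two encodings equal the cosine of their central angle, so the great-circle distance is exactly recoverable.
```python
import numpy as np

def sphere_encode(lat, lon):
    """Encode a (lat, lon) point in degrees as its 3D unit vector."""
    phi, lam = np.radians(lat), np.radians(lon)
    return np.array([np.cos(phi) * np.cos(lam),
                     np.cos(phi) * np.sin(lam),
                     np.sin(phi)])

a = sphere_encode(55.68, 12.57)     # Copenhagen
b = sphere_encode(34.05, -118.24)   # Los Angeles
central_angle = np.arccos(np.clip(a @ b, -1.0, 1.0))
print(6371.0 * central_angle)       # great-circle distance in km (~9,000)
```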
arXiv Detail & Related papers (2022-01-25T17:34:29Z) - Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks [23.729853358582506]
We propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model.
We jointly learn the multi-object representation and segmentation in 3D via Variational Autoencoders (VAEs).
Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information.
arXiv Detail & Related papers (2021-06-10T09:20:16Z) - Spatial Object Recommendation with Hints: When Spatial Granularity Matters [42.51352610054967]
We study how to support top-k spatial object recommendations at varying levels of spatial granularity.
We propose the use of a POI tree, which captures spatial containment relationships between Points of Interest (POIs).
We design a novel multi-task learning model called MPR (short for Multi-level POI Recommendation), where each task aims to return the top-k POIs at a certain spatial granularity level.
arXiv Detail & Related papers (2021-01-08T11:39:51Z) - Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311]
We introduce learnable cylindrical convolutional networks (CCNs) that exploit a cylindrical representation of a convolutional kernel defined in 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z) - PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling [103.09504572409449]
We propose a novel deep neural network, PUGeo-Net, to generate uniform, dense point clouds.
Thanks to its geometry-centric nature, PUGeo-Net works well for both CAD models with sharp features and scanned models with rich geometric details.
arXiv Detail & Related papers (2020-02-24T14:13:29Z) - Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells [11.071527762096053]
We propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places.
Results show that, owing to its multi-scale representations, Space2Vec outperforms well-established ML approaches (a minimal sketch of such an encoder follows this entry).
arXiv Detail & Related papers (2020-02-16T04:22:18Z)
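As referenced above, here is a minimal sketch of a grid-cell-style multi-scale sinusoidal location encoder (illustrative only; the wavelength schedule, axis-aligned directions, and feature layout are assumptions, not Space2Vec's exact formulation):
```python
import numpy as np

def multiscale_encode(xy, n_scales=8, min_lambda=1e-3, max_lambda=1.0):
    """Encode 2D coordinates with sin/cos features at geometrically
    spaced wavelengths, so both fine and coarse positions are captured.

    xy: (2,) coordinates normalized to roughly [0, 1].
    Returns a feature vector of length 4 * n_scales.
    """
    # wavelengths in geometric progression from min_lambda to max_lambda
    lambdas = min_lambda * (max_lambda / min_lambda) ** (
        np.arange(n_scales) / max(n_scales - 1, 1))
    angles = 2.0 * np.pi * xy[None, :] / lambdas[:, None]   # (n_scales, 2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1).ravel()

print(multiscale_encode(np.array([0.37, 0.82])).shape)      # (32,)
```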