Multi-Scale Representation Learning for Spatial Feature Distributions
using Grid Cells
- URL: http://arxiv.org/abs/2003.00824v1
- Date: Sun, 16 Feb 2020 04:22:18 GMT
- Title: Multi-Scale Representation Learning for Spatial Feature Distributions
using Grid Cells
- Authors: Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao
- Abstract summary: We propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places.
Results show that because of its multi-scale representations, Space2Vec outperforms well-established ML approaches.
- Score: 11.071527762096053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised text encoding models have recently fueled substantial progress
in NLP. The key idea is to use neural networks to convert words in texts to
vector space representations based on word positions in a sentence and their
contexts, which are suitable for end-to-end training of downstream tasks. We
see a strikingly similar situation in spatial analysis, which focuses on
incorporating both absolute positions and spatial contexts of geographic
objects such as POIs into models. A general-purpose representation model for
space is valuable for a multitude of tasks. However, no such general model
exists to date beyond simply applying discretization or feed-forward nets to
coordinates, and little effort has been put into jointly modeling distributions
with vastly different characteristics, which commonly emerge from GIS data.
Meanwhile, Nobel Prize-winning neuroscience research shows that grid cells in
mammals provide a multi-scale periodic representation that functions as a
metric for location encoding and is critical for recognizing places and for
path integration. Therefore, we propose a representation learning model called
Space2Vec to encode the absolute positions and spatial relationships of places.
We conduct experiments on two real-world geographic datasets for two different
tasks: 1) predicting the types of POIs given their positions and context, and
2) image classification leveraging geo-locations. Results show that because of its
multi-scale representations, Space2Vec outperforms well-established ML
approaches such as RBF kernels, multi-layer feed-forward nets, and tile
embedding approaches for location modeling and image classification tasks.
Detailed analysis shows that each baseline can handle distributions well at
only one scale but performs poorly at the others. In contrast, Space2Vec's
multi-scale representation can handle distributions at different scales.
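The core idea described above can be illustrated with a minimal sketch of a multi-scale, grid-cell-style location encoder. This is not the authors' implementation: the function name, parameter names, and defaults are illustrative. It follows the commonly described recipe of projecting a 2D coordinate onto three unit vectors 120 degrees apart, at geometrically spaced scales, and applying sine and cosine to obtain a periodic, multi-scale embedding.

```python
import numpy as np

def grid_cell_encode(coords, num_scales=16, min_radius=1.0, max_radius=10000.0):
    """Multi-scale sinusoidal location encoder, in the spirit of a grid-cell
    encoder. All names and defaults here are illustrative assumptions.

    coords: (N, 2) array of (x, y) positions.
    Returns an (N, num_scales * 6) embedding: at each scale, the point is
    projected onto three unit vectors 120 degrees apart and passed through
    sin and cos, mimicking the periodic firing pattern of grid cells.
    """
    coords = np.asarray(coords, dtype=np.float64)
    # Three direction vectors at 0, 120, and 240 degrees.
    angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (3, 2)
    # Geometrically spaced wavelengths from min_radius to max_radius.
    scales = min_radius * (max_radius / min_radius) ** (
        np.arange(num_scales) / max(num_scales - 1, 1))
    feats = []
    for s in scales:
        proj = coords @ dirs.T / s  # (N, 3) projections at this scale
        feats.append(np.sin(proj))
        feats.append(np.cos(proj))
    return np.concatenate(feats, axis=1)  # (N, num_scales * 6)
```

In practice such an encoding is fed into a small feed-forward net; the multi-scale wavelengths are what let the downstream model pick up spatial patterns at very different granularities.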
Related papers
- TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning [36.725822223732635]
We propose TorchSpatial, a learning framework and benchmark for location (point) encoding.
TorchSpatial contains three key components: 1) a unified location encoding framework that consolidates 15 commonly recognized location encoders; 2) the LocBench benchmark tasks encompassing 7 geo-aware image classification and 4 geo-aware image regression datasets; and 3) a comprehensive suite of evaluation metrics to quantify geo-aware models' overall performance as well as their geographic bias, with a novel Geo-Bias Score metric.
arXiv Detail & Related papers (2024-06-21T21:33:16Z) - Instance-free Text to Point Cloud Localization with Relative Position Awareness [37.22900045434484]
Text-to-point-cloud cross-modal localization is an emerging vision-language task critical for future robot-human collaboration.
We address two key limitations of existing approaches: 1) their reliance on ground-truth instances as input; and 2) their neglect of the relative positions among potential instances.
Our proposed model follows a two-stage pipeline, including a coarse stage for text-cell retrieval and a fine stage for position estimation.
arXiv Detail & Related papers (2024-04-27T09:46:49Z) - Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with improvement on various architectures, and it achieves state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z) - AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while remaining as fast as mainstream methods.
arXiv Detail & Related papers (2022-02-18T10:14:45Z) - Learning to Aggregate Multi-Scale Context for Instance Segmentation in
Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely a dense feature pyramid network (DenseFPN), a spatial context pyramid (SCP), and a hierarchical region-of-interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z) - Positional Encoder Graph Neural Networks for Geographic Data [1.840220263320992]
Graph neural networks (GNNs) provide a powerful and scalable solution for modeling continuous spatial data.
In this paper, we propose PE-GNN, a new framework that incorporates spatial context and correlation explicitly into the models.
arXiv Detail & Related papers (2021-11-19T10:41:49Z) - HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z) - Spatial-spectral Hyperspectral Image Classification via Multiple Random
Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z) - Region Similarity Representation Learning [94.88055458257081]
Region Similarity Representation Learning (ReSim) is a new approach to self-supervised representation learning for localization-based tasks.
ReSim learns both regional representations for localization as well as semantic image-level representations.
We show how ReSim learns representations which significantly improve the localization and classification performance compared to a competitive MoCo-v2 baseline.
arXiv Detail & Related papers (2021-03-24T00:42:37Z) - Learning Large-scale Location Embedding From Human Mobility Trajectories
with Graphs [0.0]
This study learns vector representations for locations using the large-scale LBS data.
This model embeds context information in human mobility and spatial information.
GCN-L2V can be applied in a complementary manner to other place embedding methods and down-streaming Geo-aware applications.
arXiv Detail & Related papers (2021-02-23T09:11:33Z) - PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image
Segmentation [87.50205728818601]
We propose a PriorGuided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.