Multi-Scale Representation Learning for Spatial Feature Distributions
using Grid Cells
- URL: http://arxiv.org/abs/2003.00824v1
- Date: Sun, 16 Feb 2020 04:22:18 GMT
- Title: Multi-Scale Representation Learning for Spatial Feature Distributions
using Grid Cells
- Authors: Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao
- Abstract summary: We propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places.
Results show that because of its multi-scale representations, Space2Vec outperforms well-established ML approaches.
- Score: 11.071527762096053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised text encoding models have recently fueled substantial progress
in NLP. The key idea is to use neural networks to convert words in texts to
vector space representations based on word positions in a sentence and their
contexts, which are suitable for end-to-end training of downstream tasks. We
see a strikingly similar situation in spatial analysis, which focuses on
incorporating both absolute positions and spatial contexts of geographic
objects such as POIs into models. A general-purpose representation model for
space is valuable for a multitude of tasks. However, no such general model
exists to date beyond simply applying discretization or feed-forward nets to
coordinates, and little effort has been put into jointly modeling distributions
with vastly different characteristics, which commonly emerge from GIS data.
Meanwhile, Nobel Prize-winning neuroscience research shows that grid cells in
mammals provide a multi-scale periodic representation that functions as a
metric for location encoding and is critical for recognizing places and for
path integration. Therefore, we propose a representation learning model called
Space2Vec to encode the absolute positions and spatial relationships of places.
We conduct experiments on two real-world geographic datasets for two different
tasks: 1) predicting the types of POIs given their positions and context, and
2) image classification leveraging geo-locations. Results show that because of its
multi-scale representations, Space2Vec outperforms well-established ML
approaches such as RBF kernels, multi-layer feed-forward nets, and tile
embedding approaches for location modeling and image classification tasks.
Detailed analysis shows that each baseline can handle distributions well at
only one scale but performs poorly at the others. In contrast, Space2Vec's
multi-scale representation can handle distributions at different scales.
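The core idea described above can be illustrated with a minimal sketch of a multi-scale, grid-cell-style location encoder. This is not the authors' implementation: the function name, parameter names, and defaults are illustrative. It follows the commonly described recipe of projecting a 2D coordinate onto three unit vectors 120 degrees apart, at geometrically spaced scales, and applying sine and cosine to obtain a periodic, multi-scale embedding.

```python
import numpy as np

def grid_cell_encode(coords, num_scales=16, min_radius=1.0, max_radius=10000.0):
    """Multi-scale sinusoidal location encoder, in the spirit of a grid-cell
    encoder. All names and defaults here are illustrative assumptions.

    coords: (N, 2) array of (x, y) positions.
    Returns an (N, num_scales * 6) embedding: at each scale, the point is
    projected onto three unit vectors 120 degrees apart and passed through
    sin and cos, mimicking the periodic firing pattern of grid cells.
    """
    coords = np.asarray(coords, dtype=np.float64)
    # Three direction vectors at 0, 120, and 240 degrees.
    angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (3, 2)
    # Geometrically spaced wavelengths from min_radius to max_radius.
    scales = min_radius * (max_radius / min_radius) ** (
        np.arange(num_scales) / max(num_scales - 1, 1))
    feats = []
    for s in scales:
        proj = coords @ dirs.T / s  # (N, 3) projections at this scale
        feats.append(np.sin(proj))
        feats.append(np.cos(proj))
    return np.concatenate(feats, axis=1)  # (N, num_scales * 6)
```

In practice such an encoding is fed into a small feed-forward net; the multi-scale wavelengths are what let the downstream model pick up spatial patterns at very different granularities.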
Related papers
- TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning [36.725822223732635]
We propose TorchSpatial, a learning framework and benchmark for location (point) encoding.
TorchSpatial contains three key components: 1) a unified location encoding framework that consolidates 15 commonly recognized location encoders; 2) the LocBench benchmark tasks encompassing 7 geo-aware image classification and 4 geo-aware image regression datasets; and 3) a comprehensive suite of evaluation metrics to quantify geo-aware models' overall performance as well as their geographic bias, with a novel Geo-Bias Score metric.
arXiv Detail & Related papers (2024-06-21T21:33:16Z) - Instance-free Text to Point Cloud Localization with Relative Position Awareness [37.22900045434484]
Text-to-point-cloud cross-modal localization is an emerging vision-language task critical for future robot-human collaboration.
We address two key limitations of existing approaches: 1) their reliance on ground-truth instances as input; and 2) their neglect of the relative positions among potential instances.
Our proposed model follows a two-stage pipeline, including a coarse stage for text-cell retrieval and a fine stage for position estimation.
arXiv Detail & Related papers (2024-04-27T09:46:49Z) - Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with improvement on various architectures, and it achieves state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z) - AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while remaining as fast as mainstream methods.
arXiv Detail & Related papers (2022-02-18T10:14:45Z) - Learning to Aggregate Multi-Scale Context for Instance Segmentation in
Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely a dense feature pyramid network (DenseFPN), a spatial context pyramid (SCP), and a hierarchical region-of-interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z) - Positional Encoder Graph Neural Networks for Geographic Data [1.840220263320992]
Graph neural networks (GNNs) provide a powerful and scalable solution for modeling continuous spatial data.
In this paper, we propose PE-GNN, a new framework that incorporates spatial context and correlation explicitly into the models.
arXiv Detail & Related papers (2021-11-19T10:41:49Z) - HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z) - Spatial-spectral Hyperspectral Image Classification via Multiple Random
Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z) - Region Similarity Representation Learning [94.88055458257081]
Region Similarity Representation Learning (ReSim) is a new approach to self-supervised representation learning for localization-based tasks.
ReSim learns both regional representations for localization as well as semantic image-level representations.
We show how ReSim learns representations which significantly improve the localization and classification performance compared to a competitive MoCo-v2 baseline.
arXiv Detail & Related papers (2021-03-24T00:42:37Z) - Learning Large-scale Location Embedding From Human Mobility Trajectories
with Graphs [0.0]
This study learns vector representations for locations using the large-scale LBS data.
This model embeds context information in human mobility and spatial information.
GCN-L2V can be applied in a complementary manner to other place embedding methods and down-streaming Geo-aware applications.
arXiv Detail & Related papers (2021-02-23T09:11:33Z) - PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image
Segmentation [87.50205728818601]
We propose a PriorGuided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.