Learning Neighborhood Representation from Multi-Modal Multi-Graph:
Image, Text, Mobility Graph and Beyond
- URL: http://arxiv.org/abs/2105.02489v1
- Date: Thu, 6 May 2021 07:44:05 GMT
- Title: Learning Neighborhood Representation from Multi-Modal Multi-Graph:
Image, Text, Mobility Graph and Beyond
- Authors: Tianyuan Huang, Zhecheng Wang, Hao Sheng, Andrew Y. Ng, Ram Rajagopal
- Abstract summary: We propose a novel approach to integrate multi-modal geotagged inputs as either node or edge features of a multi-graph.
Specifically, we use street view images and POI features to characterize neighborhoods (nodes) and human mobility to characterize the relationships between neighborhoods (directed edges).
The embedding we trained outperforms those trained using only unimodal data as regional inputs.
- Score: 20.014906526266795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent urbanization has coincided with the enrichment of geotagged data, such
as street view and point-of-interest (POI). Region embedding enhanced by the
richer data modalities has enabled researchers and city administrators to
understand the built environment, socioeconomics, and the dynamics of cities
better. While some efforts have been made to simultaneously use multi-modal
inputs, existing methods can be improved by incorporating different measures of
'proximity' in the same embedding space - leveraging not only the data that
characterizes the regions (e.g., street view, local business patterns) but
also the data that depicts the relationships between regions (e.g., trips, road
network). To this end, we propose a novel approach to integrate multi-modal
geotagged inputs as either node or edge features of a multi-graph based on
their relations with the neighborhood region (e.g., tiles, census block, ZIP
code region, etc.). We then learn the neighborhood representation based on a
contrastive-sampling scheme from the multi-graph. Specifically, we use street
view images and POI features to characterize neighborhoods (nodes) and use
human mobility to characterize the relationship between neighborhoods (directed
edges). We show the effectiveness of the proposed methods with quantitative
downstream tasks as well as qualitative analysis of the embedding space: The
embedding we trained outperforms those trained using only unimodal data as
regional inputs.
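The contrastive-sampling idea described above can be illustrated with a toy sketch (all data, node counts, and dimensions here are hypothetical, not from the paper): neighborhoods connected by mobility flows serve as positive pairs, random non-neighbors as negatives, and a triplet margin loss pulls positives together in the embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 6 neighborhoods, each with a fused feature vector standing in
# for street-view + POI features (dimensions are arbitrary assumptions).
n_nodes, dim = 6, 8
embeddings = rng.normal(size=(n_nodes, dim))

# Directed mobility edges (src, dst, trip_count) -- hypothetical data.
mobility = [(0, 1, 50), (1, 0, 40), (2, 3, 30), (3, 4, 20), (4, 5, 10)]

def sample_triplet(mobility, n_nodes, rng):
    """Sample (anchor, positive, negative): the positive is a mobility
    neighbor drawn proportionally to trip volume; the negative is a
    random node with no outgoing edge from the anchor."""
    weights = np.array([w for _, _, w in mobility], dtype=float)
    idx = rng.choice(len(mobility), p=weights / weights.sum())
    src, dst, _ = mobility[idx]
    neighbors = {d for s, d, _ in mobility if s == src}
    candidates = [n for n in range(n_nodes) if n != src and n not in neighbors]
    return src, dst, int(rng.choice(candidates))

def triplet_loss(emb, a, p, n, margin=1.0):
    """Hinge loss: pull the anchor toward its mobility neighbor and push
    it away from the random negative by at least `margin`."""
    d_pos = np.linalg.norm(emb[a] - emb[p])
    d_neg = np.linalg.norm(emb[a] - emb[n])
    return max(0.0, d_pos - d_neg + margin)

a, p, n = sample_triplet(mobility, n_nodes, rng)
loss = triplet_loss(embeddings, a, p, n)
print(f"triplet ({a}, {p}, {n}) -> loss {loss:.3f}")
```

In a full pipeline the loss would be minimized by gradient descent over the embedding table (or over encoders of the image/POI features); the sketch only shows the sampling scheme that turns directed mobility edges into supervision.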
Related papers
- Urban Region Pre-training and Prompting: A Graph-based Approach [10.375941950028938]
We propose a Graph-based Urban Region Pre-training and Prompting framework for region representation learning.
arXiv Detail & Related papers (2024-08-12T05:00:23Z)
- Enhanced Urban Region Profiling with Adversarial Contrastive Learning [7.62909500335772]
EUPAC is a novel framework that enhances the robustness of urban region embeddings.
Our model generates region embeddings that preserve intra-region and inter-region dependencies.
Experiments on real-world datasets demonstrate the superiority of our model over state-of-the-art methods.
arXiv Detail & Related papers (2024-02-02T06:06:45Z)
- Attentive Graph Enhanced Region Representation Learning [7.4106801792345705]
Representing urban regions accurately and comprehensively is essential for various urban planning and analysis tasks.
We propose the Attentive Graph Enhanced Region Representation Learning (ATGRL) model, which aims to capture comprehensive dependencies from multiple graphs and learn rich semantic representations of urban regions.
arXiv Detail & Related papers (2023-07-06T16:38:43Z)
- Multi-Temporal Relationship Inference in Urban Areas [75.86026742632528]
Finding temporal relationships among locations can benefit many urban applications, such as dynamic offline advertising and smart public transport planning.
We propose a solution to Trial with a graph learning scheme, which includes a spatially evolving graph neural network (SEENet).
SEConv performs intra-time aggregation and inter-time propagation to capture the multifaceted spatially evolving contexts from the view of location message passing.
SE-SSL designs time-aware self-supervised learning tasks in a global-local manner with an additional evolving constraint to enhance location representation learning and further handle relationship sparsity.
arXiv Detail & Related papers (2023-06-15T07:48:32Z)
- R-MAE: Regions Meet Masked Autoencoders [113.73147144125385]
We explore regions as a potential visual analogue of words for self-supervised image representation learning.
Inspired by Masked Autoencoding (MAE), a generative pre-training baseline, we propose masked region autoencoding to learn from groups of pixels or regions.
arXiv Detail & Related papers (2023-06-08T17:56:46Z)
- Urban Region Profiling via A Multi-Graph Representation Learning Framework [0.0]
We propose a multi-graph representation learning framework, called Region2Vec, for urban region profiling.
Experiments on real-world datasets show that Region2Vec can be employed in three applications and outperforms all state-of-the-art baselines.
arXiv Detail & Related papers (2022-02-04T11:05:37Z)
- Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics.
We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form.
After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)
- Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization [54.00111565818903]
Cross-view geo-localization aims to spot images of the same geographic target taken from different platforms.
Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center.
We introduce a simple and effective deep neural network, called Local Pattern Network (LPN), to take advantage of contextual information.
arXiv Detail & Related papers (2020-08-26T16:06:11Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition [86.25926461936412]
We propose a novel Adversarial Graph Representation Adaptation (AGRA) framework that unifies graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair experiments on several popular benchmarks and show that the proposed AGRA framework achieves superior performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T13:27:24Z)
- Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding [8.396746290518102]
Urban2Vec is an unsupervised multi-modal framework which incorporates both street view imagery and point-of-interest data.
We show that Urban2Vec can achieve performances better than baseline models and comparable to fully-supervised methods in downstream prediction tasks.
arXiv Detail & Related papers (2020-01-29T21:30:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.