Population synthesis with geographic coordinates
- URL: http://arxiv.org/abs/2510.09669v1
- Date: Wed, 08 Oct 2025 13:36:13 GMT
- Title: Population synthesis with geographic coordinates
- Authors: Jacopo Lenti, Lorenzo Costantini, Ariadna Fosch, Anna Monticelli, David Scala, Marco Pangallo
- Abstract summary: It is increasingly important to generate synthetic populations with explicit coordinates rather than coarse geographic areas. We propose a population synthesis algorithm that maps spatial coordinates into a more regular latent space. We demonstrate the method by generating synthetic homes with the same statistical properties of real homes in 121 datasets.
- Score: 1.6419687521433917
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is increasingly important to generate synthetic populations with explicit coordinates rather than coarse geographic areas, yet no established methods exist to achieve this. One reason is that latitude and longitude differ from other continuous variables, exhibiting large empty spaces and highly uneven densities. To address this, we propose a population synthesis algorithm that first maps spatial coordinates into a more regular latent space using Normalizing Flows (NF), and then combines them with other features in a Variational Autoencoder (VAE) to generate synthetic populations. This approach also learns the joint distribution between spatial and non-spatial features, exploiting spatial autocorrelations. We demonstrate the method by generating synthetic homes with the same statistical properties of real homes in 121 datasets, corresponding to diverse geographies. We further propose an evaluation framework that measures both spatial accuracy and practical utility, while ensuring privacy preservation. Our results show that the NF+VAE architecture outperforms popular benchmarks, including copula-based methods and uniform allocation within geographic areas. The ability to generate geolocated synthetic populations at fine spatial resolution opens the door to applications requiring detailed geography, from household responses to floods, to epidemic spread, evacuation planning, and transport modeling.
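The abstract names uniform allocation within geographic areas as one of the benchmarks. As a rough illustration of what such a baseline might look like, the sketch below places synthetic homes uniformly at random inside a polygon by rejection sampling. It is not the authors' implementation: the function names, the L-shaped `area`, and the choice of rejection sampling are assumptions made for this example, and sampling uniformly in longitude/latitude degrees only approximates uniform density on the ground for small areas.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order (hypothetical helper).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_uniform_in_area(polygon, n_points, rng):
    """Rejection-sample n_points uniformly inside a polygon.

    Candidates are drawn from the bounding box and kept only if they
    fall inside the polygon.
    """
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    points = []
    while len(points) < n_points:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Hypothetical L-shaped area, vertices given as (longitude, latitude).
area = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0), (1.0, 2.0), (0.0, 2.0)]
rng = random.Random(0)  # fixed seed so the example is reproducible
homes = sample_uniform_in_area(area, 100, rng)
```

A baseline of this kind ignores how homes cluster inside an area, which is the limitation the abstract's coordinate-level approach is meant to address.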
Related papers
- GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics [91.17301794848025]
This paper presents GeoAgent, a model capable of reasoning closely with humans and deriving fine-grained address conclusions. Previous RL-based methods have achieved breakthroughs in performance and interpretability, but concerns remain because of their reliance on AI-generated chain-of-thought (CoT) data and training strategies.
arXiv Detail & Related papers (2026-02-13T04:48:05Z) - Geographically Weighted Canonical Correlation Analysis: Local Spatial Associations Between Two Sets of Variables [47.652697094546994]
This article critically assesses the utility of the classical statistical technique of Canonical Correlation Analysis (CCA) for studying spatial associations. We propose Geographically Weighted Canonical Correlation Analysis (GWCCA) as a new technique for exploring local spatial associations between two sets of variables. The results indicate that GWCCA has broad potential applications in spatial data-intensive fields such as urban planning, environmental science, public health, and transportation.
arXiv Detail & Related papers (2026-02-10T19:36:49Z) - A multi-view contrastive learning framework for spatial embeddings in risk modelling [0.688204255655161]
Spatial data are often unstructured, high-dimensional, and difficult to integrate into predictive models. We propose a novel multi-view contrastive learning framework for generating spatial embeddings. In a case study on French real estate prices, we compare models trained on raw coordinates against those using our spatial embeddings as inputs.
arXiv Detail & Related papers (2025-11-22T07:39:34Z) - Neighbor-aware informal settlement mapping with graph convolutional networks [1.226598527858578]
We propose a graph-based framework that incorporates local geographical context into the classification process. Experiments are conducted on a case study in Rio de Janeiro using spatial cross-validation. Our method outperforms standard baselines, improving Kappa coefficient by 17 points over individual cell classification.
arXiv Detail & Related papers (2025-09-30T12:25:25Z) - Spatial Knowledge Graph-Guided Multimodal Synthesis [78.11669780958657]
We introduce a novel multimodal synthesis approach guided by spatial knowledge graphs, grounded in the concept of knowledge-to-data generation. In experiments, data synthesized from diverse types of spatial knowledge, including direction and distance, enhance the spatial perception and reasoning abilities of MLLMs markedly. We hope that the idea of knowledge-based data synthesis can advance the development of spatial intelligence.
arXiv Detail & Related papers (2025-05-28T17:50:21Z) - Can LLMs Learn to Map the World from Local Descriptions? [50.490593949836146]
This study investigates whether Large Language Models (LLMs) can construct coherent global spatial cognition. Experiments conducted in a simulated urban environment demonstrate that LLMs exhibit latent representations aligned with real-world spatial distributions.
arXiv Detail & Related papers (2025-05-27T08:22:58Z) - An Interpretable Implicit-Based Approach for Modeling Local Spatial Effects: A Case Study of Global Gross Primary Productivity [9.352810748734157]
In Earth sciences, unobserved factors exhibit non-stationary distributions, causing the relationships between features and targets to display spatial heterogeneity. In geographic machine learning tasks, conventional statistical learning methods often struggle to capture spatial heterogeneity. We propose a novel perspective - that is, simultaneously modeling common features across different locations alongside spatial differences using deep neural networks.
arXiv Detail & Related papers (2025-02-10T05:44:54Z) - RegionGCN: Spatial-Heterogeneity-Aware Graph Convolutional Networks [8.132751508556078]
We propose to model spatial process heterogeneity at the regional level rather than at the individual level. Our proposed spatial-heterogeneity-aware graph convolutional network, named RegionGCN, is applied to the spatial prediction of county-level vote share in the 2016 US presidential election.
arXiv Detail & Related papers (2025-01-29T12:09:01Z) - Space-aware Socioeconomic Indicator Inference with Heterogeneous Graphs [46.52719756897067]
We present GeoHG, the first space-aware socioeconomic indicator inference method that utilizes a heterogeneous graph-based structure to represent geospace for non-continuous inference.
arXiv Detail & Related papers (2024-05-23T03:19:02Z) - Rethinking Sensors Modeling: Hierarchical Information Enhanced Traffic Forecasting [47.1051445072085]
We argue to rethink the sensor's dependency modeling from two hierarchies: regional and global.
We generate representative and common-temporal patterns as global nodes to reflect a global dependency between sensors.
In pursuit of the generality and reality of node representations, we incorporate a Meta GCN to propagate and global nodes in the physical data space.
arXiv Detail & Related papers (2023-09-20T13:08:34Z) - Multi-Temporal Relationship Inference in Urban Areas [75.86026742632528]
Finding temporal relationships among locations can benefit a bunch of urban applications, such as dynamic offline advertising and smart public transport planning.
We propose a solution to Trial with a graph learning scheme, which includes a spatially evolving graph neural network (SEENet).
SEConv performs the intra-time aggregation and inter-time propagation to capture the multifaceted spatially evolving contexts from the view of location message passing.
SE-SSL designs time-aware self-supervised learning tasks in a global-local manner with an additional evolving constraint to enhance the location representation learning and further handle the relationship sparsity.
arXiv Detail & Related papers (2023-06-15T07:48:32Z) - SARN: Structurally-Aware Recurrent Network for Spatio-Temporal Disaggregation [8.636014676778682]
Open data is frequently released spatially aggregated, usually to comply with privacy policies. But coarse, heterogeneous aggregations complicate coherent learning and integration for downstream AI/ML systems.
We propose an overarching model named Structurally-Aware Recurrent Network (SARN), which integrates structurally-aware spatial attention layers into the Gated Recurrent Unit (GRU) model.
For scenarios with limited historical training data, we show that a model pre-trained on one city variable can be fine-tuned for another city variable using only a few hundred samples.
arXiv Detail & Related papers (2023-06-09T21:01:29Z) - Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings [44.4879068879732]
This paper presents a complete pipeline for resolving ambiguities during the data association.
Its core is a robust self-tuning data association that adapts the search area depending on the entropy of the measurements.
We evaluate our method on real data from urban and rural scenarios around the city of Karlsruhe in Germany.
arXiv Detail & Related papers (2022-07-28T12:29:39Z) - Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics.
We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form.
After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)