PopSim: An Individual-level Population Simulator for Equitable
Allocation of City Resources
- URL: http://arxiv.org/abs/2305.02204v1
- Date: Tue, 25 Apr 2023 23:43:21 GMT
- Title: PopSim: An Individual-level Population Simulator for Equitable
Allocation of City Resources
- Authors: Khanh Duy Nguyen, Nima Shahbazi and Abolfazl Asudeh
- Abstract summary: We introduce PopSim, a system for generating semi-synthetic individual-level population data with demographic information.
We use PopSim to generate multiple benchmark datasets for the city of Chicago and conduct extensive statistical evaluations.
- Score: 12.152728063703005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Historical systematic exclusionary tactics based on race have forced people
of certain demographic groups to congregate in specific urban areas. Aside from
the ethical aspects of such segregation, these policies have implications for
the allocation of urban resources including public transportation, healthcare,
and education within the cities. The initial step towards addressing these
issues involves conducting an audit to assess the status of equitable resource
allocation. However, due to privacy and confidentiality concerns,
individual-level data containing demographic information cannot be made
publicly available. By leveraging publicly available aggregated demographic
statistics data, we introduce PopSim, a system for generating semi-synthetic
individual-level population data with demographic information. We use PopSim to
generate multiple benchmark datasets for the city of Chicago and conduct
extensive statistical evaluations to validate them. We further use our
datasets for several case studies that showcase the application of our system
for auditing equitable allocation of city resources.
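The general idea of turning aggregated demographic statistics into individual-level records can be illustrated with a minimal sketch. It is not PopSim's actual method, which the abstract does not detail. The sketch assumes hypothetical per-region group counts (`region_counts`) and rectangular region bounds (`region_bounds`), and places each synthetic individual uniformly at random inside its region, which is a simplifying assumption.

```python
import random


def generate_individuals(region_counts, region_bounds, seed=0):
    """Expand aggregated per-region group counts into individual records.

    region_counts: {region: {group: count}}  (hypothetical input format)
    region_bounds: {region: (min_x, min_y, max_x, max_y)}  (hypothetical)
    Locations are drawn uniformly inside the bounding box (an assumption).
    """
    rng = random.Random(seed)  # fixed seed for reproducible output
    people = []
    for region, counts in region_counts.items():
        min_x, min_y, max_x, max_y = region_bounds[region]
        for group, n in counts.items():
            for _ in range(n):
                people.append({
                    "region": region,
                    "group": group,
                    "x": rng.uniform(min_x, max_x),
                    "y": rng.uniform(min_y, max_y),
                })
    return people


# Toy aggregated statistics (illustrative values, not real census data)
region_counts = {
    "tract_1": {"A": 3, "B": 2},
    "tract_2": {"A": 1, "B": 4},
}
region_bounds = {
    "tract_1": (0.0, 0.0, 1.0, 1.0),
    "tract_2": (1.0, 0.0, 2.0, 1.0),
}

people = generate_individuals(region_counts, region_bounds)
print(len(people))  # prints 10
```

By construction, the generated records reproduce the aggregated counts exactly for every region and group, so a validation step like the one described in the abstract could compare re-aggregated synthetic data against the published statistics.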
Related papers
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- Synthpop++: A Hybrid Framework for Generating A Country-scale Synthetic Population [0.680303951699936]
Population censuses are costly, time-consuming, and may also raise privacy concerns.
We introduce SynthPop++, which can combine data from multiple real-world surveys to produce a real-scale synthetic population.
Our experimental results show that synthetic population can realistically simulate the population for various administrative units of India.
arXiv Detail & Related papers (2023-04-24T17:27:56Z)
- A deep learning framework to generate realistic population and mobility data [5.180648702293017]
Census and Household Travel Survey datasets are regularly collected from households and individuals.
These datasets often represent a limited sample of the population due to privacy concerns or are released only in aggregated form.
We propose a framework to generate a synthetic population that includes both socioeconomic features (e.g., age, sex, industry) and trip chains (i.e., activity locations).
arXiv Detail & Related papers (2022-11-14T14:05:09Z)
- Releasing survey microdata with exact cluster locations and additional privacy safeguards [77.34726150561087]
We propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards.
Our strategy reduces the respondents' re-identification risk for any number of disclosed attributes by 60-80% even under re-identification attempts.
arXiv Detail & Related papers (2022-05-24T19:37:11Z)
- So2Sat POP -- A Curated Benchmark Data Set for Population Estimation from Space on a Continental Scale [11.38584315242023]
We provide a comprehensive data set for population estimation in 98 European cities.
The data set comprises a digital elevation model, local climate zone, land use proportions, nighttime lights in combination with multi-spectral Sentinel-2 imagery, and data from the Open Street Map initiative.
arXiv Detail & Related papers (2022-04-07T07:30:43Z)
- Census-Independent Population Estimation using Representation Learning [0.5735035463793007]
Census-independent population estimation approaches using alternative data sources have shown promise in providing frequent and reliable population estimates locally.
We explore recent representation learning approaches, and assess the transferability of representations to population estimation in Mozambique.
Using representation learning reduces required human supervision, since features are extracted automatically.
We compare the resulting population estimates to existing population products from GRID3, Facebook (HRSL) and WorldPop.
arXiv Detail & Related papers (2021-10-06T15:13:36Z)
- Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics.
We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form.
After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)
- Leveraging Administrative Data for Bias Audits: Assessing Disparate Coverage with Mobility Data for COVID-19 Policy [61.60099467888073]
We show how linking administrative data can enable auditing mobility data for bias.
We show that older and non-white voters are less likely to be captured by mobility data.
We show that allocating public health resources based on such mobility data could disproportionately harm high-risk elderly and minority groups.
arXiv Detail & Related papers (2020-11-14T02:04:14Z)
- Magnify Your Population: Statistical Downscaling to Augment the Spatial Resolution of Socioeconomic Census Data [48.7576911714538]
We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes.
For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions.
As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level, to a grid of 300 spatial resolution.
arXiv Detail & Related papers (2020-06-23T16:52:18Z)
- Measuring Spatial Subdivisions in Urban Mobility with Mobile Phone Data [58.720142291102135]
By 2050 two thirds of the world population will reside in urban areas.
This growth is faster and more complex than the ability of cities to measure and plan for their sustainability.
To understand what makes a city inclusive for all, we define a methodology to identify and characterize spatial subdivisions.
arXiv Detail & Related papers (2020-02-20T14:37:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.