Magnify Your Population: Statistical Downscaling to Augment the Spatial
Resolution of Socioeconomic Census Data
- URL: http://arxiv.org/abs/2006.13152v1
- Date: Tue, 23 Jun 2020 16:52:18 GMT
- Title: Magnify Your Population: Statistical Downscaling to Augment the Spatial
Resolution of Socioeconomic Census Data
- Authors: Giulia Carella, Andy Eschbacher, Dongjie Fan, Miguel Álvarez,
Álvaro Arredondo, Alejandro Polvillo Hall, Javier Pérez Trufero, and
Javier de la Torre
- Abstract summary: We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes.
For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions.
As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables, available at the block group level, to a grid of ~300 spatial resolution.
- Score: 48.7576911714538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine resolution estimates of demographic and socioeconomic attributes are
crucial for planning and policy development. While several efforts have been
made to produce fine-scale gridded population estimates, socioeconomic features
are typically not available at scales finer than Census units, which may hide
local heterogeneity and disparity. In this paper we present a new statistical
downscaling approach to derive fine-scale estimates of key socioeconomic
attributes. The method leverages demographic and geographical extensive
covariates available at multiple scales and additional Census covariates only
available at coarse resolution, which are included in the model hierarchically
within a "forward learning" approach. For each selected socioeconomic variable,
a Random Forest model is trained on the source Census units and then used to
generate fine-scale gridded predictions, which are then adjusted to ensure the
best possible consistency with the coarser Census data. As a case study, we
apply this method to Census data in the United States, downscaling the selected
socioeconomic variables available at the block group level, to a grid of ~300
spatial resolution. The accuracy of the method is assessed at both spatial
scales, first computing a pseudo cross-validation coefficient of determination
for the predictions at the block group level and then, for extensive variables
only, also for the (unadjusted) predicted counts summed by block group. Based
on these scores and on the inspection of the downscaled maps, we conclude that
our method is able to provide accurate, smoother, and more detailed
socioeconomic estimates than the available Census data.
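The abstract mentions two steps that a small example can make concrete: scoring the unadjusted predicted counts summed by block group against the Census counts, and adjusting the gridded predictions for consistency with the coarser data. The sketch below is only an illustration under stated assumptions. Proportional rescaling is assumed here as one simple way to enforce consistency for an extensive variable; the abstract does not say which adjustment the authors use. All function names and the toy numbers are hypothetical.

```python
from collections import defaultdict


def sum_by_group(cell_groups, cell_values):
    """Sum fine-grid cell values within each coarse unit (e.g. block group)."""
    totals = defaultdict(float)
    for group, value in zip(cell_groups, cell_values):
        totals[group] += value
    return dict(totals)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rescale_to_totals(cell_groups, cell_values, census_totals):
    """Assumed adjustment: scale cells so each group sums to its census total."""
    predicted_totals = sum_by_group(cell_groups, cell_values)
    adjusted = []
    for group, value in zip(cell_groups, cell_values):
        total = predicted_totals[group]
        # A unit whose predictions sum to zero cannot be scaled; its cells become zero.
        factor = census_totals[group] / total if total > 0 else 0.0
        adjusted.append(value * factor)
    return adjusted


# Hypothetical toy data: five grid cells in two block groups.
cell_groups = ["bg1", "bg1", "bg1", "bg2", "bg2"]
raw_predictions = [10.0, 30.0, 20.0, 5.0, 15.0]  # unadjusted model output
census_totals = {"bg1": 66.0, "bg2": 18.0}       # observed counts per block group

# Score the unadjusted predictions after summing them by block group.
raw_sums = sum_by_group(cell_groups, raw_predictions)
groups = sorted(census_totals)
score = r_squared([census_totals[g] for g in groups],
                  [raw_sums[g] for g in groups])

# Adjust the grid so that cell values add up to the census totals.
adjusted = rescale_to_totals(cell_groups, raw_predictions, census_totals)
```

In this sketch the relative pattern of cells inside each block group comes entirely from the model, while the block group totals come from the Census. An intensive variable such as a median or a rate cannot be summed this way and would need a different consistency rule.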
Related papers
- A Deep Generative Framework for Joint Households and Individuals Population Synthesis [0.562479170374811]
We propose a deep generative framework to generate a synthetic population with household-individual and individual-individual relationships.
Results for an application in Delaware, USA demonstrate the ability to ensure the realism of generated household-individual records.
arXiv Detail & Related papers (2024-06-30T23:01:58Z)
- A step towards the integration of machine learning and small area estimation [0.0]
We propose a predictor supported by machine learning algorithms which can be used to predict any population or subpopulation characteristics.
We study only small departures from the assumed model, to show that our proposal is a good alternative in this case as well.
What is more, we propose the method of the accuracy estimation of machine learning predictors, giving the possibility of the accuracy comparison with classic methods.
arXiv Detail & Related papers (2024-02-12T09:43:17Z)
- Fine-Grained Socioeconomic Prediction from Satellite Images with Distributional Adjustment [14.076490368696508]
We propose a method that assigns a socioeconomic score to each satellite image by capturing the distributional behavior observed in larger areas.
We train an ordinal regression scoring model and adjust the scores to follow the common power law within and across regions.
Our method also demonstrates robust performance in districts with uneven development, suggesting its potential use in developing countries.
arXiv Detail & Related papers (2023-08-30T12:06:04Z)
- Small Area Estimation with Random Forests and the LASSO [39.58317527488534]
This work is motivated by Ghanaian data available from the sixth Living Standard Survey (GLSS) and the 2010 Population and Housing Census.
We compare areal-level random forests and LASSO approaches to a frequentist forward variable selection approach and a Bayesian shrinkage method.
We find substantial between-area variation, the log consumption areal point estimates showing a 1.3-fold variation across the GAMA region.
arXiv Detail & Related papers (2023-08-29T10:02:10Z)
- Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the strategy enables the agents to learn consistently under this highly-heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Predicting Census Survey Response Rates With Parsimonious Additive Models and Structured Interactions [14.003044924094597]
We consider the problem of predicting survey response rates using a family of flexible and interpretable nonparametric models.
The study is motivated by the US Census Bureau's well-known ROAM application.
arXiv Detail & Related papers (2021-08-24T17:49:55Z)
- Uncertainty Estimation and Sample Selection for Crowd Counting [87.29137075538213]
We present a method for image-based crowd counting that can predict a crowd density map together with the uncertainty values pertaining to the predicted density map.
A key advantage of our method over existing crowd counting methods is its ability to quantify the uncertainty of its predictions.
We show that our sample selection strategy drastically reduces the amount of labeled data needed to adapt a counting network trained on a source domain to the target domain.
arXiv Detail & Related papers (2020-09-30T03:40:07Z)
- Differential Privacy of Hierarchical Census Data: An Optimization Approach [53.29035917495491]
Census Bureaus are interested in releasing aggregate socio-economic data about a large population without revealing sensitive information about any individual.
Recent events have identified some of the privacy challenges faced by these organizations.
This paper presents a novel differential-privacy mechanism for releasing hierarchical counts of individuals.
arXiv Detail & Related papers (2020-06-28T18:19:55Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)