Related papers: Enhancing Poverty Targeting with Spatial Machine Learning: An application to Indonesia

URL: http://arxiv.org/abs/2503.04300v1
Date: Thu, 06 Mar 2025 10:40:34 GMT
Title: Enhancing Poverty Targeting with Spatial Machine Learning: An application to Indonesia
Authors: Rolando Gonzales Martinez, Mariza Cooray
Abstract summary: This study uses spatial machine learning to enhance the accuracy of Proxy Means Testing (PMT) for poverty targeting in Indonesia. Using household survey data from the Social Welfare Integrated Data Survey (DTKS) for the periods 2016 to 2020 and 2016 to 2021, this study examines spatial patterns in income distribution and delineates poverty clusters at both provincial and district levels.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This study leverages spatial machine learning (SML) to enhance the accuracy of Proxy Means Testing (PMT) for poverty targeting in Indonesia. Conventional PMT methodologies are prone to exclusion and inclusion errors due to their inability to account for spatial dependencies and regional heterogeneity. By integrating spatial contiguity matrices, SML models mitigate these limitations, facilitating a more precise identification and comparison of geographical poverty clusters. Utilizing household survey data from the Social Welfare Integrated Data Survey (DTKS) for the periods 2016 to 2020 and 2016 to 2021, this study examines spatial patterns in income distribution and delineates poverty clusters at both provincial and district levels. Empirical findings indicate that the proposed SML approach reduces exclusion errors from 28% to 20% compared to standard machine learning models, underscoring the critical role of spatial analysis in refining machine learning-based poverty targeting. These results highlight the potential of SML to inform the design of more equitable and effective social protection policies, particularly in geographically diverse contexts. Future research can explore the applicability of spatiotemporal models and assess the generalizability of SML approaches across varying socio-economic settings.

Related papers

Integrating Score-Based Diffusion Models with Machine Learning-Enhanced Localization for Advanced Data Assimilation in Geological Carbon Storage [35.18016233072556]
This paper explores how machine learning methods can enhance data assimilation for geological carbon storage projects. We employ a machine learning-enhanced localization framework that uses large ensembles with permeabilities generated by the diffusion model. Our approach is applied on a CO$_2$ injection scenario using the Delft Advanced Research Terra Simulator.
arXiv Detail & Related papers (2025-11-07T14:28:55Z)
Can Large Language Models Integrate Spatial Data? Empirical Insights into Reasoning Strengths and Computational Weaknesses [11.330846631937671]
We explore the application of large language models (LLMs) to empower domain experts in integrating large, heterogeneous, and noisy urban spatial datasets. We show that while LLMs exhibit spatial reasoning capabilities, they struggle to connect the macro-scale environment with the relevant computational geometry tasks. We then adapt a review-and-refine method, which proves remarkably effective in correcting erroneous initial responses while preserving accurate responses.
arXiv Detail & Related papers (2025-08-07T03:44:20Z)
Modelling higher education dropouts using sparse and interpretable post-clustering logistic regression [0.8437187555622164]
Higher education dropout constitutes a critical challenge for tertiary education systems worldwide. The model introduced in this paper is a specialized form of logistic regression, specifically adapted to the context of university dropout analysis.
arXiv Detail & Related papers (2025-05-12T14:05:23Z)
Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research [0.0]
Large Language Models (LLMs) with vision capabilities analyze satellite imagery for village-level poverty prediction. ChatGPT can rank satellite images based on poverty levels with accuracy comparable to domain experts.
arXiv Detail & Related papers (2025-01-24T14:49:00Z)
Stability and Generalization for Distributed SGDA [70.97400503482353]
We propose the stability-based generalization analytical framework for Distributed-SGDA. We conduct a comprehensive analysis of stability error, generalization gap, and population risk across different metrics. Our theoretical results reveal the trade-off between the generalization gap and optimization error.
arXiv Detail & Related papers (2024-11-14T11:16:32Z)
Analyzing Poverty through Intra-Annual Time-Series: A Wavelet Transform Approach [2.3213238782019316]
Using Landsat imagery and nighttime light data, we evaluate EO-ML methods that use intra-annual EO data. Our results indicate that integrating specific NDVI-derived features with multi-spectral data provides valuable insights for poverty analysis.
arXiv Detail & Related papers (2024-11-05T06:59:05Z)
Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities. However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender. This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) proposing an Anti-Stereotype Debiasing strategy (ASD).
arXiv Detail & Related papers (2024-08-13T02:08:32Z)
GeoSEE: Regional Socio-Economic Estimation With a Large Language Model [17.31652821477571]
We present GeoSEE, a method that can estimate various socio-economic indicators using a unified pipeline powered by a large language model (LLM). The system then computes target indicators via in-context learning after aggregating results from selected modules in the format of natural language-based texts. Our method outperforms other predictive models in both unsupervised and low-shot contexts.
arXiv Detail & Related papers (2024-06-14T07:50:22Z)
Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models [58.58594658683919]
Large multimodal models (LMMs) have shown transformative potential across various research tasks. Our findings indicate LMMs possess advantages in zero-shot learning, interpretability, and handling uncurated 'in-the-wild' inputs. We propose a Chain-of-Thought augmented prompting approach, which effectively mitigates the off-target prediction issue.
arXiv Detail & Related papers (2024-05-24T16:26:56Z)
Transfer Learning for Spatial Autoregressive Models with Application to U.S. Presidential Election Prediction [10.825562180226424]
We propose a novel transfer learning framework within the SAR model, called tranSAR. Our framework enhances estimation and prediction by leveraging information from similar source data. We demonstrate our method's effectiveness in predicting outcomes in U.S. presidential swing states, where it outperforms traditional methods.
arXiv Detail & Related papers (2024-05-20T03:14:15Z)
Leveraging Prompts in LLMs to Overcome Imbalances in Complex Educational Text Data [1.8280573037181356]
We explore the potential of Large Language Models (LLMs) with assertions to mitigate imbalances in educational datasets. This issue is especially prominent in the education sector, where cognitive engagement levels among students show significant variation in their open responses.
arXiv Detail & Related papers (2024-04-28T00:24:08Z)
Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs). We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound [97.93945601881407]
We propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA). We show the generalization error of semi-supervised learning can be effectively bounded by minimizing the training error on labeled data. Building upon our new framework and the theoretical bound, we develop a simple and effective deep semi-supervised learning method called Augmented Distribution Alignment Network (ADA-Net).
arXiv Detail & Related papers (2022-03-13T11:59:52Z)
Spatial machine-learning model diagnostics: a model-agnostic distance-based approach [91.62936410696409]
This contribution proposes spatial prediction error profiles (SPEPs) and spatial variable importance profiles (SVIPs) as novel model-agnostic assessment and interpretation tools. The SPEPs and SVIPs of geostatistical methods, linear models, random forest, and hybrid algorithms show striking differences and also relevant similarities. The novel diagnostic tools enrich the toolkit of spatial data science, and may improve ML model interpretation, selection, and design.
arXiv Detail & Related papers (2021-11-13T01:50:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.