Urban Incident Prediction with Graph Neural Networks: Integrating Government Ratings and Crowdsourced Reports
- URL: http://arxiv.org/abs/2506.08740v1
- Date: Tue, 10 Jun 2025 12:37:17 GMT
- Title: Urban Incident Prediction with Graph Neural Networks: Integrating Government Ratings and Crowdsourced Reports
- Authors: Sidhika Balachandar, Shuvom Sadhuka, Bonnie Berger, Emma Pierson, Nikhil Garg
- Abstract summary: We propose a multiview, multioutput GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state of incidents. We show on both real and semi-synthetic data that our model can better predict the latent state compared to models that use only rating data.
- Score: 6.4910613074559445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) are widely used in urban spatiotemporal forecasting, such as predicting infrastructure problems. In this setting, government officials wish to know in which neighborhoods incidents like potholes or rodent issues occur. The true state of incidents (e.g., street conditions) for each neighborhood is observed via government inspection ratings. However, these ratings are only conducted for a sparse set of neighborhoods and incident types. We also observe the state of incidents via crowdsourced reports, which are more densely observed but may be biased due to heterogeneous reporting behavior. First, for such settings, we propose a multiview, multioutput GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state of incidents. Second, we investigate a case study of New York City urban incidents and collect, standardize, and make publicly available a dataset of 9,615,863 crowdsourced reports and 1,041,415 government inspection ratings over 3 years and across 139 types of incidents. Finally, we show on both real and semi-synthetic data that our model can better predict the latent state compared to models that use only reporting data or models that use only rating data, especially when rating data is sparse and reports are predictive of ratings. We also quantify demographic biases in crowdsourced reporting, e.g., higher-income neighborhoods report problems at higher rates. Our analysis showcases a widely applicable approach for latent state prediction using heterogeneous, sparse, and biased data.
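The multiview, multioutput design described in the abstract can be illustrated with a minimal, dependency-free sketch: neighborhoods share a latent state computed by message passing over the graph, a rating head models the unbiased inspection view of that state, and a report head adds a per-neighborhood reporting-bias offset. All weights, graph data, and function names below are illustrative placeholders, not the paper's implementation.

```python
def mean_aggregate(features, adj):
    """One message-passing step: each node averages its own and its
    neighbours' feature vectors (a toy stand-in for a GNN layer)."""
    out = {}
    for v, neighbours in adj.items():
        group = [features[v]] + [features[u] for u in neighbours]
        out[v] = [sum(col) / len(group) for col in zip(*group)]
    return out

def linear(x, w, b):
    """Dot product plus bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b

def predict(features, adj, w_latent, w_rate, w_report, report_bias):
    """Shared latent state per node, then two output heads:
    ratings are modelled as unbiased views of the latent state,
    reports as the latent state shifted by a reporting-bias offset."""
    h = mean_aggregate(features, adj)
    latent = {v: linear(h[v], w_latent, 0.0) for v in h}
    ratings = {v: w_rate * latent[v] for v in latent}
    reports = {v: w_report * latent[v] + report_bias[v] for v in latent}
    return latent, ratings, reports

# Toy graph of three neighbourhoods on a path: 0 - 1 - 2.
adj = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
# Positive offset = a neighbourhood that over-reports relative to its true state.
report_bias = {0: 0.5, 1: 0.0, 2: -0.5}
latent, ratings, reports = predict(features, adj, [1.0, 1.0], 1.0, 1.0, report_bias)
```

The point of the two heads is that sparse ratings anchor the latent state where inspections exist, while dense but biased reports inform it everywhere else once the bias offsets are accounted for.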
Related papers
- Graph-Based Prediction Models for Data Debiasing [6.221408085892461]
Bias in data collection, arising from both under-reporting and over-reporting, poses significant challenges in healthcare and public safety. We introduce Graph-based Over- and Under-reporting Debiasing (GROUD), a novel graph-based optimization framework that debiases reported data by jointly estimating the true incident counts and the associated reporting bias probabilities. We validate GROUD on both challenging simulated experiments and real-world datasets, including Atlanta emergency calls and COVID-19 vaccine adverse event reports.
arXiv Detail & Related papers (2025-04-12T21:34:49Z)
- VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
We introduce VLBiasBench, a benchmark to evaluate biases in Large Vision-Language Models (LVLMs). VLBiasBench features a dataset that covers nine distinct categories of social bias, including age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, as well as two intersectional bias categories: race x gender and race x socioeconomic status. We conduct extensive evaluations on 15 open-source models as well as two advanced closed-source models, yielding new insights into the biases present in these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
- Graph Out-of-Distribution Generalization via Causal Intervention [69.70137479660113]
We introduce a conceptually simple yet principled approach for training robust graph neural networks (GNNs) under node-level distribution shifts.
Our method resorts to a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-expert GNN predictor.
Our model can effectively enhance generalization under various types of distribution shift and yields up to 27.4% accuracy improvement over state-of-the-art methods on graph OOD generalization benchmarks.
arXiv Detail & Related papers (2024-02-18T07:49:22Z)
- BiasBuster: a Neural Approach for Accurate Estimation of Population Statistics using Biased Location Data [6.077198822448429]
We show that statistical debiasing, although in some cases useful, often fails to improve accuracy.
We then propose BiasBuster, a neural network approach that utilizes the correlations between population statistics and location characteristics to provide accurate estimates of population statistics.
arXiv Detail & Related papers (2024-02-17T16:16:24Z)
- The Impact of Differential Feature Under-reporting on Algorithmic Fairness [86.275300739926]
We present an analytically tractable model of differential feature under-reporting.
We then use this model to characterize the impact of this kind of data bias on algorithmic fairness.
Our results show that, in real world data settings, under-reporting typically leads to increasing disparities.
arXiv Detail & Related papers (2024-01-16T19:16:22Z)
- A Bayesian Spatial Model to Correct Under-Reporting in Urban Crowdsourcing [1.850972250657274]
Decision-makers often observe the occurrence of events through a reporting process.
We show how to overcome this challenge by leveraging the fact that events are spatially correlated.
arXiv Detail & Related papers (2023-12-18T23:40:56Z)
- Mitigating Relational Bias on Knowledge Graphs [51.346018842327865]
We propose Fair-KGNN, a framework that simultaneously alleviates multi-hop bias and preserves the proximity information of entity-to-relation in knowledge graphs.
We develop two instances of Fair-KGNN, incorporating two state-of-the-art KGNN models, RGCN and CompGCN, to mitigate gender-occupation and nationality-salary bias.
arXiv Detail & Related papers (2022-11-26T05:55:34Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Quantifying Spatial Under-reporting Disparities in Resident Crowdsourcing [5.701305404173138]
We develop a method to identify reporting delays without using external ground-truth data.
We apply our method to over 100,000 resident reports made in New York City and to over 900,000 reports made in Chicago.
arXiv Detail & Related papers (2022-04-19T02:54:16Z)
- Equality of opportunity in travel behavior prediction with deep neural networks and discrete choice models [3.4806267677524896]
This study introduces an important missing dimension - computational fairness - to travel behavior analysis.
We first operationalize computational fairness by equality of opportunity, then differentiate between the bias inherent in data and the bias introduced by modeling.
arXiv Detail & Related papers (2021-09-25T19:02:23Z)
- Towards Measuring Bias in Image Classification [61.802949761385]
Convolutional Neural Networks (CNN) have become state-of-the-art for the main computer vision tasks.
However, due to their complex structure, their decisions are hard to understand, which limits their use in some industrial contexts.
We present a systematic approach to uncover data bias by means of attribution maps.
arXiv Detail & Related papers (2021-07-01T10:50:39Z)
- Showing Your Work Doesn't Always Work [73.63200097493576]
"Show Your Work: Improved Reporting of Experimental Results" advocates for reporting the expected validation effectiveness of the best-tuned model.
We analytically show that their estimator is biased and uses error-prone assumptions.
We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation.
arXiv Detail & Related papers (2020-04-28T17:59:01Z)
- A Comparative Study on Crime in Denver City Based on Machine Learning and Data Mining [0.0]
I analyzed a real-world crime and accident dataset of Denver county, USA, from January 2014 to May 2019.
This project aims to predict and highlight trends of occurrence that will, in turn, help law enforcement agencies and the government discover preventive measures.
The outcomes are captured using two popular test methods: train-test split and k-fold cross-validation.
arXiv Detail & Related papers (2020-01-09T01:36:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.