Adapting to Skew: Imputing Spatiotemporal Urban Data with 3D Partial
Convolutions and Biased Masking
- URL: http://arxiv.org/abs/2301.04233v1
- Date: Tue, 10 Jan 2023 22:44:22 GMT
- Authors: Bin Han, Bill Howe
- Abstract summary: Missing regions in urban data can be caused by sensor or software failures, data quality issues, interference from weather events, incomplete data collection, or varying data use regulations.
We adapt computer vision techniques for image inpainting to operate on 3D histograms (2D space + 1D time) commonly used for data exchange in urban settings.
We show that the core model is effective qualitatively and quantitatively, and that biased masking during training reduces error in a variety of scenarios.
- Score: 13.94102520443797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We adapt image inpainting techniques to impute large, irregular missing
regions in urban settings characterized by sparsity, variance in both space and
time, and anomalous events. Missing regions in urban data can be caused by
sensor or software failures, data quality issues, interference from weather
events, incomplete data collection, or varying data use regulations; any
missing data can render the entire dataset unusable for downstream
applications. To ensure coverage and utility, we adapt computer vision
techniques for image inpainting to operate on 3D histograms (2D space + 1D
time) commonly used for data exchange in urban settings.
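The core operation borrowed from image inpainting is the partial convolution: convolve only over observed voxels and renormalize by the fraction of each window that is valid. A minimal NumPy sketch of that idea on a 3D (time, height, width) histogram is below; the function name, shapes, and uniform kernel are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def partial_conv3d(x, mask, kernel):
    """One 3D partial-convolution step over a (T, H, W) histogram.

    Only observed voxels (mask == 1) contribute; the result is rescaled
    by (window size / valid voxels) so sparse windows are not biased
    toward zero. Returns the output and the updated validity mask.
    """
    kt, kh, kw = kernel.shape
    out = np.zeros((x.shape[0] - kt + 1,
                    x.shape[1] - kh + 1,
                    x.shape[2] - kw + 1))
    new_mask = np.zeros_like(out)
    window_size = kernel.size
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                xw = x[t:t + kt, i:i + kh, j:j + kw]
                mw = mask[t:t + kt, i:i + kh, j:j + kw]
                valid = mw.sum()
                if valid > 0:
                    # renormalize by the share of the window that is observed
                    out[t, i, j] = (kernel * xw * mw).sum() * (window_size / valid)
                    new_mask[t, i, j] = 1.0  # this voxel is now "filled"
    return out, new_mask
```

Stacking such layers lets validity propagate inward from the boundary of a missing region, which is why the technique handles large, irregular holes.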
Adapting these techniques to the spatiotemporal setting requires handling
skew: urban data tend to follow population density patterns (small dense
regions surrounded by large sparse areas); these patterns can dominate the
learning process and fool the model into ignoring local or transient effects.
To combat skew, we 1) train simultaneously in space and time, and 2) focus
attention on dense regions by biasing the masks used for training to the skew
in the data. We evaluate the core model and these two extensions using the NYC
taxi data and the NYC bikeshare data, simulating different conditions for
missing data. We show that the core model is effective qualitatively and
quantitatively, and that biased masking during training reduces error in a
variety of scenarios. We also articulate a tradeoff in varying the number of
timesteps per training sample: too few timesteps and the model ignores
transient events; too many timesteps and the model is slow to train with
limited performance gain.
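The biased-masking extension can be sketched as sampling training-mask centers from the data's own spatial density, so that dense (e.g. downtown) cells are occluded more often than sparse outskirts. The sketch below is a hypothetical illustration of that sampling scheme, not the paper's implementation; `biased_masks`, the square mask shape, and the parameter names are assumptions.

```python
import numpy as np

def biased_masks(hist, n_masks, mask_size, rng=None):
    """Sample square occlusion masks whose centers follow the data density.

    hist: (H, W) non-negative activity histogram (e.g. total taxi pickups).
    Returns (n_masks, H, W) arrays where 0 marks occluded cells.
    """
    rng = rng or np.random.default_rng(0)
    H, W = hist.shape
    density = hist.ravel() / hist.sum()  # skew-aware sampling weights
    centers = rng.choice(H * W, size=n_masks, p=density)
    masks = np.ones((n_masks, H, W), dtype=np.float32)
    half = mask_size // 2
    for k, c in enumerate(centers):
        r, col = divmod(c, W)
        # occlude a square around the sampled center, clipped to the grid
        masks[k, max(r - half, 0):r + half + 1,
                 max(col - half, 0):col + half + 1] = 0.0
    return masks
```

Training on masks drawn this way forces the model to reconstruct the dense regions that dominate the loss, rather than scoring well by predicting zeros over the sparse majority of cells.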
Related papers
- Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models [5.816964541847194]
We propose TDDPM, a transformer-based diffusion model for time-series that outperforms the state-of-the-art and scales substantially better.
This is evaluated in a new comprehensive benchmark across several sequence lengths, standard datasets, and evaluation measures.
(arXiv: 2024-06-18)
- Dynamic 3D Gaussian Fields for Urban Areas [60.64840836584623]
We present an efficient neural 3D scene representation for novel-view synthesis (NVS) in large-scale, dynamic urban areas.
We propose 4DGF, a neural scene representation that scales to large-scale dynamic urban areas.
(arXiv: 2024-06-05)
- Diffusion-based Data Augmentation for Object Counting Problems [62.63346162144445]
We develop a pipeline that utilizes a diffusion model to generate extensive training data.
We are the first to generate images conditioned on a location dot map with a diffusion model.
Our proposed counting loss for the diffusion model effectively minimizes the discrepancies between the location dot map and the crowd images generated.
(arXiv: 2024-01-25)
- Spatial-temporal Forecasting for Regions without Observations [13.805203053973772]
We study spatial-temporal forecasting for a region of interest without any historical observations.
We propose a model named STSM for the task.
Our key insight is to learn from the locations that resemble those in the region of interest.
(arXiv: 2024-01-19)
- Spatiotemporal and Semantic Zero-inflated Urban Anomaly Prediction [8.340857178859768]
We propose STS to jointly capture the intra- and inter-dependencies between patterns and influential factors in three dimensions.
We use a multi-task prediction module with a customized loss function to solve the zero-inflated issue.
Experiments on two application scenarios with four real-world datasets demonstrate the superiority of STS.
(arXiv: 2023-04-04)
- Traffic Prediction with Transfer Learning: A Mutual Information-based Approach [11.444576186559487]
We propose TrafficTL, a cross-city traffic prediction approach that uses big data from other cities to aid data-scarce cities in traffic prediction.
TrafficTL is evaluated by comprehensive case studies on three real-world datasets and outperforms the state-of-the-art baseline by around 8 to 25 percent.
(arXiv: 2023-03-13)
- Classification of structural building damage grades from multi-temporal photogrammetric point clouds using a machine learning model trained on virtual laser scanning data [58.720142291102135]
We present a novel approach to automatically assess multi-class building damage from real-world point clouds.
We use a machine learning model trained on virtual laser scanning (VLS) data.
The model yields high multi-target classification accuracies (overall accuracy: 92.0% to 95.1%).
(arXiv: 2023-02-24)
- Averaging Spatio-temporal Signals using Optimal Transport and Soft Alignments [110.79706180350507]
We show that our proposed loss can be used to define spatio-temporal barycenters as Fréchet means.
Experiments on handwritten letters and brain imaging data confirm our theoretical findings.
(arXiv: 2022-03-11)
- Artificial Dummies for Urban Dataset Augmentation [0.0]
Existing datasets for training pedestrian detectors in images suffer from limited appearance and pose variation.
This paper describes an augmentation method for controlled synthesis of urban scenes containing people.
We demonstrate that the data generated by our DummyNet improve performance of several existing person detectors across various datasets.
(arXiv: 2020-12-15)
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
(arXiv: 2020-09-22)
- Hidden Footprints: Learning Contextual Walkability from 3D Human Trails [70.01257397390361]
Current datasets only tell you where people are, not where they could be.
We first augment the set of valid, labeled walkable regions by propagating person observations between images, utilizing 3D information to create what we call hidden footprints.
We devise a training strategy designed for such sparse labels, combining a class-balanced classification loss with a contextual adversarial loss.
(arXiv: 2020-08-19)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.