Community search signatures as foundation features for human-centered geospatial modeling
- URL: http://arxiv.org/abs/2410.22721v1
- Date: Wed, 30 Oct 2024 06:09:22 GMT
- Title: Community search signatures as foundation features for human-centered geospatial modeling
- Authors: Mimi Sun, Chaitanya Kamath, Mohit Agarwal, Arbaaz Muslim, Hector Yee, David Schottlander, Shailesh Bavadekar, Niv Efron, Shravya Shetty, Gautam Prasad
- Abstract summary: We propose a novel approach for generating an aggregated and anonymized representation of search interest.
We benchmark these features using spatial datasets across multiple domains.
Our results demonstrate that these search features can be used for spatial predictions without strict temporal alignment.
- Score: 1.198203442779543
- Abstract: Aggregated relative search frequencies offer a unique composite signal reflecting people's habits, concerns, interests, intents, and general information needs, which are not found in other readily available datasets. Temporal search trends have been successfully used in time series modeling across a variety of domains such as infectious diseases, unemployment rates, and retail sales. However, most existing applications require curating specialized datasets of individual keywords, queries, or query clusters, and the search data need to be temporally aligned with the outcome variable of interest. We propose a novel approach for generating an aggregated and anonymized representation of search interest as foundation features at the community level for geospatial modeling. We benchmark these features using spatial datasets across multiple domains. In zip codes with a population greater than 3000 that cover over 95% of the contiguous US population, our models for predicting missing values in a 20% set of holdout counties achieve an average $R^2$ score of 0.74 across 21 health variables, and 0.80 across 6 demographic and environmental variables. Our results demonstrate that these search features can be used for spatial predictions without strict temporal alignment, and that the resulting models outperform spatial interpolation and state-of-the-art methods using satellite imagery features.
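The evaluation protocol described in the abstract (hold out 20% of counties, predict the missing values for the zip codes in them, report $R^2$) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic stand-ins for the search features, the ridge regressor is a simple substitute (the abstract does not say which model the authors use), and the function names `county_holdout_split`, `fit_ridge`, and `r2_score` are hypothetical.

```python
import numpy as np


def county_holdout_split(county_ids, holdout_frac=0.2, seed=0):
    """Return a boolean mask marking rows whose county is held out."""
    rng = np.random.default_rng(seed)
    counties = np.unique(county_ids)
    n_holdout = max(1, int(round(holdout_frac * len(counties))))
    held_out = rng.choice(counties, size=n_holdout, replace=False)
    return np.isin(county_ids, held_out)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): one row per zip code, each assigned
# to a county, with a random feature vector and a noisy linear target.
rng = np.random.default_rng(42)
n_zips, n_features, n_counties = 2000, 16, 100
county_ids = rng.integers(0, n_counties, size=n_zips)
X = rng.normal(size=(n_zips, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w + 0.5 * rng.normal(size=n_zips)

# Split by county so no held-out county contributes zip codes to training.
test_mask = county_holdout_split(county_ids, holdout_frac=0.2, seed=0)
w, b = fit_ridge(X[~test_mask], y[~test_mask], alpha=1.0)
r2 = r2_score(y[test_mask], X[test_mask] @ w + b)
print(f"held-out R^2: {r2:.3f}")
```

Splitting by county rather than by zip code keeps neighboring zip codes of the same county from appearing on both sides of the split, which would otherwise inflate the held-out score for spatially correlated targets.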
Related papers
- General Geospatial Inference with a Population Dynamics Foundation Model [17.696501367579014]
Population Dynamics Foundation Model (PDFM) aims to capture relationships between diverse data modalities.
We first construct a geo-indexed dataset for postal codes and counties across the United States.
We then model this data and the complex relationships between locations using a graph neural network.
We combined the PDFM with a state-of-the-art forecasting foundation model, TimesFM, to predict unemployment and poverty.
arXiv Detail & Related papers (2024-11-11T18:32:44Z)
- SpectralEarth: Training Hyperspectral Foundation Models at Scale [47.93167977587301]
We introduce SpectralEarth, a large-scale multi-temporal dataset designed to pretrain hyperspectral foundation models.
We pretrain a series of foundation models on SpectralEarth using state-of-the-art self-supervised learning (SSL) algorithms.
We construct four downstream datasets for land-cover and crop-type mapping, providing benchmarks for model evaluation.
arXiv Detail & Related papers (2024-08-15T22:55:59Z)
- Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook [95.32949323258251]
Temporal data, notably time series and spatio-temporal data, are prevalent in real-world applications.
Recent advances in large language and other foundational models have spurred increased use in time series and spatio-temporal data mining.
arXiv Detail & Related papers (2023-10-16T09:06:00Z)
- SARN: Structurally-Aware Recurrent Network for Spatio-Temporal Disaggregation [8.636014676778682]
Open data is frequently released spatially aggregated, usually to comply with privacy policies. But coarse, heterogeneous aggregations complicate coherent learning and integration for downstream AI/ML systems.
We propose an overarching model named Structurally-Aware Recurrent Network (SARN), which integrates structurally-aware spatial attention layers into the Gated Recurrent Unit (GRU) model.
For scenarios with limited historical training data, we show that a model pre-trained on one city variable can be fine-tuned for another city variable using only a few hundred samples.
arXiv Detail & Related papers (2023-06-09T21:01:29Z)
- Spatial Implicit Neural Representations for Global-Scale Species Mapping [72.92028508757281]
Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location.
Traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets.
We use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical range of 47k species simultaneously.
arXiv Detail & Related papers (2023-06-05T03:36:01Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks [86.6139619721343]
We propose to find better receptive field combinations through a global-to-local search scheme.
Our search scheme exploits both global search to find the coarse combinations and local search to get the refined receptive field combinations.
Our RF-Next models, plugging receptive field search to various models, boost the performance on many tasks.
arXiv Detail & Related papers (2022-06-14T06:56:26Z)
- An Adaptive Federated Relevance Framework for Spatial Temporal Graph Learning [14.353798949041698]
We propose an adaptive federated relevance framework, namely FedRel, for spatial-temporal graph learning.
The core Dynamic Inter-Intra Graph (DIIG) module in the framework is able to use these features to generate the spatial-temporal graphs.
To improve the model generalization ability and performance while preserving the local data privacy, we also design a relevance-driven federated learning module.
arXiv Detail & Related papers (2022-06-07T16:12:17Z)
- Domain-Adversarial Training of Self-Attention Based Networks for Land Cover Classification using Multi-temporal Sentinel-2 Satellite Imagery [0.0]
Most practical applications cannot rely on labeled data, and field surveys are a time-consuming solution.
In this paper, we investigate adversarial training of deep neural networks to bridge the domain discrepancy between distinct geographical zones.
arXiv Detail & Related papers (2021-04-01T15:45:17Z)
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in the knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
arXiv Detail & Related papers (2020-05-24T04:02:18Z)