Comparative Analysis of Time Series Foundation Models for Demographic Forecasting: Enhancing Predictive Accuracy in US Population Dynamics
- URL: http://arxiv.org/abs/2508.11680v2
- Date: Tue, 19 Aug 2025 07:06:02 GMT
- Title: Comparative Analysis of Time Series Foundation Models for Demographic Forecasting: Enhancing Predictive Accuracy in US Population Dynamics
- Authors: Aditya Akella, Jonathan Farah,
- Abstract summary: This study explores the application of time series foundation models to predict demographic changes in the United States.<n>We evaluate the performance of the Time Series Foundation Model (TimesFM) against traditional baselines.<n>Our experiments across six demographically diverse states demonstrate that TimesFM achieves the lowest Mean Squared Error (MSE) in 86.67% of test cases.
- Score: 4.3009067533625895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Demographic shifts, influenced by globalization, economic conditions, geopolitical events, and environmental factors, pose significant challenges for policymakers and researchers. Accurate demographic forecasting is essential for informed decision-making in areas such as urban planning, healthcare, and economic policy. This study explores the application of time series foundation models to predict demographic changes in the United States using datasets from the U.S. Census Bureau and Federal Reserve Economic Data (FRED). We evaluate the performance of the Time Series Foundation Model (TimesFM) against traditional baselines including Long Short-Term Memory (LSTM) networks, Autoregressive Integrated Moving Average (ARIMA), and Linear Regression. Our experiments across six demographically diverse states demonstrate that TimesFM achieves the lowest Mean Squared Error (MSE) in 86.67% of test cases, with particularly strong performance on minority populations with sparse historical data. These findings highlight the potential of pre-trained foundation models to enhance demographic analysis and inform proactive policy interventions without requiring extensive task-specific fine-tuning.
Related papers
- Graph-based LLM over Semi-Structured Population Data for Dynamic Policy Response [13.410706647320088]
Timely and accurate analysis of population-level data is crucial for effective decision-making during public health emergencies such as the COVID-19 pandemic.<n>Massive input of semi-structured data, including structured demographic information and unstructured human feedback, poses significant challenges to conventional analysis methods.<n>We propose a novel graph-based reasoning framework that integrates large language models with structured demographic attributes and unstructured public feedback in a weakly supervised pipeline.
arXiv Detail & Related papers (2025-10-06T16:10:18Z) - MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework [53.82097200295448]
Mean-Field LLM (MF-LLM) is first to incorporate mean field theory into social simulation.<n>MF-LLM models bidirectional interactions between individuals and the population through an iterative process.<n> IB-Tune is a novel fine-tuning method inspired by the Information Bottleneck principle.
arXiv Detail & Related papers (2025-04-30T12:41:51Z) - Human Preferences in Large Language Model Latent Space: A Technical Analysis on the Reliability of Synthetic Data in Voting Outcome Prediction [5.774786149181393]
We analyze how demographic attributes and prompt variations influence latent opinion mappings in large language models (LLMs)<n>We find that LLM-generated data fails to replicate the variance observed in real-world human responses.<n>In the political space, persona-to-party mappings exhibit limited differentiation, resulting in synthetic data that lacks the nuanced distribution of opinions found in survey data.
arXiv Detail & Related papers (2025-02-22T16:25:33Z) - Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions.<n>As a testbed, we use country-level results from two global cultural surveys.<n>We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions.
arXiv Detail & Related papers (2025-02-10T21:59:27Z) - F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z) - Using a Local Surrogate Model to Interpret Temporal Shifts in Global Annual Data [5.669106489320257]
This paper focuses on explaining changes over time in globally-sourced, annual temporal data.
We employ Local Interpretable Model-agnostic Explanations (LIME) to shed light on national happiness indices, economic freedom, and population metrics.
arXiv Detail & Related papers (2024-04-18T03:17:45Z) - Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z) - A fairness assessment of mobility-based COVID-19 case prediction models [0.0]
We tested the hypothesis that bias in the mobility data used to train the predictive models might lead to unfairly less accurate predictions for certain demographic groups.
Specifically, the models tend to favor large, highly educated, wealthy young, urban, and non-black-dominated counties.
arXiv Detail & Related papers (2022-10-08T03:43:51Z) - Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z) - Aggregate Learning for Mixed Frequency Data [0.0]
We propose a mixed-temporal aggregate learning model that predicts economic indicators for smaller areas in real-time.
We find that the proposed model predicts (i) the regional heterogeneity of the labor market condition and (ii) the rapidly changing economic status.
The model can be applied to various tasks, especially economic analysis.
arXiv Detail & Related papers (2021-05-20T08:12:43Z) - Predicting United States policy outcomes with Random Forests [0.0]
Two decades of U.S. government legislative outcomes, as well as the policy preferences of rich people, the general population, and diverse interest groups, were captured in a detailed dataset curated and analyzed by Gilens, Page et al.
They found that the preferences of the rich correlated strongly with policy outcomes, while the preferences of the general population did not, except via a linkage with rich people's preferences.
We present two primary findings, concerning respectively prediction and inference.
arXiv Detail & Related papers (2020-08-02T18:06:57Z) - Magnify Your Population: Statistical Downscaling to Augment the Spatial
Resolution of Socioeconomic Census Data [48.7576911714538]
We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes.
For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions.
As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level, to a grid of 300 spatial resolution.
arXiv Detail & Related papers (2020-06-23T16:52:18Z) - Electoral Forecasting Using a Novel Temporal Attenuation Model:
Predicting the US Presidential Elections [91.3755431537592]
We develop a novel macro-scale temporal attenuation (TA) model, which uses pre-election poll data to improve forecasting accuracy.
Our hypothesis is that the timing of publicizing opinion polls plays a significant role in how opinion oscillates, especially right before elections.
We present two different implementations of the TA model, which accumulate an average forecasting error of 2.8-3.28 points over the 48-year period.
arXiv Detail & Related papers (2020-04-30T09:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.