Related papers: Predicting United States policy outcomes with Random Forests

Predicting United States policy outcomes with Random Forests

URL: http://arxiv.org/abs/2008.07338v1
Date: Sun, 2 Aug 2020 18:06:57 GMT
Title: Predicting United States policy outcomes with Random Forests
Authors: Shawn McGuire, Charles Delahunt
Abstract summary: Two decades of U.S. government legislative outcomes, as well as the policy preferences of rich people, the general population, and diverse interest groups, were captured in a detailed dataset curated and analyzed by Gilens, Page et al. They found that the preferences of the rich correlated strongly with policy outcomes, while the preferences of the general population did not, except via a linkage with rich people's preferences. We present two primary findings, concerning respectively prediction and inference.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Two decades of U.S. government legislative outcomes, as well as the policy preferences of rich people, the general population, and diverse interest groups, were captured in a detailed dataset curated and analyzed by Gilens, Page et al. (2014). They found that the preferences of the rich correlated strongly with policy outcomes, while the preferences of the general population did not, except via a linkage with rich people's preferences. Their analysis applied the tools of classical statistical inference, in particular logistic regression. In this paper we analyze the Gilens dataset using the complementary tools of Random Forest classifiers (RFs), from Machine Learning. We present two primary findings, concerning respectively prediction and inference: (i) Holdout test sets can be predicted with approximately 70% balanced accuracy by models that consult only the preferences of rich people and a small number of powerful interest groups, as well as policy area labels. These results include retrodiction, where models trained on pre-1997 cases predicted "future" (post-1997) cases. The 20% gain in accuracy over baseline (chance), in this detailed but noisy dataset, indicates the high importance of a few wealthy players in U.S. policy outcomes, and aligns with a body of research indicating that the U.S. government has significant plutocratic tendencies. (ii) The feature selection methods of RF models identify especially salient subsets of interest groups (economic players). These can be used to further investigate the dynamics of governmental policy making, and also offer an example of the potential value of RF feature selection methods for inference on datasets such as this.

Related papers

Variable Selection Using Relative Importance Rankings [0.0]
This paper explores the potential for variable ranking and filter-based selection before model creation.<n>Specifically, we anticipate strong performance from the RI measures because they incorporate both direct and combined effects of predictors.<n>We show that predictive models built on these rankings are highly competitive, often outperforming state-of-the-art methods such as the lasso and relaxed lasso.
arXiv Detail & Related papers (2025-09-13T15:21:39Z)
The Statistical Fairness-Accuracy Frontier [50.323024516295725]
Machine learning models must balance accuracy and fairness, but these goals often conflict.<n>A useful tool for understanding this trade-off is the fairness-accuracy frontier, which characterizes the set of models that cannot be simultaneously improved in both fairness and accuracy.<n>We study the FA frontier in the finite-sample regime, showing how it deviates from its population counterpart and quantifying the worst-case gap between them.
arXiv Detail & Related papers (2025-08-25T03:01:35Z)
Comparative Analysis of Time Series Foundation Models for Demographic Forecasting: Enhancing Predictive Accuracy in US Population Dynamics [4.3009067533625895]
This study explores the application of time series foundation models to predict demographic changes in the United States.<n>We evaluate the performance of the Time Series Foundation Model (TimesFM) against traditional baselines.<n>Our experiments across six demographically diverse states demonstrate that TimesFM achieves the lowest Mean Squared Error (MSE) in 86.67% of test cases.
arXiv Detail & Related papers (2025-08-09T07:55:09Z)
Transforming Social Science Research with Transfer Learning: Social Science Survey Data Integration with AI [0.4944564023471818]
Large-N nationally representative surveys, which have profoundly shaped American politics scholarship, represent related but distinct domains. Our study introduces a novel application of transfer learning (TL) to address these gaps. Models pre-trained on the Cooperative Election Study dataset are fine-tuned for use in the American National Election Studies dataset.
arXiv Detail & Related papers (2025-01-11T16:01:44Z)
Comparing Targeting Strategies for Maximizing Social Welfare with Limited Resources [20.99198458867724]
We use data from 5 real-world RCTs in a variety of domains to empirically assess such choices. We find that risk-based targeting is almost always inferior to targeting based on even biased estimates of treatment effects.
arXiv Detail & Related papers (2024-11-11T22:36:50Z)
A Comparative Analysis of Wealth Index Predictions in Africa between three Multi-Source Inference Models [0.02730969268472861]
We analyze the International Wealth Index (IWI) predicted by Lee and Braithwaite (2022) and Esp'in-Noboa et al. (2023, alongside the Relative Wealth Index (RWI) inferred by Chi et al. (2022), across six Sub-Saharan African countries. Our analysis reveals trends and discrepancies in wealth predictions between these models. These findings raise concerns about the validity of certain models and emphasize the importance of rigorous audits for wealth prediction algorithms used in policy-making.
arXiv Detail & Related papers (2024-08-03T02:07:39Z)
Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori. In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty. We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z)
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain. Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples. In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation [60.71312668265873]
We develop a method to balance the need for personalization with confident predictions. We show that our method can be used to form accurate predictions of heterogeneous treatment effects.
arXiv Detail & Related papers (2021-11-28T23:19:12Z)
Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture. We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
On Modeling Human Perceptions of Allocation Policies with Uncertain Outcomes [6.729250803621849]
We show that probability weighting can be used to make predictions about preferences over probabilistic distributions of harm and benefit. We identify optimal policies for minimizing perceived total harm and maximizing perceived total benefit that take the distorting effects of probability weighting into account.
arXiv Detail & Related papers (2021-03-10T02:22:08Z)
Inflating Topic Relevance with Ideology: A Case Study of Political Ideology Bias in Social Topic Detection Models [16.279854003220418]
We investigate the impact of political ideology biases in training data. Our work highlights the susceptibility of large, complex models to propagating the biases from human-selected input. As a way to mitigate the bias, we propose to learn a text representation that is invariant to political ideology while still judging topic relevance.
arXiv Detail & Related papers (2020-11-29T05:54:03Z)
Magnify Your Population: Statistical Downscaling to Augment the Spatial Resolution of Socioeconomic Census Data [48.7576911714538]
We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes. For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions. As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level, to a grid of 300 spatial resolution.
arXiv Detail & Related papers (2020-06-23T16:52:18Z)
Interventions for Ranking in the Presence of Implicit Bias [34.23230188778088]
Implicit bias is the unconscious attribution of particular qualities (or lack thereof) to a member from a particular social group. Rooney Rule is a constraint to improve the utility of the outcome for certain cases of the subset selection problem. We present a family of simple and interpretable constraints and show that they can optimally mitigate implicit bias.
arXiv Detail & Related papers (2020-01-23T19:11:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.