Novelty Detection for Election Fraud: A Case Study with Agent-Based
Simulation Data
- URL: http://arxiv.org/abs/2211.16023v1
- Date: Tue, 29 Nov 2022 08:46:36 GMT
- Title: Novelty Detection for Election Fraud: A Case Study with Agent-Based
Simulation Data
- Authors: Khurram Yamin, Nima Jadali, Dima Nazzal, Yao Xie
- Abstract summary: We generate a clean election results dataset without fraud as well as datasets with varying degrees of fraud.
The algorithm measures how closely actual election results match the results predicted from polling.
We demonstrate the effectiveness of both the simulation technique and the machine learning model in identifying fraudulent regions.
- Score: 6.692240192392746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a robust election simulation model and
an independently developed election anomaly detection algorithm that
demonstrates the simulation's utility. The simulation generates artificial
elections with properties and trends similar to those of real-world elections,
while giving users control over, and knowledge of, all the important components
of the elections.
We generate a clean election results dataset without fraud as well as datasets
with varying degrees of fraud. We then measure how well the algorithm detects
the level of fraud present. The algorithm determines how similar actual
election results are to the results predicted from polling and from a
regression model of other regions that have similar demographics.
We use k-means to partition electoral regions into clusters such that
demographic homogeneity is maximized within each cluster. We then use a novelty
detection algorithm implemented as a one-class Support Vector Machine where the
clean data is provided in the form of polling predictions and regression
predictions. The regression predictions are built from the actual data in such
a way that the data supervises itself. We demonstrate the effectiveness of
both the simulation technique and the machine learning model in identifying
fraudulent regions.
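The pipeline described in the abstract (k-means clustering on demographics, a per-cluster regression fitted on the reported results, and a one-class SVM trained on polling and regression predictions) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `flag_regions`, the two-dimensional feature layout (polling prediction, vote share), the use of plain linear regression, and the values of `nu` and `gamma` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.svm import OneClassSVM


def flag_regions(demographics, polling, actual,
                 n_clusters=2, nu=0.1, gamma=500.0, seed=0):
    """Return a boolean array, True where a region's reported result looks novel.

    demographics: (n_regions, n_features) demographic features per region
    polling:      (n_regions,) vote share predicted from polls
    actual:       (n_regions,) reported vote share
    nu and gamma are illustrative values, not taken from the paper.
    """
    demographics = np.asarray(demographics, dtype=float)
    polling = np.asarray(polling, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Step 1: group regions with similar demographics.
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(demographics)

    flagged = np.zeros(actual.shape[0], dtype=bool)
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        if idx.size < 5:
            continue  # too few regions to model; leave them unflagged

        # Step 2: regression prediction built from the reported results of
        # demographically similar regions (the data supervises itself).
        reg = LinearRegression().fit(demographics[idx], actual[idx])
        reg_pred = reg.predict(demographics[idx])

        # Step 3: "clean" reference points pair each polling prediction with
        # a predicted share (assumed layout: [polling, predicted share]).
        p = polling[idx]
        reference = np.vstack([
            np.column_stack([p, p]),
            np.column_stack([p, reg_pred]),
        ])
        observed = np.column_stack([p, actual[idx]])

        # Step 4: novelty detection; predict() returns -1 for points that
        # fall outside the region learned from the reference data.
        svm = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(reference)
        flagged[idx] = svm.predict(observed) == -1
    return flagged
```

With vote shares on a 0 to 1 scale, `gamma=500.0` corresponds to an RBF kernel width of roughly three percentage points, so a reported share that sits far from both predictions falls outside the learned region. In practice both `nu` and `gamma` would need tuning against simulated clean and fraudulent datasets.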
Related papers
- ElectionSim: Massive Population Election Simulation Powered by Large Language Model Driven Agents [70.17229548653852]
We introduce ElectionSim, an innovative election simulation framework based on large language models.
We present a million-level voter pool sampled from social media platforms to support accurate individual simulation.
We also introduce PPE, a poll-based presidential election benchmark to assess the performance of our framework under the U.S. presidential election scenario.
arXiv Detail & Related papers (2024-10-28T05:25:50Z)
- Transfer Learning for Spatial Autoregressive Models with Application to U.S. Presidential Election Prediction [10.825562180226424]
We propose a novel transfer learning framework within the SAR model, called tranSAR.
Our framework enhances estimation and prediction by leveraging information from similar source data.
We demonstrate our method's effectiveness in predicting outcomes in U.S. presidential swing states, where it outperforms traditional methods.
arXiv Detail & Related papers (2024-05-20T03:14:15Z)
- Ahead of the Count: An Algorithm for Probabilistic Prediction of Instant Runoff (IRV) Elections [0.0]
We introduce a novel algorithm designed to predict outcomes in Instant Runoff Voting (IRV) elections.
The algorithm takes as input a set of discrete probability distributions describing vote totals for each candidate ranking.
We calculate all possible sequences of eliminations that might occur in the IRV rounds and assign a probability to each.
arXiv Detail & Related papers (2024-05-15T00:25:51Z)
- Design and analysis of tweet-based election models for the 2021 Mexican
legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
- Credit card fraud detection - Classifier selection strategy [0.0]
Using a sample of annotated transactions, a machine learning classification algorithm learns to detect frauds.
Fraud data sets are diverse and exhibit inconsistent characteristics.
We propose a data-driven classifier selection strategy for characteristic highly imbalanced fraud detection data sets.
arXiv Detail & Related papers (2022-08-25T07:13:42Z)
- Expected Frequency Matrices of Elections: Computation, Geometry, and
Preference Learning [58.23459346724491]
We use the "map of elections" approach of Szufa et al. (AAMAS 2020) to analyze several well-known vote distributions.
We draw the "skeleton map" of distributions, evaluate its robustness, and analyze its properties.
arXiv Detail & Related papers (2022-05-16T17:40:22Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- Convolutional generative adversarial imputation networks for
spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Nets (GANs) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- The Essential Role of Empirical Validation in Legislative Redistricting
Simulation [0.0]
We apply a recently developed computational method that can efficiently enumerate all possible redistricting plans.
We show that this algorithm scales to a state with a couple of hundred geographical units.
arXiv Detail & Related papers (2020-06-17T20:51:43Z)