Predicting Survey Response with Quotation-based Modeling: A Case Study
on Favorability towards the United States
- URL: http://arxiv.org/abs/2305.14086v2
- Date: Sat, 27 May 2023 23:16:51 GMT
- Title: Predicting Survey Response with Quotation-based Modeling: A Case Study
on Favorability towards the United States
- Authors: Alireza Amirshahi, Nicolas Kirsch, Jonathan Reymond and Saleh
Baghersalimi
- Abstract summary: We propose a pioneering approach for predicting survey responses by examining quotations using machine learning.
We leverage a vast corpus of quotations from individuals across different nationalities to extract their level of favorability.
We employ a combination of natural language processing techniques and machine learning algorithms to construct a predictive model for survey responses.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The acquisition of survey responses is a crucial component in conducting
research aimed at comprehending public opinion. However, survey data collection
can be arduous, time-consuming, and expensive, with no assurance of an adequate
response rate. In this paper, we propose a pioneering approach for predicting
survey responses by examining quotations using machine learning. Our
investigation focuses on evaluating the degree of favorability towards the
United States, a topic of interest to many organizations and governments. We
leverage a vast corpus of quotations from individuals across different
nationalities and time periods to extract their level of favorability. We
employ a combination of natural language processing techniques and machine
learning algorithms to construct a predictive model for survey responses. We
investigate two scenarios: first, when no surveys have been conducted in a
country, and second, when surveys have been conducted but only in specific
years and do not cover all the years. Our experimental results demonstrate that our
proposed approach can predict survey responses with high accuracy. Furthermore,
we provide an exhaustive analysis of the crucial features that contributed to
the model's performance. This study has the potential to impact survey research
in the field of data science by substantially decreasing the cost and time
required to conduct surveys while simultaneously providing accurate predictions
of public opinion.
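The two scenarios in the abstract can be read as two ways of holding out country-year data. The sketch below is only an illustration under stated assumptions: the toy rows, the single `mean_quote_sentiment` feature, the one-variable least-squares model, and all function names are hypothetical and are not taken from the paper, whose actual features and models are not described in this abstract.

```python
from statistics import mean

# Hypothetical toy data: (country, year, mean_quote_sentiment, survey_favorability_pct).
# All values are invented for illustration; they are not results from the paper.
ROWS = [
    ("A", 2015, 0.10, 55.0),
    ("A", 2016, 0.20, 60.0),
    ("A", 2017, 0.30, 65.0),
    ("B", 2015, -0.10, 45.0),
    ("B", 2016, 0.00, 50.0),
    ("B", 2017, 0.10, 55.0),
]


def fit_line(rows):
    """Ordinary least squares for y = intercept + slope * x with one feature."""
    xs = [r[2] for r in rows]
    ys = [r[3] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mean_abs_error(model, rows):
    """Mean absolute error of the fitted line on the given rows."""
    intercept, slope = model
    return mean(abs(intercept + slope * r[2] - r[3]) for r in rows)


def split_unseen_country(rows, country):
    """Scenario 1: the target country has no surveys at all."""
    train = [r for r in rows if r[0] != country]
    test = [r for r in rows if r[0] == country]
    return train, test


def split_missing_years(rows, country, missing_years):
    """Scenario 2: the target country was surveyed, but not in every year."""
    test = [r for r in rows if r[0] == country and r[1] in missing_years]
    train = [r for r in rows if r not in test]
    return train, test


if __name__ == "__main__":
    for name, (train, test) in {
        "unseen country": split_unseen_country(ROWS, "B"),
        "missing years": split_missing_years(ROWS, "B", {2016}),
    }.items():
        print(name, mean_abs_error(fit_line(train), test))
```

The only difference between the two scenarios here is which rows are held out: every row of the target country, or only that country's unsurveyed years.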
Related papers
- Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions.
As a testbed, we use country-level results from two global cultural surveys.
We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions.
arXiv Detail & Related papers (2025-02-10T21:59:27Z)
- Transforming Social Science Research with Transfer Learning: Social Science Survey Data Integration with AI [0.4944564023471818]
Large-N nationally representative surveys, which have profoundly shaped American politics scholarship, represent related but distinct domains.
Our study introduces a novel application of transfer learning (TL) to address these gaps.
Models pre-trained on the Cooperative Election Study dataset are fine-tuned for use in the American National Election Studies dataset.
arXiv Detail & Related papers (2025-01-11T16:01:44Z)
- A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
- Crowdsourced Adaptive Surveys [0.0]
This paper introduces a crowdsourced adaptive survey methodology (CSAS).
The method converts open-ended text provided by participants into survey items and applies a multi-armed bandit algorithm to determine which questions should be prioritized in the survey.
I conclude by highlighting CSAS's potential to bridge conceptual gaps between researchers and participants in survey research.
arXiv Detail & Related papers (2024-01-16T04:05:25Z)
- Questioning the Survey Responses of Large Language Models [25.14481433176348]
We critically examine the methodology on the basis of the well-established American Community Survey by the U.S. Census Bureau.
We establish two dominant patterns. First, models' responses are governed by ordering and labeling biases, for example, towards survey responses labeled with the letter "A".
Second, when adjusting for these systematic biases through randomized answer ordering, models across the board trend towards uniformly random survey responses.
arXiv Detail & Related papers (2023-06-13T17:48:27Z)
- Open vs Closed-ended questions in attitudinal surveys -- comparing, combining, and interpreting using natural language processing [3.867363075280544]
Topic Modeling could significantly reduce the time to extract information from open-ended responses.
Our research uses Topic Modeling to extract information from open-ended questions and compare its performance with closed-ended responses.
arXiv Detail & Related papers (2022-05-03T06:01:03Z)
- The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies [79.66833203975729]
We conduct a vignette study in which laypersons are tasked with predicting future re-arrests.
Our key findings are as follows: Participants often predict that an offender will be rearrested even when they deem the likelihood of re-arrest to be well below 50%.
Judicial decisions, unlike participants' predictions, depend in part on factors that are orthogonal to the likelihood of re-arrest.
arXiv Detail & Related papers (2021-09-03T11:09:10Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
- Electoral Forecasting Using a Novel Temporal Attenuation Model: Predicting the US Presidential Elections [91.3755431537592]
We develop a novel macro-scale temporal attenuation (TA) model, which uses pre-election poll data to improve forecasting accuracy.
Our hypothesis is that the timing of publicizing opinion polls plays a significant role in how opinion oscillates, especially right before elections.
We present two different implementations of the TA model, which accumulate an average forecasting error of 2.8-3.28 points over the 48-year period.
arXiv Detail & Related papers (2020-04-30T09:21:52Z)
- A Survey on Causal Inference [64.45536158710014]
Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics.
Various causal effect estimation methods for observational data have sprung up.
arXiv Detail & Related papers (2020-02-05T21:35:29Z)