Predicting Survey Response with Quotation-based Modeling: A Case Study
on Favorability towards the United States
- URL: http://arxiv.org/abs/2305.14086v2
- Date: Sat, 27 May 2023 23:16:51 GMT
- Title: Predicting Survey Response with Quotation-based Modeling: A Case Study
on Favorability towards the United States
- Authors: Alireza Amirshahi, Nicolas Kirsch, Jonathan Reymond and Saleh
Baghersalimi
- Abstract summary: We propose a pioneering approach for predicting survey responses by examining quotations using machine learning.
We leverage a vast corpus of quotations from individuals across different nationalities to extract their level of favorability.
We employ a combination of natural language processing techniques and machine learning algorithms to construct a predictive model for survey responses.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The acquisition of survey responses is a crucial component in conducting
research aimed at comprehending public opinion. However, survey data collection
can be arduous, time-consuming, and expensive, with no assurance of an adequate
response rate. In this paper, we propose a pioneering approach for predicting
survey responses by examining quotations using machine learning. Our
investigation focuses on evaluating the degree of favorability towards the
United States, a topic of interest to many organizations and governments. We
leverage a vast corpus of quotations from individuals across different
nationalities and time periods to extract their level of favorability. We
employ a combination of natural language processing techniques and machine
learning algorithms to construct a predictive model for survey responses. We
investigate two scenarios: first, when no surveys have been conducted in a
country, and second, when surveys have been conducted only in certain years and
do not cover all years. Our experimental results demonstrate that our
proposed approach can predict survey responses with high accuracy. Furthermore,
we provide an exhaustive analysis of the crucial features that contributed to
the model's performance. This study has the potential to impact survey research
in the field of data science by substantially decreasing the cost and time
required to conduct surveys while simultaneously providing accurate predictions
of public opinion.
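The abstract describes the pipeline only at a high level. The sketch below is a minimal, illustrative outline of how quotation-based favorability prediction and the two evaluation scenarios could be set up; the file names (quotes.csv, surveys.csv), column names, TF-IDF features, and ridge regressor are assumptions for illustration, not the authors' actual implementation.
```python
# Illustrative sketch only: file names, columns, and model choices are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GroupKFold
from sklearn.metrics import mean_absolute_error

quotes = pd.read_csv("quotes.csv")    # hypothetical columns: country, year, quote
surveys = pd.read_csv("surveys.csv")  # hypothetical columns: country, year, favorability (%)

# Aggregate all quotations into one document per country-year.
docs = (quotes.groupby(["country", "year"])["quote"]
              .apply(" ".join)
              .reset_index())
data = docs.merge(surveys, on=["country", "year"])

vectorizer = TfidfVectorizer(max_features=20_000, ngram_range=(1, 2))
X = vectorizer.fit_transform(data["quote"])
y = data["favorability"].to_numpy()
countries = data["country"].to_numpy()

# Scenario 1: no surveys exist for a country -> leave-country-out evaluation
# (assumes at least five distinct countries in the data).
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=countries):
    model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    print("held-out countries MAE:", mean_absolute_error(y[test_idx], pred))

# Scenario 2: surveys exist only for some years -> train on surveyed years,
# predict favorability for the missing years of the same countries.
surveyed = (data["year"] < 2015).to_numpy()  # arbitrary illustrative cut-off
model = Ridge(alpha=1.0).fit(X[surveyed], y[surveyed])
missing_year_pred = model.predict(X[~surveyed])
```
Grouped cross-validation mirrors the first scenario by ensuring the held-out country's quotations never appear in training; the year-based split mirrors the second by predicting unsurveyed years from surveyed ones.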
Related papers
- Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions.
As a testbed, we use country-level results from two global cultural surveys.
We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions.
arXiv Detail & Related papers (2025-02-10T21:59:27Z)
- Transforming Social Science Research with Transfer Learning: Social Science Survey Data Integration with AI [0.4944564023471818]
Large-N nationally representative surveys, which have profoundly shaped American politics scholarship, represent related but distinct domains.
Our study introduces a novel application of transfer learning (TL) to address these gaps.
Models pre-trained on the Cooperative Election Study dataset are fine-tuned for use in the American National Election Studies dataset.
arXiv Detail & Related papers (2025-01-11T16:01:44Z)
- A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence, and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
- A step towards the integration of machine learning and small area estimation [0.0]
We propose a predictor, supported by machine learning algorithms, that can be used to predict any population or subpopulation characteristic.
We study only small departures from the assumed model to show that our proposal is a good alternative in this case as well.
Moreover, we propose a method for estimating the accuracy of machine learning predictors, which makes it possible to compare their accuracy with that of classic methods.
arXiv Detail & Related papers (2024-02-12T09:43:17Z)
- Crowdsourced Adaptive Surveys [0.0]
This paper introduces a crowdsourced adaptive survey methodology (CSAS).
The method converts open-ended text provided by participants into survey items and applies a multi-armed bandit algorithm to determine which questions should be prioritized in the survey.
I conclude by highlighting CSAS's potential to bridge conceptual gaps between researchers and participants in survey research.
arXiv Detail & Related papers (2024-01-16T04:05:25Z)
- Robust Visual Question Answering: Datasets, Methods, and Future Challenges [23.59923999144776]
Visual question answering requires a system to provide an accurate natural language answer given an image and a natural language question.
Previous generic VQA methods often exhibit a tendency to memorize biases present in the training data rather than learning proper behaviors, such as grounding images before predicting answers.
Various datasets and debiasing methods have been proposed to evaluate and enhance the VQA robustness, respectively.
arXiv Detail & Related papers (2023-07-21T10:12:09Z)
- FeedbackMap: a tool for making sense of open-ended survey responses [1.0660480034605242]
This demo introduces FeedbackMap, a web-based tool that uses natural language processing techniques to facilitate the analysis of open-ended survey responses.
We discuss the importance of examining survey results from multiple perspectives and the potential biases introduced by summarization methods.
arXiv Detail & Related papers (2023-06-26T23:38:24Z)
- Questioning the Survey Responses of Large Language Models [25.14481433176348]
We critically examine the methodology on the basis of the well-established American Community Survey by the U.S. Census Bureau.
We establish two dominant patterns. First, models' responses are governed by ordering and labeling biases, for example, towards survey responses labeled with the letter "A".
Second, when adjusting for these systematic biases through randomized answer ordering, models across the board trend towards uniformly random survey responses.
arXiv Detail & Related papers (2023-06-13T17:48:27Z)
- Open vs Closed-ended questions in attitudinal surveys -- comparing, combining, and interpreting using natural language processing [3.867363075280544]
Topic Modeling could significantly reduce the time to extract information from open-ended responses.
Our research uses Topic Modeling to extract information from open-ended questions and compare its performance with closed-ended responses.
arXiv Detail & Related papers (2022-05-03T06:01:03Z)
- Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees [63.62448343531963]
We propose a combination of the existing paradigms that intelligently samples which responses should be scored by humans.
We observe significant gains in accuracy (19.80% increase on average) and quadratic weighted kappa (QWK) (25.60% on average) with a relatively small human budget.
arXiv Detail & Related papers (2021-11-17T05:00:51Z)
- The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies [79.66833203975729]
We conduct a vignette study in which laypersons are tasked with predicting future re-arrests.
Our key findings are as follows: Participants often predict that an offender will be rearrested even when they deem the likelihood of re-arrest to be well below 50%.
Judicial decisions, unlike participants' predictions, depend in part on factors that are orthogonal to the likelihood of re-arrest.
arXiv Detail & Related papers (2021-09-03T11:09:10Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
- Questionnaire analysis to define the most suitable survey for port-noise investigation [0.0]
The paper analyses a sample of questions suitable for this specific research, drawn from the wide database of questionnaires proposed internationally for subjective investigations.
The questionnaire will be optimized for distribution within the TRIPLO project (TRansports and Innovative sustainable connections between Ports and LOgistic platforms).
arXiv Detail & Related papers (2020-07-14T08:52:55Z)
- Electoral Forecasting Using a Novel Temporal Attenuation Model: Predicting the US Presidential Elections [91.3755431537592]
We develop a novel macro-scale temporal attenuation (TA) model, which uses pre-election poll data to improve forecasting accuracy.
Our hypothesis is that the timing of publicizing opinion polls plays a significant role in how opinion oscillates, especially right before elections.
We present two different implementations of the TA model, which accumulate an average forecasting error of 2.8-3.28 points over the 48-year period.
arXiv Detail & Related papers (2020-04-30T09:21:52Z)
- A Survey on Causal Inference [64.45536158710014]
Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics.
Various causal effect estimation methods for observational data have sprung up.
arXiv Detail & Related papers (2020-02-05T21:35:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.