Social media data reveals signal for public consumer perceptions
- URL: http://arxiv.org/abs/2012.13675v1
- Date: Sat, 26 Dec 2020 03:58:20 GMT
- Title: Social media data reveals signal for public consumer perceptions
- Authors: Neeti Pokhriyal, Abenezer Dara, Benjamin Valentino, Soroush Vosoughi
- Abstract summary: One of the most widely cited economic indicators is the consumer confidence index (CCI).
Numerous past studies have used social media, especially Twitter data, to predict CCI.
However, according to a recent comprehensive survey, the strong correlations disappeared when those models were tested with newer data.
- Score: 6.212955085775758
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Researchers have used social media data to estimate various macroeconomic
indicators about public behaviors, mostly as a way to reduce surveying costs.
One of the most widely cited economic indicators is the consumer confidence
index (CCI). Numerous past studies have used social media, especially Twitter
data, to predict CCI. However, according to a recent comprehensive survey, the
strong correlations disappeared when those models were tested with newer data.
In this work, we revisit this problem of assessing the
true potential of using social media data to measure CCI, by proposing a robust
non-parametric Bayesian modeling framework grounded in Gaussian Process
Regression (which provides both an estimate and an uncertainty associated with
it). Integral to our framework is a principled experimentation methodology that
demonstrates how digital data can be employed to reduce the frequency of
surveys, and thus periodic polling would be needed only to calibrate our model.
Through extensive experimentation, we show that micro-decisions, such as the
smoothing interval and the various types of lags, have an important bearing on
the results. Using a decade of Reddit data (2008-2019), we show that both
monthly and daily values of CCI can, indeed, be reliably estimated at least
several months in advance, and that our model estimates are far superior to
those generated by the existing methods.
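The abstract names three ingredients: a smoothing interval, a lag between the social signal and the survey value, and Gaussian Process Regression that returns an estimate together with its uncertainty. The sketch below illustrates how these pieces can fit together. It is not the authors' implementation: the data are synthetic, the helper names (`moving_average`, `gp_predict`), the RBF kernel, and the hyperparameters (`noise`, `length`, `var`) are all assumptions chosen for illustration, and the hyperparameters are fixed by hand rather than learned.

```python
import math

def moving_average(values, window):
    """Trailing moving average; `window` plays the role of a smoothing interval."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def rbf(a, b, length, var):
    """Squared-exponential (RBF) kernel, an assumed kernel choice."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_test, noise=0.5, length=1.0, var=25.0):
    """Return (posterior mean, posterior std) for each test input."""
    mean_y = sum(y_train) / len(y_train)          # constant prior mean
    resid = [y - mean_y for y in y_train]
    K = [[rbf(a, b, length, var) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, resid))  # (K + noise^2 I)^-1 resid
    out = []
    for xs in x_test:
        k_star = [rbf(xs, a, length, var) for a in x_train]
        mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        variance = max(var - sum(vi * vi for vi in v), 0.0)
        out.append((mu, math.sqrt(variance)))
    return out

# Synthetic toy data, NOT taken from the paper.
months = list(range(24))
raw_sentiment = [math.sin(m / 4.0) + 0.3 * math.cos(3.0 * m) for m in months]
smoothed = moving_average(raw_sentiment, window=3)
LAG = 1                                    # sentiment leads the index by one month
features = smoothed[:-LAG]                 # sentiment at month m - LAG
cci = [90.0 + 8.0 * s for s in features]   # stand-in "survey" value at month m

# Pretend the survey is run only every third month and used for calibration.
survey_idx = list(range(0, len(features), 3))
preds = gp_predict([features[i] for i in survey_idx],
                   [cci[i] for i in survey_idx],
                   features)
for m, (mu, sd) in enumerate(preds[:6], start=LAG):
    print(f"month {m}: estimate {mu:.2f} +/- {2 * sd:.2f}")
```

In this toy setup the months without a survey are filled in by the posterior mean, and the posterior standard deviation indicates how far each estimate can be trusted. In practice one would more likely use a library implementation such as scikit-learn's `GaussianProcessRegressor` and fit the kernel hyperparameters to data.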
Related papers
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- Conditional Feature Importance for Mixed Data [1.6114012813668934]
We develop a conditional predictive impact (CPI) framework with knockoff sampling.
We show that our proposed workflow controls type I error, achieves high power and is in line with results given by other conditional FI measures.
Our findings highlight the necessity of developing statistically adequate, specialized methods for mixed data.
arXiv Detail & Related papers (2022-10-06T16:52:38Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - MRCLens: an MRC Dataset Bias Detection Toolkit [82.44296974850639]
We introduce MRCLens, a toolkit that detects whether biases exist before users train the full model.
For the convenience of introducing the toolkit, we also provide a categorization of common biases in MRC.
arXiv Detail & Related papers (2022-07-18T21:05:39Z) - A Study on the Evaluation of Generative Models [19.18642459565609]
Implicit generative models, which do not return likelihood values, have become prevalent in recent years.
In this work, we study the evaluation metrics of generative models by generating a high-quality synthetic dataset.
Our study shows that while FID and IS do correlate to several f-divergences, their ranking of close models can vary considerably.
arXiv Detail & Related papers (2022-06-22T09:27:31Z) - Newer is not always better: Rethinking transferability metrics, their
peculiarities, stability and performance [5.650647159993238]
Fine-tuning of large pre-trained image and language models on small customized datasets has become increasingly popular.
We show that the statistical problems with covariance estimation drive the poor performance of H-score.
We propose a correction and recommend measuring correlation performance against relative accuracy in such settings.
arXiv Detail & Related papers (2021-10-13T17:24:12Z) - Predicting Census Survey Response Rates With Parsimonious Additive
Models and Structured Interactions [14.003044924094597]
We consider the problem of predicting survey response rates using a family of flexible and interpretable nonparametric models.
The study is motivated by the US Census Bureau's well-known ROAM application.
arXiv Detail & Related papers (2021-08-24T17:49:55Z) - Causal Inference with Corrupted Data: Measurement Error, Missing Values,
Discretization, and Differential Privacy [6.944765747195337]
We formulate a semiparametric model of causal inference with high dimensional corrupted data.
We prove consistency and Gaussian approximation by finite sample arguments.
Our analysis provides nonasymptotic theoretical contributions to matrix completion, statistical learning, and semiparametric statistics.
arXiv Detail & Related papers (2021-07-06T17:42:49Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z) - A Survey on Causal Inference [64.45536158710014]
Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics.
Various causal effect estimation methods for observational data have sprung up.
arXiv Detail & Related papers (2020-02-05T21:35:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.