A Predictive Model of Digital Information Engagement: Forecasting User
Engagement With English Words by Incorporating Cognitive Biases,
Computational Linguistics and Natural Language Processing
- URL: http://arxiv.org/abs/2307.14500v1
- Date: Wed, 26 Jul 2023 20:58:47 GMT
- Authors: Nimrod Dvir, Elaine Friedman, Suraj Commuri, Fan Yang and Jennifer Romano
- Abstract summary: This study introduces and empirically tests a novel predictive model for digital information engagement (IE).
The READ model integrates key cognitive biases with computational linguistics and natural language processing to develop a multidimensional perspective on information engagement.
The READ model's potential extends across various domains, including business, education, government, and healthcare.
- Score: 3.09766013093045
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study introduces and empirically tests a novel predictive model for
digital information engagement (IE) - the READ model, an acronym for the four
pivotal attributes of engaging information: Representativeness, Ease-of-use,
Affect, and Distribution. Conceptualized within the theoretical framework of
Cumulative Prospect Theory, the model integrates key cognitive biases with
computational linguistics and natural language processing to develop a
multidimensional perspective on information engagement. A rigorous testing
protocol was implemented, involving 50 randomly selected pairs of synonymous
words (100 words in total) from the WordNet database. These words' engagement
levels were evaluated through a large-scale online survey (n = 80,500) to
derive empirical IE metrics. The READ attributes for each word were then
computed and their predictive efficacy examined. The findings affirm the READ
model's robustness, accurately predicting a word's IE level and distinguishing
the more engaging word from a pair of synonyms with an 84% accuracy rate. The
READ model's potential extends across various domains, including business,
education, government, and healthcare, where it could enhance content
engagement and inform AI language model development and generative text work.
Future research should address the model's scalability and adaptability across
different domains and languages, thereby broadening its applicability and
efficacy.
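The 84% figure in the abstract is a pairwise accuracy: for each synonym pair, the model is counted as correct when it ranks the empirically more engaging word higher. The abstract does not give the scoring procedure, so the following is only a minimal sketch of how such a metric could be computed. The function name `pairwise_accuracy`, the treatment of ties, and all words and scores in the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def pairwise_accuracy(
    pairs: List[Tuple[str, str]],
    predicted: Dict[str, float],
    observed: Dict[str, float],
) -> float:
    """Fraction of synonym pairs whose predicted winner matches the observed winner.

    Assumption: pairs with no observed winner (equal observed scores) are skipped,
    and a predicted tie counts as incorrect.
    """
    correct = 0
    counted = 0
    for word_a, word_b in pairs:
        observed_diff = observed[word_a] - observed[word_b]
        if observed_diff == 0:
            continue  # no empirically more engaging word in this pair
        predicted_diff = predicted[word_a] - predicted[word_b]
        counted += 1
        # Same sign means the model picked the same word as the survey did.
        if predicted_diff * observed_diff > 0:
            correct += 1
    if counted == 0:
        raise ValueError("no pairs with a distinct observed winner")
    return correct / counted


# Hypothetical toy data, invented purely for illustration.
toy_pairs = [("big", "large"), ("help", "assist"), ("buy", "purchase"), ("begin", "commence")]
toy_predicted = {
    "big": 0.7, "large": 0.6,
    "help": 0.8, "assist": 0.4,
    "buy": 0.5, "purchase": 0.6,
    "begin": 0.9, "commence": 0.3,
}
toy_observed = {
    "big": 0.65, "large": 0.55,
    "help": 0.7, "assist": 0.5,
    "buy": 0.6, "purchase": 0.4,
    "begin": 0.8, "commence": 0.2,
}

print(pairwise_accuracy(toy_pairs, toy_predicted, toy_observed))  # 0.75
```

In this toy example the model agrees with the observed ranking on three of four pairs. Applied to the study's 50 pairs, the same calculation would yield a single accuracy figure comparable to the one reported.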
Related papers
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We hold that retrieval-augmented language models are inherently capable of responding based on both contextual and parametric knowledge.
Inspired by work on aligning language models with human preferences, we take a first step towards aligning retrieval-augmented language models so that they respond relying solely on the external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z)
- Decomposition of surprisal: Unified computational model of ERP components in language processing [7.760815504640362]
We advance an information-theoretic model of human language processing in the brain in which incoming linguistic input is processed at first shallowly and later with more depth.
We show that the information content (surprisal) of a word in context can be decomposed into two quantities: (A) shallow surprisal, which signals shallow processing difficulty for a word, and corresponds with the N400 signal; and (B) deep surprisal, which reflects the discrepancy between shallow and deep representations, and corresponds to the P600 signal.
arXiv Detail & Related papers (2024-09-10T18:14:02Z)
- Who Writes the Review, Human or AI? [0.36498648388765503]
This study proposes a methodology to accurately distinguish AI-generated and human-written book reviews.
Our approach utilizes transfer learning, enabling the model to identify generated text across different topics.
The experimental results demonstrate that it is feasible to detect the original source of text, achieving an accuracy rate of 96.86%.
arXiv Detail & Related papers (2024-05-30T17:38:44Z)
- PRobELM: Plausibility Ranking Evaluation for Language Models [12.057770969325453]
PRobELM is a benchmark designed to assess language models' ability to discern more plausible scenarios through their parametric knowledge.
Our benchmark is constructed from a dataset curated from Wikidata edit histories, tailored to align the temporal bounds of the training data for the evaluated models.
arXiv Detail & Related papers (2024-04-04T21:57:11Z)
- Evaluating Large Language Models Using Contrast Sets: An Experimental Approach [0.0]
We introduce an innovative technique for generating a contrast set for the Stanford Natural Language Inference dataset.
Our strategy involves the automated substitution of verbs, adverbs, and adjectives with their synonyms to preserve the original meaning of sentences.
This method aims to assess whether a model's performance is based on genuine language comprehension or simply on pattern recognition.
arXiv Detail & Related papers (2024-04-02T02:03:28Z)
- Meta predictive learning model of languages in neural circuits [2.5690340428649328]
We propose a mean-field learning model within the predictive coding framework.
Our model reveals that most of the connections become deterministic after learning.
Our model provides a starting point to investigate the connection among brain computation, next-token prediction and general intelligence.
arXiv Detail & Related papers (2023-09-08T03:58:05Z)
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level test sets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose TextFlint, a multilingual robustness evaluation platform for NLP tasks.
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.