Curious Rhythms: Temporal Regularities of Wikipedia Consumption
- URL: http://arxiv.org/abs/2305.09497v3
- Date: Sat, 20 Apr 2024 16:47:43 GMT
- Title: Curious Rhythms: Temporal Regularities of Wikipedia Consumption
- Authors: Tiziano Piccardi, Martin Gerlach, Robert West
- Abstract summary: We show that even after removing the global pattern of day-night alternation, the consumption habits of individual articles maintain strong diurnal regularities.
We investigate topical and contextual correlates of Wikipedia articles' access rhythms, finding that article topic, reader country, and access device (mobile vs. desktop) are all important predictors of daily attention patterns.
- Score: 15.686850035802667
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Wikipedia, in its role as the world's largest encyclopedia, serves a broad range of information needs. Although previous studies have noted that Wikipedia users' information needs vary throughout the day, there is to date no large-scale, quantitative study of the underlying dynamics. The present paper fills this gap by investigating temporal regularities in daily consumption patterns in a large-scale analysis of billions of timezone-corrected page requests mined from English Wikipedia's server logs, with the goal of investigating how context and time relate to the kind of information consumed. First, we show that even after removing the global pattern of day-night alternation, the consumption habits of individual articles maintain strong diurnal regularities. Then, we characterize the prototypical shapes of consumption patterns, finding a particularly strong distinction between articles preferred during the evening/night and articles preferred during working hours. Finally, we investigate topical and contextual correlates of Wikipedia articles' access rhythms, finding that article topic, reader country, and access device (mobile vs. desktop) are all important predictors of daily attention patterns. These findings shed new light on how humans seek information on the Web by focusing on Wikipedia as one of the largest open platforms for knowledge and learning, emphasizing Wikipedia's role as a rich knowledge base that fulfills information needs spread throughout the day, with implications for understanding information seeking across the globe and for designing appropriate information systems.
Related papers
- Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia [49.80565462746646]
We introduce the InfoGap method -- an efficient and reliable approach to locating information gaps and inconsistencies in articles at the fact level.
We evaluate InfoGap by analyzing LGBT people's portrayals, across 2.7K biography pages on English, Russian, and French Wikipedias.
arXiv Detail & Related papers (2024-10-05T20:40:49Z)
- Between News and History: Identifying Networked Topics of Collective Attention on Wikipedia [0.0]
We develop a temporal community detection approach towards topic detection.
We apply this method to a dataset of one year of current events on Wikipedia.
We are able to distinguish topics that more strongly reflect unfolding current events from those that reflect more established knowledge.
arXiv Detail & Related papers (2022-11-14T18:36:21Z)
- Mapping Process for the Task: Wikidata Statements to Text as Wikipedia Sentences [68.8204255655161]
We propose our mapping process for the task of converting Wikidata statements to natural language text (WS2T) for Wikipedia projects at the sentence level.
The main step is to organize statements, represented as a group of quadruples and triples, and then to map them to corresponding sentences in English Wikipedia.
We evaluate the output corpus in various aspects: sentence structure analysis, noise filtering, and relationships between sentence components based on word embedding models.
arXiv Detail & Related papers (2022-10-23T08:34:33Z)
- Towards Proactive Information Retrieval in Noisy Text with Wikipedia Concepts [6.744385328015561]
This work explores how exploiting the context of a query using Wikipedia concepts can improve proactive information retrieval on noisy text.
Our experiments around a podcast segment retrieval task demonstrate that there is a clear signal of relevance in Wikipedia concepts.
We also find Wikifying the background context of a query can help disambiguate the meaning of the query, further helping proactive information retrieval.
arXiv Detail & Related papers (2022-10-18T14:12:06Z)
- WikiDes: A Wikipedia-Based Dataset for Generating Short Descriptions from Paragraphs [66.88232442007062]
We introduce WikiDes, a dataset to generate short descriptions of Wikipedia articles.
The dataset consists of over 80k English samples on 6987 topics.
Our paper shows a practical impact on Wikipedia and Wikidata since there are thousands of missing descriptions.
arXiv Detail & Related papers (2022-09-27T01:28:02Z)
- The Web Is Your Oyster -- Knowledge-Intensive NLP against a Very Large Web Corpus [76.9522248303716]
We propose a new setup for evaluating existing KI-NLP tasks in which we generalize the background corpus to a universal web snapshot.
We repurpose KILT, a standard KI-NLP benchmark initially developed for Wikipedia, and ask systems to use a subset of CCNet - the Sphere corpus.
We find that despite potential gaps of coverage, challenges of scale, lack of structure and lower quality, retrieval from Sphere enables a state-of-the-art retrieve-and-read system to match and even outperform Wikipedia-based models.
arXiv Detail & Related papers (2021-12-18T13:15:34Z)
- Surfer100: Generating Surveys From Web Resources on Wikipedia-style [49.23675182917996]
We show that recent advances in pretrained language modeling can be combined for a two-stage extractive and abstractive approach for Wikipedia lead paragraph generation.
We extend this approach to generate longer Wikipedia-style summaries with sections and examine how such methods struggle in this application through detailed studies with 100 reference human-collected surveys.
arXiv Detail & Related papers (2021-12-13T02:18:01Z)
- Tracking Knowledge Propagation Across Wikipedia Languages [1.8447697408534176]
We present a dataset of inter-language knowledge propagation in Wikipedia.
The dataset covers all 309 language editions and 33M articles.
We find that the size of language editions is associated with the speed of propagation.
arXiv Detail & Related papers (2021-03-30T18:36:13Z)
- Multiple Texts as a Limiting Factor in Online Learning: Quantifying (Dis-)similarities of Knowledge Networks across Languages [60.00219873112454]
We investigate the hypothesis that the extent to which one obtains information on a given topic through Wikipedia depends on the language in which it is consulted.
Since Wikipedia is a central part of the web-based information landscape, this indicates a language-related, linguistic bias.
The article builds a bridge between reading research, educational science, Wikipedia research and computational linguistics.
arXiv Detail & Related papers (2020-08-05T11:11:55Z)
- How Inclusive Are Wikipedia's Hyperlinks in Articles Covering Polarizing Topics? [8.035521056416242]
We focus on the influence of the interconnect topology between articles describing complementary aspects of polarizing topics.
We introduce a novel measure of exposure to diverse information to quantify users' exposure to different aspects of a topic.
We identify cases in which the network topology significantly limits the exposure of users to diverse information on the topic, encouraging users to remain in a knowledge bubble.
arXiv Detail & Related papers (2020-07-16T09:19:57Z)