Between News and History: Identifying Networked Topics of Collective
Attention on Wikipedia
- URL: http://arxiv.org/abs/2211.07616v2
- Date: Fri, 12 May 2023 17:17:54 GMT
- Authors: Patrick Gildersleve, Renaud Lambiotte, Taha Yasseri
- Abstract summary: We develop a temporal community detection approach towards topic detection.
We apply this method to a dataset of one year of current events on Wikipedia.
We are able to resolve the topics that more strongly reflect unfolding current events vs more established knowledge.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The digital information landscape has introduced a new dimension to
understanding how we collectively react to new information and preserve it at
the societal level. This, together with the emergence of platforms such as
Wikipedia, has challenged traditional views on the relationship between current
events and historical accounts of events, with an ever-shrinking divide between
"news" and "history". Wikipedia's place as the Internet's primary reference
work thus poses the question of how it represents both traditional
encyclopaedic knowledge and evolving important news stories. In other words,
how is information on and attention towards current events integrated into the
existing topical structures of Wikipedia? To address this, we develop a temporal
community detection approach towards topic detection that takes into account
both short-term dynamics of attention and long-term article network
structures. We apply this method to a dataset of one year of current events on
Wikipedia to identify clusters distinct from those that would be found solely
from page view time series correlations or static network structure. We are
able to resolve the topics that more strongly reflect unfolding current events
vs more established knowledge by the relative importance of collective
attention dynamics vs link structures. We also offer important developments by
identifying and describing the emergent topics on Wikipedia. This work provides
a means of distinguishing how these information and attention clusters are
related to Wikipedia's twin faces of encyclopaedic knowledge and current events
-- crucial to understanding the production and consumption of knowledge in the
digital age.
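
The abstract's central idea, weighing attention dynamics against link structure, can be illustrated with a toy sketch. This is not the authors' method: plain connected components over thresholded edges stand in for their temporal community detection, and the names `blended_weights`, `clusters`, `alpha`, and `threshold`, as well as the page titles and view counts, are hypothetical choices made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def blended_weights(views, links, alpha):
    """Edge weight = alpha * (non-negative view correlation) + (1 - alpha) * (link present).

    alpha is an assumed mixing parameter: 1.0 uses attention only, 0.0 uses links only.
    """
    pages = sorted(views)
    weights = {}
    for i, p in enumerate(pages):
        for q in pages[i + 1:]:
            corr = max(pearson(views[p], views[q]), 0.0)
            linked = 1.0 if (p, q) in links or (q, p) in links else 0.0
            weights[(p, q)] = alpha * corr + (1 - alpha) * linked
    return weights


def clusters(pages, weights, threshold):
    """Connected components over edges with weight >= threshold (union-find).

    A simple stand-in for community detection, not the paper's algorithm.
    """
    parent = {p: p for p in pages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (p, q), w in weights.items():
        if w >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for p in pages:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())


# Made-up daily page views: two pages spike together, two stay flat.
views = {
    "Election": [1, 9, 3, 1],
    "Candidate": [2, 10, 4, 2],
    "Physics": [5, 5, 6, 5],
    "Chemistry": [6, 5, 5, 5],
}
# Made-up hyperlink between the two "established knowledge" pages.
links = {("Physics", "Chemistry")}

for alpha in (1.0, 0.0, 0.5):
    print(alpha, clusters(list(views), blended_weights(views, links, alpha), 0.5))
```

With attention only (`alpha = 1.0`) the toy data groups the two co-spiking pages; with links only (`alpha = 0.0`) it groups the two linked pages; the blended setting recovers both groups, which mirrors the abstract's point that the resulting clusters differ from those found from either signal alone.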
Related papers
- AKEW: Assessing Knowledge Editing in the Wild [79.96813982502952]
AKEW (Assessing Knowledge Editing in the Wild) is a new practical benchmark for knowledge editing.
It fully covers three editing settings of knowledge updates: structured facts, unstructured texts as facts, and extracted triplets.
Through extensive experiments, we demonstrate the considerable gap between state-of-the-art knowledge-editing methods and practical scenarios.
(2024-02-29T07:08:34Z)
- Curious Rhythms: Temporal Regularities of Wikipedia Consumption [15.686850035802667]
We show that even after removing the global pattern of day-night alternation, the consumption habits of individual articles maintain strong diurnal regularities.
We investigate topical and contextual correlates of Wikipedia articles' access rhythms, finding that article topic, reader country, and access device (mobile vs. desktop) are all important predictors of daily attention patterns.
(2023-05-16T14:48:08Z)
- Towards Proactive Information Retrieval in Noisy Text with Wikipedia Concepts [6.744385328015561]
This work explores how exploiting the context of a query using Wikipedia concepts can improve proactive information retrieval on noisy text.
Our experiments around a podcast segment retrieval task demonstrate that there is a clear signal of relevance in Wikipedia concepts.
We also find Wikifying the background context of a query can help disambiguate the meaning of the query, further helping proactive information retrieval.
(2022-10-18T14:12:06Z)
- WikiDes: A Wikipedia-Based Dataset for Generating Short Descriptions from Paragraphs [66.88232442007062]
We introduce WikiDes, a dataset to generate short descriptions of Wikipedia articles.
The dataset consists of over 80k English samples on 6987 topics.
Our paper shows a practical impact on Wikipedia and Wikidata since there are thousands of missing descriptions.
(2022-09-27T01:28:02Z)
- Surfer100: Generating Surveys From Web Resources on Wikipedia-style [49.23675182917996]
We show that recent advances in pretrained language modeling can be combined for a two-stage extractive and abstractive approach for Wikipedia lead paragraph generation.
We extend this approach to generate longer Wikipedia-style summaries with sections and examine how such methods struggle in this application through detailed studies with 100 reference human-collected surveys.
(2021-12-13T02:18:01Z)
- Predicting Links on Wikipedia with Anchor Text Information [0.571097144710995]
We study the transductive and the inductive tasks of link prediction on several subsets of the English Wikipedia.
We propose an appropriate evaluation sampling methodology and compare several algorithms.
(2021-05-25T07:57:57Z)
- Modeling Collective Anticipation and Response on Wikipedia [1.299941371793082]
We propose a model that describes the dynamics around peaks of popularity by incorporating key features, i.e., the anticipatory growth and the decay of collective attention together with circadian rhythms.
Our work demonstrates the importance of appropriately modeling all phases of collective attention, as well as the connection between temporal patterns of attention and characteristic underlying information of the events they represent.
(2021-05-23T09:51:32Z)
- Fact-driven Logical Reasoning for Machine Reading Comprehension [82.58857437343974]
We are motivated to cover both commonsense and temporary knowledge clues hierarchically.
Specifically, we propose a general formalism of knowledge units by extracting backbone constituents of the sentence.
We then construct a supergraph on top of the fact units, allowing for the benefit of sentence-level (relations among fact groups) and entity-level interactions.
(2021-05-21T13:11:13Z)
- AttentionFlow: Visualising Influence in Networks of Time Series [80.61555138658578]
We present AttentionFlow, a new system to visualise networks of time series and the dynamic influence they have on one another.
We show that attention spikes in songs can be explained by external events such as major awards, or changes in the network such as the release of a new song.
AttentionFlow can be generalised to visualise networks of time series on physical infrastructures such as road networks, or natural phenomena such as weather and geological measurements.
(2021-02-03T09:44:46Z)
- Multiple Texts as a Limiting Factor in Online Learning: Quantifying (Dis-)similarities of Knowledge Networks across Languages [60.00219873112454]
We investigate the hypothesis that the extent to which one obtains information on a given topic through Wikipedia depends on the language in which it is consulted.
Since Wikipedia is a central part of the web-based information landscape, this indicates a language-related, linguistic bias.
The article builds a bridge between reading research, educational science, Wikipedia research and computational linguistics.
(2020-08-05T11:11:55Z)
- Entity Extraction from Wikipedia List Pages [2.3605348648054463]
We build a large taxonomy from categories and list pages with DBpedia as a backbone.
With distant supervision, we extract training data for the identification of new entities in list pages.
We extend DBpedia with 7.5M new type statements and 3.8M new facts of high precision.
(2020-03-11T07:48:46Z)