A Large Scale Study of Reader Interactions with Images on Wikipedia
- URL: http://arxiv.org/abs/2112.01868v1
- Date: Fri, 3 Dec 2021 12:02:59 GMT
- Title: A Large Scale Study of Reader Interactions with Images on Wikipedia
- Authors: Daniele Rama, Tiziano Piccardi, Miriam Redi, Rossano Schifanella
- Abstract summary: This study is the first large-scale analysis of how interactions with images happen on Wikipedia.
We quantify the overall engagement with images, finding that one in 29 pageviews results in a click on at least one image.
We observe that clicks on images occur more often in shorter articles and articles about visual arts or transports and biographies of less well-known people.
- Score: 2.370481325034443
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Wikipedia is the largest source of free encyclopedic knowledge and one of the
most visited sites on the Web. To increase reader understanding of the article,
Wikipedia editors add images within the text of the article's body. However,
despite their widespread usage on web platforms and the huge volume of visual
content on Wikipedia, little is known about the importance of images in the
context of free knowledge environments. To bridge this gap, we collect data
about English Wikipedia reader interactions with images during one month and
perform the first large-scale analysis of how interactions with images happen
on Wikipedia. First, we quantify the overall engagement with images, finding
that one in 29 pageviews results in a click on at least one image, one order of
magnitude higher than interactions with other types of article content. Second,
we study what factors associate with image engagement and observe that clicks
on images occur more often in shorter articles and articles about visual arts
or transports and biographies of less well-known people. Third, we look at
interactions with Wikipedia article previews and find that images help support
reader information need when navigating through the site, especially for more
popular pages. The findings in this study deepen our understanding of the role
of images for free knowledge and provide a guide for Wikipedia editors and web
user communities to enrich the world's largest source of encyclopedic
knowledge.
Related papers
- Orphan Articles: The Dark Matter of Wikipedia [13.290424502717734]
We conduct the first systematic study of orphan articles, which are articles without any incoming links from other Wikipedia articles.
We find that a surprisingly large extent of content, roughly 15% (8.8M) of all articles, is de facto invisible to readers navigating Wikipedia.
We also provide causal evidence through a quasi-experiment that adding new incoming links to orphans (de-orphanization) leads to a statistically significant increase of their visibility.
arXiv Detail & Related papers (2023-06-06T18:04:33Z)
- WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset [48.00110675968677]
We introduce the Wikipedia Webpage 2M (WikiWeb2M) suite; the first to retain the full set of images, text, and structure data available in a page.
WikiWeb2M can be used for tasks like page description generation, section summarization, and contextual image captioning.
arXiv Detail & Related papers (2023-05-09T13:20:59Z)
- A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding [66.6468787004067]
We introduce the Wikipedia Webpage suite (WikiWeb2M) containing 2M pages with all of the associated image, text, and structure data.
We design a novel attention mechanism, Prefix Global, which selects the most relevant image and text content as global tokens to attend to the rest of the webpage for context.
arXiv Detail & Related papers (2023-05-05T16:38:05Z)
- Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia [59.47639408597319]
Kuaipedia is a large-scale multi-modal encyclopedia consisting of items, aspects, and short videos linked to them.
It was extracted from billions of videos of Kuaishou, a well-known short-video platform in China.
arXiv Detail & Related papers (2022-10-28T12:54:30Z)
- The Curious Layperson: Fine-Grained Image Recognition without Expert Labels [90.88501867321573]
We consider a new problem: fine-grained image recognition without expert annotations.
We learn a model to describe the visual appearance of objects using non-expert image descriptions.
We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis.
arXiv Detail & Related papers (2021-11-05T17:58:37Z)
- Multiple Texts as a Limiting Factor in Online Learning: Quantifying (Dis-)similarities of Knowledge Networks across Languages [60.00219873112454]
We investigate the hypothesis that the extent to which one obtains information on a given topic through Wikipedia depends on the language in which it is consulted.
Since Wikipedia is a central part of the web-based information landscape, this indicates a language-related, linguistic bias.
The article builds a bridge between reading research, educational science, Wikipedia research and computational linguistics.
arXiv Detail & Related papers (2020-08-05T11:11:55Z)
- How Inclusive Are Wikipedia's Hyperlinks in Articles Covering Polarizing Topics? [8.035521056416242]
We focus on the influence of the interconnect topology between articles describing complementary aspects of polarizing topics.
We introduce a novel measure of exposure to diverse information to quantify users' exposure to different aspects of a topic.
We identify cases in which the network topology significantly limits the exposure of users to diverse information on the topic, encouraging users to remain in a knowledge bubble.
arXiv Detail & Related papers (2020-07-16T09:19:57Z)
- Placepedia: Comprehensive Place Understanding with Multi-Faceted Annotations [79.80036503792985]
We contribute Placepedia, a large-scale place dataset with more than 35M photos from 240K unique places.
Besides the photos, each place also comes with massive multi-faceted information, such as GDP and population.
This dataset, with its large amount of data and rich annotations, allows various studies to be conducted.
arXiv Detail & Related papers (2020-07-07T20:17:01Z)
- A Deeper Investigation of the Importance of Wikipedia Links to the Success of Search Engines [7.433327915285967]
We report the results of an investigation into the incidence of Wikipedia links in search engine results pages (SERPs).
We find that Wikipedia links are extremely common in important search contexts, appearing in 67-84% of all SERPs for common and trending queries, but less often for medical queries.
Our findings reinforce the complementary notions that (1) Wikipedia content and research have a major impact outside of the Wikipedia domain and (2) powerful technologies like search engines are highly reliant on free content created by volunteers.
arXiv Detail & Related papers (2020-04-21T19:58:28Z)
- Entity Extraction from Wikipedia List Pages [2.3605348648054463]
We build a large taxonomy from categories and list pages with DBpedia as a backbone.
With distant supervision, we extract training data for the identification of new entities in list pages.
We extend DBpedia with 7.5M new type statements and 3.8M new facts of high precision.
arXiv Detail & Related papers (2020-03-11T07:48:46Z)
- Quantifying Engagement with Citations on Wikipedia [13.703047949952852]
One in 300 page views results in a reference click.
Clicks occur more frequently on shorter pages and on pages of lower quality.
Recent content, open access sources and references about life events are particularly popular.
arXiv Detail & Related papers (2020-01-23T15:52:36Z)