References in Wikipedia: The Editors' Perspective
- URL: http://arxiv.org/abs/2102.12511v1
- Date: Wed, 24 Feb 2021 19:04:17 GMT
- Title: References in Wikipedia: The Editors' Perspective
- Authors: Lucie-Aimée Kaffee, Hady Elsahar
- Abstract summary: We explore the creation and collection of references for new Wikipedia articles from an editor's perspective.
We map out the workflow of editors when creating a new article, emphasising how they select references.
- Score: 2.0609354896832492
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: References are an essential part of Wikipedia. Each statement in Wikipedia
should be referenced. In this paper, we explore the creation and collection of
references for new Wikipedia articles from an editor's perspective. We map out
the workflow of editors when creating a new article, emphasising how they
select references.
Related papers
- Edisum: Summarizing and Explaining Wikipedia Edits at Scale [9.968020416365757]
We propose a model that recommends edit summaries generated by a language model trained to produce good summaries of edits.
Our model performs on par with human editors.
More broadly, we showcase how language modeling technology can be used to support humans in maintaining one of the largest and most visible projects on the Web.
arXiv Detail & Related papers (2024-04-04T13:15:28Z)
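As an illustration of the input/output shape of the edit-summary task in the entry above, here is a minimal sketch that derives a crude summary from the old and new revision text using Python's difflib. The summarize_edit function and its line-counting heuristic are assumptions for illustration only; the paper's system is a trained language model, not a diff heuristic.

```python
import difflib

def summarize_edit(old_text: str, new_text: str) -> str:
    """Toy heuristic edit summary based on added/removed line counts.
    A crude stand-in for a trained summarization model."""
    diff = list(difflib.unified_diff(old_text.splitlines(),
                                     new_text.splitlines(), lineterm=""))
    added = sum(1 for line in diff
                if line.startswith("+") and not line.startswith("+++"))
    removed = sum(1 for line in diff
                  if line.startswith("-") and not line.startswith("---"))
    if added and not removed:
        return f"Added {added} line(s)"
    if removed and not added:
        return f"Removed {removed} line(s)"
    return f"Changed content: +{added}/-{removed} line(s)"

old = "Paris is the capital of France."
new = "Paris is the capital of France.\nIt has a population of about 2.1 million."
print(summarize_edit(old, new))  # -> "Added 1 line(s)"
```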
- Mapping Process for the Task: Wikidata Statements to Text as Wikipedia Sentences [68.8204255655161]
We propose our mapping process for the task of converting Wikidata statements to natural language text (WS2T) for Wikipedia projects at the sentence level.
The main step is to organize statements, represented as a group of quadruples and triples, and then to map them to corresponding sentences in English Wikipedia.
We evaluate the output corpus in various aspects: sentence structure analysis, noise filtering, and relationships between sentence components based on word embedding models.
arXiv Detail & Related papers (2022-10-23T08:34:33Z)
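To make the statement-to-sentence direction in the entry above concrete, the sketch below verbalizes a Wikidata-style triple plus a qualifier (a quadruple) with hand-written templates. The property IDs are real Wikidata properties, but the templates and the verbalize function are assumptions for illustration; the paper maps statements to actual English Wikipedia sentences rather than templated text.

```python
# Toy verbalizer for Wikidata-style statements (subject, property, object[, qualifier]).
# Hand-written templates stand in for the corpus-level mapping studied in the paper.
TEMPLATES = {
    "P26": "{subj} is married to {obj}{qual}.",    # spouse
    "P69": "{subj} was educated at {obj}{qual}.",  # educated at
}

def verbalize(subj, prop, obj, qualifier=None):
    qual = f" ({qualifier})" if qualifier else ""
    template = TEMPLATES.get(prop, "{subj} has {prop} {obj}{qual}.")
    return template.format(subj=subj, prop=prop, obj=obj, qual=qual)

# A quadruple: a statement plus a start-time qualifier.
print(verbalize("Marie Curie", "P69", "University of Paris", qualifier="from 1891"))
```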
- Improving Wikipedia Verifiability with AI [116.69749668874493]
We develop a neural network based system, called Side, to identify Wikipedia citations that are unlikely to support their claims.
For the top 10% of claims flagged as most likely unverifiable, our first citation recommendation collects over 60% more preferences than the citations currently on Wikipedia.
Our results indicate that an AI-based system could be used, in tandem with humans, to improve the verifiability of Wikipedia.
arXiv Detail & Related papers (2022-07-08T15:23:29Z)
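Side itself is a trained neural verifier; the toy sketch below only illustrates the interface of the problem (score how well a cited source snippet supports a claim) using a lexical-overlap heuristic. The support_score function and the 0.5 threshold are assumptions and do not reflect the paper's model.

```python
import re

def support_score(claim: str, source_snippet: str) -> float:
    """Crude lexical-overlap proxy for 'does the cited source support the claim?'.
    Side uses a trained neural model instead of this heuristic."""
    tokens = lambda s: set(re.findall(r"[a-z0-9]+", s.lower()))
    c, s = tokens(claim), tokens(source_snippet)
    return len(c & s) / max(len(c), 1)

claim = "The Eiffel Tower was completed in 1889."
snippet = "Construction of the Eiffel Tower finished in 1889 for the World's Fair."
if support_score(claim, snippet) < 0.5:
    print("Citation flagged as possibly failing verification")
else:
    print("Citation looks consistent with the claim")
```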
- Surfer100: Generating Surveys From Web Resources on Wikipedia-style [49.23675182917996]
We show that recent advances in pretrained language modeling can be combined for a two-stage extractive and abstractive approach for Wikipedia lead paragraph generation.
We extend this approach to generate longer Wikipedia-style summaries with sections and examine how such methods struggle in this application through detailed studies with 100 reference human-collected surveys.
arXiv Detail & Related papers (2021-12-13T02:18:01Z)
- Assessing the quality of sources in Wikidata across languages: a hybrid approach [64.05097584373979]
We run a series of microtasks experiments to evaluate a large corpus of references, sampled from Wikidata triples with labels in several languages.
We use a consolidated, curated version of the crowdsourced assessments to train several machine learning models to scale up the analysis to the whole of Wikidata.
The findings help us ascertain the quality of references in Wikidata, and identify common challenges in defining and capturing the quality of user-generated multilingual structured data on the web.
arXiv Detail & Related papers (2021-09-20T10:06:46Z)
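A minimal sketch of the "train on crowdsourced judgements, then scale up" step from the entry above, assuming scikit-learn is available. The three features and the labels below are invented for illustration and are far simpler than the reference features the paper uses.

```python
from sklearn.linear_model import LogisticRegression

# Toy features per reference: [is_https, looks_like_dead_link, is_wiki_mirror]
# Labels stand in for crowdsourced microtask judgements: 1 = good reference.
X = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
]
y = [1, 0, 0, 1, 0, 1]

model = LogisticRegression().fit(X, y)

# Scale up: score references that were never shown to crowd workers.
unlabelled = [[1, 0, 0], [0, 1, 1]]
print(model.predict_proba(unlabelled)[:, 1])  # estimated probability of being a good reference
```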
- Learning Structural Edits via Incremental Tree Transformations [102.64394890816178]
We present a generic model for incremental editing of structured data (i.e., "structural edits").
Our editor learns to iteratively generate tree edits (e.g., deleting or adding a subtree) and applies them to the partially edited data.
We evaluate our proposed editor on two source code edit datasets, where results show that, with the proposed edit encoder, our editor significantly improves accuracy over previous approaches.
arXiv Detail & Related papers (2021-01-28T16:11:32Z)
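To make the tree-edit vocabulary in the entry above concrete, the sketch below defines a minimal tree and applies a delete-subtree and an add-subtree operation, the kinds of incremental edits the editor model generates. The Node class and helper functions are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def delete_subtree(parent: Node, index: int) -> Node:
    """Tree edit: remove (and return) the child subtree at `index`."""
    return parent.children.pop(index)

def add_subtree(parent: Node, index: int, subtree: Node) -> None:
    """Tree edit: insert `subtree` as a child of `parent` at `index`."""
    parent.children.insert(index, subtree)

# A tiny AST-like tree: if (cond) { old_call() }
tree = Node("If", [Node("cond"), Node("Call", [Node("old_call")])])

removed = delete_subtree(tree, 1)                       # delete the old call
add_subtree(tree, 1, Node("Call", [Node("new_call")]))  # add the replacement
print([child.label for child in tree.children])         # -> ['cond', 'Call']
```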
- 'I Updated the <ref>': The Evolution of References in the English Wikipedia and the Implications for Altmetrics [0.0]
We present a dataset of the history of all the references (more than 55 million) ever used in the English Wikipedia until June 2019.
We have applied a new method for identifying and monitoring references in Wikipedia, so that for each reference we can provide data about associated actions: creation, modifications, deletions, and reinsertions.
arXiv Detail & Related papers (2020-10-06T23:26:12Z)
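The action taxonomy in the entry above (creation, deletion, reinsertion; modification requires fuzzier matching) can be illustrated with a toy tracker that compares the sets of references in consecutive revisions. The exact-match logic below is an assumption; the paper's reference identification method is considerably more involved.

```python
def reference_actions(history):
    """history: list of sets of reference strings, one set per revision (oldest first).
    Yields (revision_index, action, reference) tuples using exact matching."""
    ever_seen = set()
    previous = set()
    for i, refs in enumerate(history):
        for ref in refs - previous:
            yield (i, "reinsertion" if ref in ever_seen else "creation", ref)
        for ref in previous - refs:
            yield (i, "deletion", ref)
        ever_seen |= refs
        previous = refs

history = [
    {"<ref>Smith 2001</ref>"},
    {"<ref>Smith 2001</ref>", "<ref>Jones 2005</ref>"},
    {"<ref>Jones 2005</ref>"},
    {"<ref>Smith 2001</ref>", "<ref>Jones 2005</ref>"},
]
for event in reference_actions(history):
    print(event)
```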
- Scalable Recommendation of Wikipedia Articles to Editors Using Representation Learning [1.8810916321241067]
We develop a scalable system on top of Graph Convolutional Networks and Doc2Vec, learning how to represent Wikipedia articles and deliver personalized recommendations for editors.
We test our model on editors' histories, predicting their most recent edits based on their prior edits.
All of the data used on this paper is publicly available, including graph embeddings for Wikipedia articles, and we release our code to support replication of our experiments.
arXiv Detail & Related papers (2020-09-24T15:56:02Z)
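Once article representations are learned, the recommendation step in the entry above reduces to nearest-neighbour search in embedding space. The sketch below ranks candidate articles by cosine similarity to the mean vector of an editor's previous edits; the 3-dimensional vectors are placeholders for GCN/Doc2Vec embeddings, and the recommend function is an illustrative assumption.

```python
import numpy as np

# Placeholder 3-d embeddings standing in for learned GCN/Doc2Vec article vectors.
article_vecs = {
    "Graph theory":  np.array([0.9, 0.1, 0.0]),
    "Shortest path": np.array([0.8, 0.2, 0.1]),
    "Impressionism": np.array([0.0, 0.1, 0.9]),
    "Claude Monet":  np.array([0.1, 0.0, 0.8]),
}

def recommend(edited_titles, k=2):
    """Rank unedited articles by cosine similarity to the editor's profile vector."""
    profile = np.mean([article_vecs[t] for t in edited_titles], axis=0)
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    candidates = [t for t in article_vecs if t not in edited_titles]
    return sorted(candidates, key=lambda t: cosine(profile, article_vecs[t]),
                  reverse=True)[:k]

print(recommend(["Graph theory"]))  # -> ['Shortest path', ...]
```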
- Multiple Texts as a Limiting Factor in Online Learning: Quantifying (Dis-)similarities of Knowledge Networks across Languages [60.00219873112454]
We investigate the hypothesis that the extent to which one obtains information on a given topic through Wikipedia depends on the language in which it is consulted.
Since Wikipedia is a central part of the web-based information landscape, this indicates a language-related, linguistic bias.
The article builds a bridge between reading research, educational science, Wikipedia research and computational linguistics.
arXiv Detail & Related papers (2020-08-05T11:11:55Z)
- Entity Extraction from Wikipedia List Pages [2.3605348648054463]
We build a large taxonomy from categories and list pages with DBpedia as a backbone.
With distant supervision, we extract training data for the identification of new entities in list pages.
We extend DBpedia with 7.5M new type statements and 3.8M new facts of high precision.
arXiv Detail & Related papers (2020-03-11T07:48:46Z)
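The distant-supervision step in the entry above can be pictured as follows: list-page entries whose linked article is already a known DBpedia entity of the page's expected type become positive training examples, while unknown entries are the candidates a classifier must label. The dictionaries below are invented stand-ins for the DBpedia taxonomy and a hypothetical "List of physicists" page.

```python
# Invented stand-ins for known DBpedia types and a list page's entries.
dbpedia_types = {
    "Albert Einstein": "Physicist",
    "Marie Curie": "Physicist",
    "Pablo Picasso": "Painter",
}
list_page_entries = ["Albert Einstein", "Marie Curie", "Jane Newscientist", "Pablo Picasso"]
expected_type = "Physicist"

positives, candidates = [], []
for entry in list_page_entries:
    if dbpedia_types.get(entry) == expected_type:
        positives.append(entry)   # distant positive label
    elif entry not in dbpedia_types:
        candidates.append(entry)  # unknown entity: target for the trained classifier
# Entries with a known but conflicting type (e.g. a painter) are ignored here.

print(positives)   # ['Albert Einstein', 'Marie Curie']
print(candidates)  # ['Jane Newscientist']
```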
- WikiHist.html: English Wikipedia's Full Revision History in HTML Format [12.86558129722198]
We develop a parallelized architecture for parsing massive amounts of wikitext using local instances of MediaWiki.
We highlight the advantages of WikiHist.html over raw wikitext in an empirical analysis of Wikipedia's hyperlinks.
arXiv Detail & Related papers (2020-01-28T10:44:43Z)
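One concrete way the HTML version helps with the hyperlink analysis mentioned above: some links only materialize once wikitext is rendered (e.g. links produced by templates), so counting links in raw wikitext misses them. The regex-based comparison below is a simplification (a real analysis would use proper wikitext and HTML parsers), and the example strings are invented.

```python
import re

wikitext = "{{Infobox country|capital=[[Paris]]}} France borders [[Spain]]."
# Rendered HTML (as in WikiHist.html) also contains links generated by the template.
html = ('<a href="/wiki/Paris">Paris</a> ... <a href="/wiki/Spain">Spain</a> '
        '<a href="/wiki/Europe">Europe</a>')

wikitext_links = set(re.findall(r"\[\[([^\]|]+)", wikitext))
html_links = set(re.findall(r'href="/wiki/([^"]+)"', html))

print(sorted(wikitext_links))               # ['Paris', 'Spain']
print(sorted(html_links))                   # ['Europe', 'Paris', 'Spain']
print(sorted(html_links - wikitext_links))  # links only visible after rendering
```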
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences.