The URW-KG: a Resource for Tackling the Underrepresentation of
non-Western Writers
- URL: http://arxiv.org/abs/2212.13104v1
- Date: Wed, 21 Dec 2022 07:53:26 GMT
- Title: The URW-KG: a Resource for Tackling the Underrepresentation of
non-Western Writers
- Authors: Marco Antonio Stranisci, Giuseppe Spillo, Cataldo Musto, Viviana
Patti, Rossana Damiano
- Abstract summary: We present the Under-Represented Writers Knowledge Graph (URW-KG), a resource designed to explore and possibly amend the under-representation of non-Western writers in digital archives.
The integrated information encoded in the graph allows scholars and users to be more easily exposed to non-Western literary works and authors.
- Score: 2.983639510410386
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Digital media have enabled access to unprecedented literary knowledge.
Authors, readers, and scholars are now able to discover and share an increasing
amount of information about books and their authors. Nevertheless, digital
archives are still unbalanced: writers from non-Western countries are less
represented, and this imbalance perpetuates old forms of
discrimination. In this paper, we present the Under-Represented Writers
Knowledge Graph (URW-KG), a resource designed to explore and possibly amend
this lack of representation by gathering and mapping information about works
and authors from Wikidata and three other sources: Open Library, Goodreads, and
Google Books. The experiments based on KG embeddings showed that the integrated
information encoded in the graph allows scholars and users to be more easily
exposed to non-Western literary works and authors with respect to Wikidata
alone. This opens the way to the development of fairer and more effective tools for author
discovery and exploration.
Related papers
- On the culture of open access: the Sci-hub paradox [0.0]
Shadow libraries are online collections of copyrighted publications that have been made available for free without the permission of the copyright holders.
This study shows that OA publications, including those in fully OA journals, receive more citations than their subscription-based counterparts.
Distinguishing, among subscription-based publications, between those that are and are not accessible via the Sci-Hub platform suggests that widespread use of Sci-Hub cancels the positive effect of OA publishing.
arXiv Detail & Related papers (2023-08-30T07:50:56Z)
- The World Literature Knowledge Graph [2.9441626898733153]
The World Literature Knowledge Graph is a semantic resource containing 194,346 writers and 965,210 works.
The knowledge graph integrates information about the reception of literary works gathered from 3 different communities of readers.
arXiv Detail & Related papers (2023-07-31T13:41:31Z)
- PubGraph: A Large-Scale Scientific Knowledge Graph [11.240833731512609]
PubGraph is a new resource for studying scientific progress that takes the form of a large-scale knowledge graph.
PubGraph is comprehensive and unifies data from various sources, including Wikidata, OpenAlex, and Semantic Scholar.
We create several large-scale benchmarks extracted from PubGraph for the core task of knowledge graph completion.
arXiv Detail & Related papers (2023-02-04T20:03:55Z)
- The Semantic Scholar Open Data Platform [79.4493235243312]
Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.
We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction.
The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.
arXiv Detail & Related papers (2023-01-24T17:13:08Z)
- Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
arXiv Detail & Related papers (2022-10-26T02:25:23Z)
- DeepShovel: An Online Collaborative Platform for Data Extraction in Geoscience Literature with AI Assistance [48.55345030503826]
Geoscientists need to read a huge amount of literature to locate, extract, and aggregate relevant results and data.
DeepShovel is a publicly-available AI-assisted data extraction system to support their needs.
A follow-up user evaluation with 14 researchers suggested DeepShovel improved users' efficiency of data extraction for building scientific databases.
arXiv Detail & Related papers (2022-02-21T12:18:08Z)
- Assessing the quality of sources in Wikidata across languages: a hybrid approach [64.05097584373979]
We run a series of microtasks experiments to evaluate a large corpus of references, sampled from Wikidata triples with labels in several languages.
We use a consolidated, curated version of the crowdsourced assessments to train several machine learning models to scale up the analysis to the whole of Wikidata.
The findings help us ascertain the quality of references in Wikidata, and identify common challenges in defining and capturing the quality of user-generated multilingual structured data on the web.
arXiv Detail & Related papers (2021-09-20T10:06:46Z)
- Bridger: Toward Bursting Scientific Filter Bubbles and Boosting Innovation via Novel Author Discovery [22.839876884227536]
Bridger is a system for facilitating discovery of scholars and their work.
We construct a faceted representation of authors using information extracted from their papers and inferred personas.
We develop an approach that locates commonalities and contrasts between scientists.
arXiv Detail & Related papers (2021-08-12T11:24:23Z)
- How to Train Your Agent to Read and Write [52.24605794920856]
Reading and writing research papers is one of the most privileged abilities that a qualified researcher should master.
It would be fascinating if we could train an intelligent agent to help people read and summarize papers, and perhaps even discover and exploit the potential knowledge clues to write novel papers.
We propose a Deep ReAder-Writer (DRAW) network, which consists of a Reader that can extract knowledge graphs (KGs) from input paragraphs and discover potential knowledge, a graph-to-text Writer that generates a novel paragraph, and a ...
arXiv Detail & Related papers (2021-01-04T12:22:04Z)
- MedLatinEpi and MedLatinLit: Two Datasets for the Computational Authorship Analysis of Medieval Latin Texts [72.16295267480838]
We present and make available MedLatinEpi and MedLatinLit, two datasets of medieval Latin texts to be used in research on computational authorship analysis.
MedLatinEpi and MedLatinLit consist of 294 and 30 curated texts, respectively, labelled by author; MedLatinEpi texts are of epistolary nature, while MedLatinLit texts consist of literary comments and treatises about various subjects.
arXiv Detail & Related papers (2020-06-22T14:22:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.