O-Dang! The Ontology of Dangerous Speech Messages
- URL: http://arxiv.org/abs/2207.10652v1
- Date: Wed, 13 Jul 2022 11:50:05 GMT
- Title: O-Dang! The Ontology of Dangerous Speech Messages
- Authors: Marco A. Stranisci, Simona Frenda, Mirko Lai, Oscar Araque, Alessandra
T. Cignarella, Valerio Basile, Viviana Patti, Cristina Bosco
- Abstract summary: We present O-Dang!: The Ontology of Dangerous Speech Messages, a systematic and interoperable Knowledge Graph (KG).
O-Dang! is designed to gather and organize Italian datasets into a structured KG, according to the principles shared within the Linguistic Linked Open Data community.
It provides a model for encoding both gold standard and single-annotator labels in the KG.
- Score: 53.15616413153125
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Within the NLP community, a considerable number of language resources are
created, annotated, and released every day with the aim of studying specific
linguistic phenomena. Despite a variety of attempts to organize such resources,
systematic methods and interoperability between resources are still lacking.
Furthermore, when storing linguistic information, the most common practice is
still the concept of a "gold standard", which is at odds with recent trends in
NLP that stress the importance of different subjectivities and points of view
when training machine learning and deep learning methods. In this paper we present
O-Dang!: The Ontology of Dangerous Speech Messages, a systematic and
interoperable Knowledge Graph (KG) for the collection of linguistically annotated
data. O-Dang! is designed to gather and organize Italian datasets into a
structured KG, according to the principles shared within the Linguistic Linked
Open Data community. The ontology has also been designed to account for a
perspectivist approach, since it provides a model for encoding both gold
standard and single-annotator labels in the KG. The paper is structured as
follows. Section 1 outlines the motivations of our work. Section 2 describes
the O-Dang! Ontology, which provides a common semantic model for the
integration of datasets in the KG. The Ontology Population stage with
information about corpora, users, and annotations is presented in Section 3.
Finally, in Section 4 an analysis of offensiveness across corpora is provided
as a first case study for the resource.
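A minimal sketch of how such a perspectivist encoding can look in RDF is given below, using rdflib. The namespace and the class and property names (Message, Annotation, goldLabel, hasAnnotator, and so on) are placeholders chosen for illustration, not the vocabulary actually defined by the O-Dang! Ontology.

```python
# Minimal sketch of a perspectivist KG encoding; the class/property names
# are hypothetical placeholders, not the official O-Dang! vocabulary.
from rdflib import Graph, Literal, Namespace, RDF

ODANG = Namespace("http://example.org/odang/")  # placeholder namespace

g = Graph()
g.bind("odang", ODANG)

msg = ODANG["message_001"]
g.add((msg, RDF.type, ODANG.Message))
g.add((msg, ODANG.text, Literal("example message text")))

# Gold standard label attached directly to the message.
g.add((msg, ODANG.goldLabel, Literal("offensive")))

# Single-annotator labels kept as separate annotation nodes,
# so diverging perspectives survive alongside the gold label.
for annotator, label in [("annotator_A", "offensive"), ("annotator_B", "not_offensive")]:
    ann = ODANG[f"annotation_{annotator}"]
    g.add((ann, RDF.type, ODANG.Annotation))
    g.add((ann, ODANG.aboutMessage, msg))
    g.add((ann, ODANG.hasAnnotator, ODANG[annotator]))
    g.add((ann, ODANG.label, Literal(label)))

print(g.serialize(format="turtle"))
```

Keeping each annotator's judgement as its own node lets a consumer of the KG query either the aggregated gold label or the individual perspectives.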
Related papers
- Leveraging Ontologies to Document Bias in Data [1.0635248457021496]
Doc-BiasO is a resource that aims to create an integrated vocabulary of biases defined in the fair-ML literature and their measures.
Our main objective is to contribute towards clarifying existing terminology on bias research as it rapidly expands to all areas of AI.
arXiv Detail & Related papers (2024-06-29T18:41:07Z)
- EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs [41.928535719157054]
We propose an initial comprehensive framework called EventGround to tackle the problem of grounding free text to eventuality-centric knowledge graphs.
We provide simple yet effective parsing and partial information extraction methods to tackle these problems.
Our framework, incorporating grounded knowledge, achieves state-of-the-art performance while providing interpretable evidence.
arXiv Detail & Related papers (2024-03-30T01:16:37Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora and report superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models on dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Topics as Entity Clusters: Entity-based Topics from Large Language Models and Graph Neural Networks [0.6486052012623045]
We propose a novel topic clustering approach using bimodal vector representations of entities.
Our approach is better suited to working with entities in comparison to state-of-the-art models.
arXiv Detail & Related papers (2023-01-06T10:54:54Z)
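As a rough illustration of the entity-clustering idea in the entry above (Topics as Entity Clusters), the sketch below clusters toy bimodal entity vectors, built by concatenating two embedding views per entity, with k-means. The entity names, random vectors, and number of clusters are all made up for the example; the actual approach in the paper is considerably more elaborate.

```python
# Toy sketch: topics as clusters of entities with bimodal (two-view) vectors.
# The vectors are random stand-ins for language-model and graph-based embeddings.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
entities = ["inflation", "interest_rate", "vaccine", "virus", "election", "parliament"]

text_view = rng.normal(size=(len(entities), 16))   # e.g. language-model embeddings
graph_view = rng.normal(size=(len(entities), 16))  # e.g. graph-based embeddings
bimodal = np.concatenate([text_view, graph_view], axis=1)

topics = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(bimodal)
for topic_id in range(3):
    members = [e for e, t in zip(entities, topics) if t == topic_id]
    print(f"topic {topic_id}: {members}")
```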
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
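The KGAN entry above combines context-, syntax-, and knowledge-based views of the input. The sketch below shows one generic way such views could be fused for aspect-based sentiment classification, by concatenating three precomputed feature vectors; the dimensions and the concatenation-based fusion are assumptions for illustration and do not reproduce the KGAN architecture.

```python
# Generic multiview fusion sketch (not the KGAN architecture itself):
# three precomputed views of an aspect/sentence pair are concatenated
# and fed to a small classifier.
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    def __init__(self, dim=128, num_classes=3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(3 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, num_classes),
        )

    def forward(self, context_feat, syntax_feat, knowledge_feat):
        fused = torch.cat([context_feat, syntax_feat, knowledge_feat], dim=-1)
        return self.classifier(fused)

model = MultiViewFusion()
batch = [torch.randn(4, 128) for _ in range(3)]  # dummy context/syntax/knowledge features
logits = model(*batch)
print(logits.shape)  # torch.Size([4, 3])
```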
- Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training [22.534866015730664]
We verbalize the entire English Wikidata KG.
We show that verbalizing a comprehensive, encyclopedic KG like Wikidata can be used to integrate structured KGs and natural language corpora.
arXiv Detail & Related papers (2020-10-23T22:14:50Z)
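The verbalization entry above turns KG triples into natural-language sentences that can be mixed with ordinary text for pre-training. The snippet below is a naive, template-based stand-in for that idea over a few hand-written triples; the relations and templates are invented, and the original work relies on a trained verbalizer rather than fixed templates.

```python
# Naive template-based KG verbalization: each (subject, relation, object)
# triple becomes a short sentence that could be added to a text corpus.
TEMPLATES = {
    "capital_of": "{subj} is the capital of {obj}.",
    "author_of": "{subj} is the author of {obj}.",
    "born_in": "{subj} was born in {obj}.",
}

triples = [
    ("Rome", "capital_of", "Italy"),
    ("Italo Calvino", "author_of", "Invisible Cities"),
    ("Italo Calvino", "born_in", "Cuba"),
]

def verbalize(subj: str, rel: str, obj: str) -> str:
    return TEMPLATES[rel].format(subj=subj, obj=obj)

corpus = [verbalize(*t) for t in triples]
print("\n".join(corpus))
```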
- Computational linguistic assessment of textbook and online learning media by means of threshold concepts in business education [59.003956312175795]
From a linguistic perspective, threshold concepts are instances of specialized vocabularies, exhibiting particular linguistic features.
The profiles of 63 threshold concepts from business education have been investigated in textbooks, newspapers, and Wikipedia.
The three kinds of resources can indeed be distinguished in terms of their threshold concepts' profiles.
arXiv Detail & Related papers (2020-08-05T12:56:16Z)
- Cross-lingual Entity Alignment with Incidental Supervision [76.66793175159192]
We propose an incidentally supervised model, JEANS, which jointly represents multilingual KGs and text corpora in a shared embedding scheme.
Experiments on benchmark datasets show that JEANS leads to promising improvement on entity alignment with incidental supervision.
arXiv Detail & Related papers (2020-05-01T01:53:56Z)
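As a very loose illustration of the shared-embedding idea in the last entry (JEANS), the sketch below aligns two toy embedding spaces with an orthogonal Procrustes mapping estimated from a handful of anchor pairs, which stand in for the incidental supervision; the actual model jointly learns KG and text embeddings rather than aligning fixed spaces after the fact.

```python
# Toy alignment of two embedding spaces (e.g. a KG space and a text space)
# via orthogonal Procrustes, using a few anchor pairs as weak supervision.
import numpy as np

rng = np.random.default_rng(42)
dim = 8

kg_anchor = rng.normal(size=(5, dim))          # embeddings of 5 anchor entities in the KG space
true_rotation = np.linalg.qr(rng.normal(size=(dim, dim)))[0]
text_anchor = kg_anchor @ true_rotation        # the same entities as seen from the text space

# Orthogonal Procrustes: find W minimizing ||kg_anchor @ W - text_anchor||.
u, _, vt = np.linalg.svd(kg_anchor.T @ text_anchor)
W = u @ vt

projected = kg_anchor @ W                      # KG entities mapped into the shared/text space
print("alignment error:", np.linalg.norm(projected - text_anchor))
```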