Situated Ground Truths: Enhancing Bias-Aware AI by Situating Data Labels with SituAnnotate
- URL: http://arxiv.org/abs/2406.07583v1
- Date: Mon, 10 Jun 2024 09:33:13 GMT
- Title: Situated Ground Truths: Enhancing Bias-Aware AI by Situating Data Labels with SituAnnotate
- Authors: Delfina Sol Martinez Pandiani, Valentina Presutti,
- Abstract summary: SituAnnotate is a novel ontology-based approach to structured and context-aware data annotation.
It aims to anchor the ground truth data employed in training AI systems within the contextual and culturally-bound situations.
As a method to create, query, and compare label-based datasets, SituAnnotate empowers downstream AI systems to undergo training with explicit consideration of context and cultural bias.
- Score: 0.1843404256219181
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the contemporary world of AI and data-driven applications, supervised machines often derive their understanding, which they mimic and reproduce, through annotations--typically conveyed in the form of words or labels. However, such annotations are often divorced from or lack contextual information, and as such hold the potential to inadvertently introduce biases when subsequently used for training. This paper introduces SituAnnotate, a novel ontology explicitly crafted for 'situated grounding,' aiming to anchor the ground truth data employed in training AI systems within the contextual and culturally-bound situations from which those ground truths emerge. SituAnnotate offers an ontology-based approach to structured and context-aware data annotation, addressing potential bias issues associated with isolated annotations. Its representational power encompasses situational context, including annotator details, timing, location, remuneration schemes, annotation roles, and more, ensuring semantic richness. Aligned with the foundational Dolce Ultralight ontology, it provides a robust and consistent framework for knowledge representation. As a method to create, query, and compare label-based datasets, SituAnnotate empowers downstream AI systems to undergo training with explicit consideration of context and cultural bias, laying the groundwork for enhanced system interpretability and adaptability, and enabling AI models to align with a multitude of cultural contexts and viewpoints.
Related papers
- Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models [0.65268245109828]
We introduce the notion of contextual diversity for active learning CDAL.
We propose a data repair algorithm to curate contextually fair data to reduce model bias.
We are working on developing image retrieval system for wildlife camera trap images and reliable warning system for poor quality rural roads.
arXiv Detail & Related papers (2024-11-04T09:43:33Z) - Leveraging Ontologies to Document Bias in Data [1.0635248457021496]
Doc-BiasO is a resource that aims to create an integrated vocabulary of biases defined in the textitfair-ML literature and their measures.
Our main objective is to contribute towards clarifying existing terminology on bias research as it rapidly expands to all areas of AI.
arXiv Detail & Related papers (2024-06-29T18:41:07Z) - Data Set Terminology of Deep Learning in Medicine: A Historical Review and Recommendation [0.7897552065199818]
Medicine and deep learning-based artificial intelligence engineering represent two distinct fields each with decades of published history.
With such history comes a set of terminology that has a specific way in which it is applied.
This review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical AI contexts.
arXiv Detail & Related papers (2024-04-30T07:07:45Z) - Capturing Pertinent Symbolic Features for Enhanced Content-Based
Misinformation Detection [0.0]
The detection of misleading content presents a significant hurdle due to its extreme linguistic and domain variability.
This paper analyzes the linguistic attributes that characterize this phenomenon and how representative of such features some of the most popular misinformation datasets are.
We demonstrate that the appropriate use of pertinent symbolic knowledge in combination with neural language models is helpful in detecting misleading content.
arXiv Detail & Related papers (2024-01-29T16:42:34Z) - Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z) - Gradient Imitation Reinforcement Learning for General Low-Resource
Information Extraction [80.64518530825801]
We develop a Gradient Reinforcement Learning (GIRL) method to encourage pseudo-labeled data to imitate the gradient descent direction on labeled data.
We also leverage GIRL to solve all IE sub-tasks (named entity recognition, relation extraction, and event extraction) in low-resource settings.
arXiv Detail & Related papers (2022-11-11T05:37:19Z) - Contextual information integration for stance detection via
cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z) - Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph
Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z) - Learning Attention-based Representations from Multiple Patterns for
Relation Prediction in Knowledge Graphs [2.4028383570062606]
AEMP is a novel model for learning contextualized representations by acquiring entities' context information.
AEMP either outperforms or competes with state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-07T10:53:35Z) - Knowledge Graph Augmented Network Towards Multiview Representation
Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z) - Out of Context: A New Clue for Context Modeling of Aspect-based
Sentiment Analysis [54.735400754548635]
ABSA aims to predict the sentiment expressed in a review with respect to a given aspect.
The given aspect should be considered as a new clue out of context in the context modeling process.
We design several aspect-aware context encoders based on different backbones.
arXiv Detail & Related papers (2021-06-21T02:26:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.