Uncertainty in Automated Ontology Matching: Lessons Learned from an
Empirical Experimentation
- URL: http://arxiv.org/abs/2310.11723v1
- Date: Wed, 18 Oct 2023 05:42:51 GMT
- Title: Uncertainty in Automated Ontology Matching: Lessons Learned from an
Empirical Experimentation
- Authors: Inès Osman, Salvatore F. Pileggi, Sadok Ben Yahia
- Abstract summary: Ontologies play a critical role in linking and semantically integrating datasets via interoperability.
This paper approaches data integration from an application perspective, looking at techniques based on ontology matching.
- Score: 6.491645162078057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data integration is considered a classic research field and a pressing need
within the information science community. Ontologies play a critical role in
such a process by providing well-consolidated support to link and semantically
integrate datasets via interoperability. This paper approaches data integration
from an application perspective, looking at techniques based on ontology
matching. An ontology-based process may only be considered adequate by assuming
manual matching of different sources of information. However, since the
approach becomes unrealistic once the system scales up, automation of the
matching process becomes a compelling need. Therefore, we have conducted
experiments on actual data with the support of existing tools for automatic
ontology matching from the scientific community. Even considering a relatively
simple case study (i.e., the spatio-temporal alignment of global indicators),
outcomes clearly show significant uncertainty resulting from errors and
inaccuracies along the automated matching process. More concretely, this paper
aims to test on real-world data a bottom-up knowledge-building approach,
discuss the lessons learned from the experimental results of the case study,
and draw conclusions about uncertainty and uncertainty management in an
automated ontology matching process. While the most common evaluation metrics
clearly demonstrate the unreliability of fully automated matching solutions,
properly designed semi-supervised approaches seem to be mature for a more
generalized application.
Related papers
- Towards Explainable Automated Data Quality Enhancement without Domain Knowledge [0.0]
We propose a comprehensive framework designed to automatically assess and rectify data quality issues in any given dataset.
Our primary objective is to address three fundamental types of defects: absence, redundancy, and incoherence.
We adopt a hybrid approach that integrates statistical methods with machine learning algorithms.
arXiv Detail & Related papers (2024-09-16T10:08:05Z)
- Perception Matters: Enhancing Embodied AI with Uncertainty-Aware Semantic Segmentation [24.32551050538683]
Embodied AI has made significant progress acting in unexplored environments.
Current search methods largely focus on dated perception models, neglect temporal aggregation, and transfer from ground truth directly to noisy perception at test time.
We address the identified problems through calibrated perception probabilities and uncertainty across aggregation and found decisions.
arXiv Detail & Related papers (2024-08-05T08:14:28Z)
- Interactive Ontology Matching with Cost-Efficient Learning [2.006461411375746]
This work introduces DualLoop, an active learning method tailored to matching.
Compared to existing active learning methods, we consistently achieved better F1 scores and recall.
We report our operational performance results within the Architecture, Engineering, Construction (AEC) industry sector.
arXiv Detail & Related papers (2024-04-11T11:53:14Z)
- Self-consistent Validation for Machine Learning Electronic Structure [81.54661501506185]
The method integrates machine learning with self-consistent field methods to achieve both low validation cost and interpretability.
This, in turn, enables exploration of the model's ability with active learning and instills confidence in its integration into real-world studies.
arXiv Detail & Related papers (2024-02-15T18:41:35Z)
- ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation [2.1517210693540005]
Uncertainty estimation is an essential and heavily-studied component for semantic segmentation methods.
Can data-related and model-related uncertainty really be separated in practice?
Which components of an uncertainty method are essential for real-world performance?
arXiv Detail & Related papers (2024-01-16T17:02:21Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)