Learning to Characterize Matching Experts
- URL: http://arxiv.org/abs/2012.01229v1
- Date: Wed, 2 Dec 2020 14:16:38 GMT
- Title: Learning to Characterize Matching Experts
- Authors: Roee Shraga, Ofra Amir, Avigdor Gal
- Abstract summary: We characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid.
We show that our approach can improve matching results by filtering out inexpert matchers.
- Score: 19.246576904646172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Matching is a task at the heart of any data integration process, aimed at
identifying correspondences among data elements. Matching problems were
traditionally solved in a semi-automatic manner, with correspondences being
generated by matching algorithms and outcomes subsequently validated by human
experts. Human-in-the-loop data integration has recently been challenged by the
introduction of big data, and recent studies have analyzed obstacles to
effective human matching and validation. In this work we characterize human
matching experts: those humans whose proposed correspondences can mostly be
trusted to be valid. We provide a novel framework for characterizing matching
experts that, together with a new set of features, can be used to identify
reliable and valuable human experts. We demonstrate the usefulness of our
approach using an extensive empirical evaluation. In particular, we show that
our approach can improve matching results by filtering out inexpert matchers.
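The abstract stops at the idea; as a minimal sketch of the filtering step, assuming hypothetical behavioral features and thresholds (`predict_expertise`, `min_expertise`, and `min_votes` are illustrative stand-ins, not the paper's framework), correspondences can be kept only when enough predicted experts propose them:

```python
from collections import Counter

def predict_expertise(features):
    # Stand-in scorer: the paper learns a characterization from behavioral
    # features; here we just average toy feature values to mark the slot.
    return sum(features.values()) / len(features)

def filter_and_aggregate(matchers, min_expertise=0.5, min_votes=2):
    """Keep correspondences proposed by enough predicted experts."""
    votes = Counter()
    for m in matchers:
        if predict_expertise(m["features"]) >= min_expertise:
            votes.update(m["correspondences"])  # (source, target) pairs
    return {c for c, n in votes.items() if n >= min_votes}

matchers = [
    {"features": {"consistency": 0.9, "confidence": 0.8},
     "correspondences": [("emp.name", "person.full_name")]},
    {"features": {"consistency": 0.2, "confidence": 0.3},
     "correspondences": [("emp.name", "person.id")]},
    {"features": {"consistency": 0.7, "confidence": 0.9},
     "correspondences": [("emp.name", "person.full_name")]},
]
print(filter_and_aggregate(matchers))  # {('emp.name', 'person.full_name')}
```

The averaged score is only a placeholder for where the paper's learned expert characterization would plug in.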
Related papers
- Learning to Defer for Causal Discovery with Imperfect Experts [59.071731337922664]
We propose L2D-CD, a method for gauging the correctness of expert recommendations and optimally combining them with data-driven causal discovery results.
We evaluate L2D-CD on the canonical Tübingen pairs dataset and demonstrate its superior performance compared to both the causal discovery method and the expert used in isolation.
arXiv Detail & Related papers (2025-02-18T18:55:53Z)
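A toy sketch of the general learning-to-defer idea behind the paper above (not the L2D-CD method itself; the deferral rule, reliability value, and data are invented): defer to the expert's edge orientation when the expert's estimated correctness beats the algorithm's confidence.

```python
def combine_orientations(edges, expert, algorithm, reliability):
    """Defer to the expert on an edge when their estimated correctness
    (reliability) exceeds the algorithm's confidence for that edge."""
    combined = {}
    for edge in edges:
        algo_dir, algo_conf = algorithm[edge]  # (orientation, confidence)
        expert_dir = expert.get(edge)
        if expert_dir is not None and reliability > algo_conf:
            combined[edge] = expert_dir
        else:
            combined[edge] = algo_dir
    return combined

edges = [("X", "Y"), ("Y", "Z")]
algorithm = {("X", "Y"): ("X->Y", 0.90), ("Y", "Z"): ("Z->Y", 0.55)}
expert = {("Y", "Z"): "Y->Z"}  # expert only weighs in on one edge
print(combine_orientations(edges, expert, algorithm, reliability=0.7))
# {('X', 'Y'): 'X->Y', ('Y', 'Z'): 'Y->Z'}
```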
- exHarmony: Authorship and Citations for Benchmarking the Reviewer Assignment Problem [11.763640675057076]
We develop a benchmark dataset for evaluating the reviewer assignment problem without needing explicit labels.
We benchmark various methods, including traditional lexical matching, static neural embeddings, and contextualized neural embeddings.
Our results indicate that while traditional methods perform reasonably well, contextualized embeddings trained on scholarly literature show the best performance.
arXiv Detail & Related papers (2025-02-11T16:35:04Z)
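As a sketch of the "traditional lexical matching" baseline the benchmark above compares against (profiles and query are toy data; exHarmony's actual pipeline is not shown here), reviewers can be ranked by TF-IDF cosine similarity to a submission:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy reviewer "profiles" built from their past paper text (illustrative).
reviewer_names = ["rev_a", "rev_b"]
reviewer_profiles = [
    "schema matching data integration human matching experts",
    "speech deepfake detection mixture of experts audio",
]
submission = "characterizing human matching experts for data integration"

vec = TfidfVectorizer()
reviewer_vecs = vec.fit_transform(reviewer_profiles)
submission_vec = vec.transform([submission])
scores = cosine_similarity(submission_vec, reviewer_vecs)[0]
best = scores.argmax()
print(reviewer_names[best], round(float(scores[best]), 3))  # rev_a ranks first
```

Swapping the vectorizer for static or contextualized embeddings gives the stronger conditions the benchmark reports.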
- Leveraging Mixture of Experts for Improved Speech Deepfake Detection [53.69740463004446]
Speech deepfakes pose a significant threat to personal security and content authenticity.
We introduce a novel approach for enhancing speech deepfake detection performance using a Mixture of Experts architecture.
arXiv Detail & Related papers (2024-09-24T13:24:03Z)
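The abstract above does not detail the architecture; the following is a generic mixture-of-experts head, a sketch under invented dimensions, in which a gating network softly weights per-expert real/fake logits:

```python
import torch
import torch.nn as nn

class MoEHead(nn.Module):
    """Generic MoE head: not the paper's model, dimensions are illustrative."""
    def __init__(self, in_dim, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, 2) for _ in range(n_experts)])  # real/fake logits
        self.gate = nn.Linear(in_dim, n_experts)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)             # (B, E)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, 2)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)          # (B, 2)

head = MoEHead(in_dim=128)
logits = head(torch.randn(8, 128))  # batch of 8 utterance embeddings
print(logits.shape)                 # torch.Size([8, 2])
```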
- Annotator in the Loop: A Case Study of In-Depth Rater Engagement to Create a Bridging Benchmark Dataset [1.825224193230824]
We describe a novel, collaborative, and iterative annotator-in-the-loop methodology for annotation.
Our findings indicate that collaborative engagement with annotators can enhance annotation methods.
arXiv Detail & Related papers (2024-08-01T19:11:08Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
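In the spirit of the paper above, though not its exact procedure (the distance measure, the direction of the rule, and the threshold are all assumptions), a detector can flag inputs whose predictions respond unusually when the universal perturbation is applied:

```python
import numpy as np

def flag_adversarial(clean_probs, uap_probs, threshold=0.3):
    """Flag inputs whose class distribution shifts by more than `threshold`
    (L1 distance) when the universal perturbation is added. Both the
    distance and the threshold are illustrative assumptions."""
    shift = np.abs(clean_probs - uap_probs).sum(axis=-1)
    return shift > threshold

clean = np.array([0.9, 0.1])      # model output on the original input
perturbed = np.array([0.4, 0.6])  # model output after adding the UAP
print(flag_adversarial(clean, perturbed))  # True: unusually large shift
```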
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
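The easy/hard error rates above come from pairwise comparisons; a sketch of how such a pairwise ordering error rate can be computed (function name and toy data are illustrative, not the dataset's evaluation code):

```python
def pairwise_error_rate(self_reported, predicted, triples):
    """Fraction of (paper_a, paper_b, reviewer) triples where a similarity
    method orders the two papers differently than the reviewer's own
    self-reported expertise scores."""
    errors = 0
    for a, b, r in triples:
        truth = self_reported[r][a] > self_reported[r][b]
        guess = predicted[r][a] > predicted[r][b]
        errors += truth != guess
    return errors / len(triples)

self_reported = {"rev1": {"p1": 4, "p2": 2}}   # reviewer rates p1 higher
predicted = {"rev1": {"p1": 0.3, "p2": 0.7}}   # method disagrees
print(pairwise_error_rate(self_reported, predicted, [("p1", "p2", "rev1")]))
# 1.0
```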
- A Unified Comparison of User Modeling Techniques for Predicting Data Interaction and Detecting Exploration Bias [17.518601254380275]
We compare and rank eight user modeling algorithms based on their performance on a diverse set of four user study datasets.
Based on our findings, we highlight open challenges and new directions for analyzing user interactions and visualization provenance.
arXiv Detail & Related papers (2022-08-09T19:51:10Z)
- PoWareMatch: a Quality-aware Deep Learning Approach to Improve Human Schema Matching [20.110234122423172]
We examine a novel angle on the behavior of humans as matchers, studying match creation as a process.
We design PoWareMatch that makes use of a deep learning mechanism to calibrate and filter human matching decisions.
PoWareMatch accurately predicts the benefit of extending the match with an additional correspondence and generates high-quality matches.
arXiv Detail & Related papers (2021-09-15T14:24:56Z)
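A minimal sketch of the calibrate-and-filter idea in the paper above, assuming a stand-in benefit predictor (the real PoWareMatch component is a learned deep model; `predict_benefit` here is a placeholder): process the human's decisions in order and keep a correspondence only when its predicted benefit is positive.

```python
def predict_benefit(match, correspondence, confidence):
    # Placeholder: the learned model would score how much adding this
    # correspondence is expected to improve the quality of `match`.
    return confidence - 0.5

def build_match(decisions):
    """decisions: ordered (correspondence, human confidence) pairs."""
    match = []
    for corr, conf in decisions:
        if predict_benefit(match, corr, conf) > 0:
            match.append(corr)
    return match

decisions = [(("emp.name", "person.full_name"), 0.9),
             (("emp.name", "person.id"), 0.3)]
print(build_match(decisions))  # [('emp.name', 'person.full_name')]
```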
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
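One concrete flavor of combining the two attribution views above, sketched on a toy model (a TracIn-style gradient dot product for instance attribution plus plain input gradients for feature attribution; the paper's exact training-feature attribution method is not reproduced, and model and data are stand-ins):

```python
import torch

model = torch.nn.Linear(3, 1)
loss_fn = torch.nn.BCEWithLogitsLoss()

def grad_vector(x, y):
    # Per-example parameter gradient, flattened into one vector.
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

train = [(torch.randn(1, 3), torch.ones(1, 1)) for _ in range(4)]
probe_x, probe_y = torch.randn(1, 3), torch.ones(1, 1)

# Instance attribution: training examples whose gradients align most with
# the probe's gradient are candidate sources of the behavior, and, when
# inspected, candidate artifacts.
g_probe = grad_vector(probe_x, probe_y)
influence = [float(g_probe @ grad_vector(x, y)) for x, y in train]
print(max(range(len(train)), key=lambda i: influence[i]))

# Feature attribution: input gradients for the probe example.
probe_x.requires_grad_(True)
loss_fn(model(probe_x), probe_y).backward()
print(probe_x.grad)  # which input features drove the loss
```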
- Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations.
We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
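A generic sketch of adversarial domain generalization via gradient reversal, DANN-style, matching the high-level idea of object-invariant features in the paper above (dimensions, heads, and the overall wiring are assumptions, not the paper's framework): an object classifier is adversarially prevented from recovering the object category from the features used for predicate prediction.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad  # flip gradients flowing into the feature extractor

features = nn.Linear(256, 128)       # stand-in feature extractor
predicate_head = nn.Linear(128, 24)  # predicts the interaction predicate
object_head = nn.Linear(128, 80)     # adversary: predicts object category

x = torch.randn(16, 256)
pred_labels = torch.randint(0, 24, (16,))
obj_labels = torch.randint(0, 80, (16,))

f = features(x)
loss = nn.functional.cross_entropy(predicate_head(f), pred_labels) \
     + nn.functional.cross_entropy(object_head(GradReverse.apply(f)), obj_labels)
loss.backward()  # object gradients are reversed into `features`,
                 # pushing them toward object-invariance
```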
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.