Learning to Characterize Matching Experts
- URL: http://arxiv.org/abs/2012.01229v1
- Date: Wed, 2 Dec 2020 14:16:38 GMT
- Title: Learning to Characterize Matching Experts
- Authors: Roee Shraga, Ofra Amir, Avigdor Gal
- Abstract summary: We characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid.
We show that our approach can improve matching results by filtering out inexpert matchers.
- Score: 19.246576904646172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Matching is a task at the heart of any data integration process, aimed at
identifying correspondences among data elements. Matching problems were
traditionally solved in a semi-automatic manner, with correspondences being
generated by matching algorithms and outcomes subsequently validated by human
experts. Human-in-the-loop data integration has recently been challenged by the
introduction of big data, and recent studies have analyzed obstacles to
effective human matching and validation. In this work we characterize human
matching experts, those humans whose proposed correspondences can mostly be
trusted to be valid. We provide a novel framework for characterizing matching
experts that, accompanied by a novel set of features, can be used to identify
reliable and valuable human experts. We demonstrate the usefulness of our
approach using an extensive empirical evaluation. In particular, we show that
our approach can improve matching results by filtering out inexpert matchers.
Related papers
- Leveraging Mixture of Experts for Improved Speech Deepfake Detection [53.69740463004446]
Speech deepfakes pose a significant threat to personal security and content authenticity.
We introduce a novel approach for enhancing speech deepfake detection performance using a Mixture of Experts architecture.
arXiv Detail & Related papers (2024-09-24T13:24:03Z)
- Annotator in the Loop: A Case Study of In-Depth Rater Engagement to Create a Bridging Benchmark Dataset [1.825224193230824]
We describe a novel, collaborative, and iterative annotator-in-the-loop methodology for annotation.
Our findings indicate that collaborative engagement with annotators can enhance annotation methods.
arXiv Detail & Related papers (2024-08-01T19:11:08Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
- Personalized Decentralized Multi-Task Learning Over Dynamic Communication Graphs [59.96266198512243]
We propose a decentralized and federated learning algorithm for tasks that are positively and negatively correlated.
Our algorithm uses gradients to calculate the correlations among tasks automatically, and dynamically adjusts the communication graph to connect mutually beneficial tasks and isolate those that may negatively impact each other.
We conduct experiments on a synthetic Gaussian dataset and a large-scale celebrity attributes (CelebA) dataset.
arXiv Detail & Related papers (2022-12-21T18:58:24Z)
- A Unified Comparison of User Modeling Techniques for Predicting Data Interaction and Detecting Exploration Bias [17.518601254380275]
We compare and rank eight user modeling algorithms based on their performance on a diverse set of four user study datasets.
Based on our findings, we highlight open challenges and new directions for analyzing user interactions and visualization provenance.
arXiv Detail & Related papers (2022-08-09T19:51:10Z)
- PoWareMatch: a Quality-aware Deep Learning Approach to Improve Human Schema Matching [20.110234122423172]
We examine a novel angle on the behavior of humans as matchers, studying match creation as a process.
We design PoWareMatch that makes use of a deep learning mechanism to calibrate and filter human matching decisions.
PoWareMatch predicts well the benefit of extending the match with an additional correspondence and generates high quality matches.
arXiv Detail & Related papers (2021-09-15T14:24:56Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations.
We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
- Search Result Clustering in Collaborative Sound Collections [17.48516881308658]
We propose a graph-based approach using audio features for clustering diverse sound collections obtained when querying large online databases.
We show that using a confidence measure for discarding inconsistent clusters improves the quality of the partitions.
arXiv Detail & Related papers (2020-04-08T13:08:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.