Related papers: PoWareMatch: a Quality-aware Deep Learning Approach to Improve Human Schema Matching

URL: http://arxiv.org/abs/2109.07321v1
Date: Wed, 15 Sep 2021 14:24:56 GMT
Title: PoWareMatch: a Quality-aware Deep Learning Approach to Improve Human Schema Matching
Authors: Roee Shraga, Avigdor Gal
Abstract summary: We examine a novel angle on the behavior of humans as matchers, studying match creation as a process. We design PoWareMatch that makes use of a deep learning mechanism to calibrate and filter human matching decisions. PoWareMatch predicts well the benefit of extending the match with an additional correspondence and generates high quality matches.
Score: 20.110234122423172
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Schema matching is a core task of any data integration process. Although it has been investigated in the fields of databases, AI, Semantic Web and data mining for many years, the main challenge remains the ability to generate quality matches among data concepts (e.g., database attributes). In this work, we examine a novel angle on the behavior of humans as matchers, studying match creation as a process. We analyze the dynamics of common evaluation measures (precision, recall, and f-measure) with respect to this angle and highlight the need for unbiased matching to support this analysis. Unbiased matching, a newly defined concept that describes the common assumption that human decisions represent reliable assessments of schemata correspondences, is, however, not an inherent property of human matchers. In what follows, we design PoWareMatch, which makes use of a deep learning mechanism to calibrate and filter human matching decisions adhering to the quality of a match; these decisions are then combined with algorithmic matching to generate better match results. We provide empirical evidence, based on an experiment with more than 200 human matchers over common benchmarks, that PoWareMatch predicts well the benefit of extending the match with an additional correspondence and generates high quality matches. In addition, PoWareMatch outperforms state-of-the-art matching algorithms.
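
The abstract refers to the dynamics of precision, recall, and f-measure as a match is built one correspondence at a time. The following is a minimal sketch of that idea using the standard set-based definitions of the three measures. It is not the PoWareMatch method itself; the names `evaluate`, `trace_quality`, `REFERENCE`, and `DECISIONS`, as well as the toy attribute pairs, are illustrative assumptions that do not come from the paper.

```python
from typing import Iterable, List, Set, Tuple

# A correspondence pairs an attribute of one schema with an attribute of another.
Correspondence = Tuple[str, str]
Scores = Tuple[float, float, float]  # (precision, recall, f-measure)


def evaluate(match: Set[Correspondence], reference: Set[Correspondence]) -> Scores:
    """Score a match against a reference match with the standard definitions."""
    true_positives = len(match & reference)
    precision = true_positives / len(match) if match else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def trace_quality(decisions: Iterable[Correspondence],
                  reference: Set[Correspondence]) -> List[Scores]:
    """Re-evaluate the match after each added correspondence, in decision order."""
    match: Set[Correspondence] = set()
    history: List[Scores] = []
    for correspondence in decisions:
        match.add(correspondence)
        history.append(evaluate(match, reference))
    return history


# Hypothetical toy data: a two-pair reference match and three human decisions,
# the second of which is incorrect.
REFERENCE: Set[Correspondence] = {("name", "full_name"), ("zip", "postal_code")}
DECISIONS: List[Correspondence] = [
    ("name", "full_name"),
    ("name", "postal_code"),
    ("zip", "postal_code"),
]

if __name__ == "__main__":
    for step, (p, r, f) in enumerate(trace_quality(DECISIONS, REFERENCE), start=1):
        print(f"step {step}: precision={p:.2f} recall={r:.2f} f-measure={f:.2f}")
```

In this toy trace the incorrect second decision lowers precision while leaving recall unchanged, which illustrates why the benefit of extending a match with one more correspondence depends on whether that correspondence is correct.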

Related papers

CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming [56.17331530444765]
CPRet is a retrieval-oriented benchmark suite for competitive programming. It covers four retrieval tasks: two code-centric (i.e., Text-to-Code and Code-to-Code) and two newly proposed problem-centric tasks (i.e., Problem-to-Duplicate and Simplified-to-Full). Our contribution includes both high-quality training data and temporally separated test sets for reliable evaluation.
arXiv Detail & Related papers (2025-05-19T10:07:51Z)
A Best-of-Both Approach to Improve Match Predictions and Reciprocal Recommendations for Job Search [15.585641615174623]
This paper introduces and demonstrates a novel and practical solution to improve reciprocal recommendations in production by leveraging pseudo-match scores. Specifically, our approach generates dense and more directly relevant pseudo-match scores by combining the true match labels with relatively inaccurate but dense match predictions. Our method can be seen as a best-of-both (BoB) approach, as it combines the high-level ideas of both direct match prediction and the two separate models approach.
arXiv Detail & Related papers (2024-09-17T08:51:02Z)
Semisupervised score based matching algorithm to evaluate the effect of public health interventions [3.221788913179251]
In one-to-one matching algorithms, a large number of "pairs" to be matched could mean both the information from a large sample and a large number of tasks. We propose a novel one-to-one matching algorithm based on a quadratic score function $S_\beta(x_i,x_j) = \beta^T (x_i-x_j)(x_i-x_j)^T \beta$.
arXiv Detail & Related papers (2024-03-19T02:24:16Z)
A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper. Our dataset consists of 477 self-reported expertise scores provided by 58 researchers. For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
End-to-End Context-Aided Unicity Matching for Person Re-identification [100.02321122258638]
We propose an end-to-end person unicity matching architecture for learning and refining the person matching relations. We use the samples' global context relationship to refine the soft matching results and reach the matching unicity through bipartite graph matching. Given full consideration to real-world person re-identification applications, we achieve the unicity matching in both one-shot and multi-shot settings.
arXiv Detail & Related papers (2022-10-20T07:33:57Z)
Deep Probabilistic Graph Matching [72.6690550634166]
We propose a deep learning-based graph matching framework that works for the original QAP without compromising on the matching constraints. The proposed method is evaluated on three popularly tested benchmarks (Pascal VOC, Willow Object and SPair-71k) and it outperforms all previous state-of-the-art methods on all benchmarks.
arXiv Detail & Related papers (2022-01-05T13:37:27Z)
Deep Policies for Online Bipartite Matching: A Reinforcement Learning Approach [5.683591363967851]
We present an end-to-end Reinforcement Learning framework for deriving better matching policies based on trial-and-error on historical data. We show that most of the learning approaches perform significantly better than classical greedy algorithms on four synthetic and real-world datasets.
arXiv Detail & Related papers (2021-09-21T18:04:19Z)
Learning to Characterize Matching Experts [19.246576904646172]
We characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid. We show that our approach can improve matching results by filtering out inexpert matchers.
arXiv Detail & Related papers (2020-12-02T14:16:38Z)
Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network [83.64416937454801]
Job-resume interaction data is sparse and noisy, which affects the performance of job-resume match algorithms. We propose a novel multi-view co-teaching network from sparse interaction data for job-resume matching. Our model is able to outperform state-of-the-art methods for job-resume matching.
arXiv Detail & Related papers (2020-09-25T03:09:54Z)
Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data. There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups. We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
Learning Preference-Based Similarities from Face Images using Siamese Multi-Task CNNs [78.24964622317633]
A key challenge for online dating platforms is to determine suitable matches for their users. Deep learning approaches have shown that a variety of properties can be predicted from human faces to some degree. We investigate the feasibility of bridging image-based matching and matching with personal interests, preferences, and attitude.
arXiv Detail & Related papers (2020-01-25T23:08:12Z)
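
The quadratic score function quoted in the semisupervised score-based matching entry above, with beta as the weight vector and x_i, x_j as covariate vectors, reduces algebraically to the square of the projection of x_i - x_j onto beta. The sketch below only evaluates that expression. How beta is estimated and how the score drives the matching are not described in this listing, so nothing here should be read as that paper's algorithm; the function name `quadratic_score` and the sample vectors are illustrative assumptions.

```python
from typing import Sequence


def quadratic_score(beta: Sequence[float],
                    x_i: Sequence[float],
                    x_j: Sequence[float]) -> float:
    """Evaluate beta^T (x_i - x_j)(x_i - x_j)^T beta.

    The expression equals (beta^T (x_i - x_j))^2, so a single dot product
    followed by squaring is enough; no outer product is materialized.
    """
    if not (len(beta) == len(x_i) == len(x_j)):
        raise ValueError("beta, x_i and x_j must have the same length")
    projection = sum(b * (a - c) for b, a, c in zip(beta, x_i, x_j))
    return projection ** 2


if __name__ == "__main__":
    # Hypothetical weights and covariate vectors, chosen only for illustration.
    beta = [1.0, 2.0]
    print(quadratic_score(beta, [3.0, 1.0], [1.0, 0.0]))
```

The score is non-negative, symmetric in its two arguments, and zero when the two vectors coincide.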

This list is automatically generated from the titles and abstracts of the papers on this site.