User-friendly Comparison of Similarity Algorithms on Wikidata
- URL: http://arxiv.org/abs/2108.05410v1
- Date: Wed, 11 Aug 2021 18:59:25 GMT
- Title: User-friendly Comparison of Similarity Algorithms on Wikidata
- Authors: Filip Ilievski and Pedro Szekely and Gleb Satyukov and Amandeep Singh
- Abstract summary: We present a user-friendly interface that allows flexible computation of similarity between Qnodes in Wikidata.
At present, the similarity interface supports four algorithms, based on: graph embeddings (TransE, ComplEx), text embeddings (BERT) and class-based similarity.
We also provide a REST API that can compute most similar neighbors for any Qnode in Wikidata.
- Score: 2.8551587610394904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the similarity between two concept words has been evaluated and studied
for decades, much less attention has been devoted to algorithms that can
compute the similarity of nodes in very large knowledge graphs, like Wikidata.
To facilitate investigations and head-to-head comparisons of similarity
algorithms on Wikidata, we present a user-friendly interface that allows
flexible computation of similarity between Qnodes in Wikidata. At present, the
similarity interface supports four algorithms, based on: graph embeddings
(TransE, ComplEx), text embeddings (BERT), and class-based similarity. We
demonstrate the behavior of the algorithms on representative examples about
semantically similar, related, and entirely unrelated entity pairs. To support
anticipated applications that require efficient similarity computations, like
entity linking and recommendation, we also provide a REST API that can compute
most similar neighbors for any Qnode in Wikidata.
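As a rough illustration of how embedding-based similarity and nearest-neighbor lookup can work, here is a minimal sketch. It assumes cosine similarity as the comparison measure, which the abstract does not specify. The vectors, the placeholder labels (`Q_cat`, `Q_dog`, `Q_car`), and the function names are made up for this example and do not come from the paper's interface or API.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings, k=2):
    """Return the k entries most similar to `query`, as (label, score) pairs, best first."""
    query_vec = embeddings[query]
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in embeddings.items()
        if label != query  # skip the query node itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real graph or text embeddings have hundreds of dimensions.
toy_embeddings = {
    "Q_cat": [0.9, 0.1, 0.0],
    "Q_dog": [0.8, 0.2, 0.1],
    "Q_car": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for label, score in most_similar("Q_cat", toy_embeddings):
        print(f"{label}\t{score:.3f}")
```

A brute-force scan like this is only practical for small collections. At the scale of Wikidata, a service would more plausibly rely on an approximate nearest-neighbor index.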
Related papers
- SiReRAG: Indexing Similar and Related Information for Multihop Reasoning [96.60045548116584]
SiReRAG is a novel RAG indexing approach that explicitly considers both similar and related information.
SiReRAG consistently outperforms state-of-the-art indexing methods on three multihop datasets.
arXiv Detail & Related papers (2024-12-09T04:56:43Z)
- Measuring similarity between embedding spaces using induced neighborhood graphs [10.056989400384772]
We propose a metric to evaluate the similarity between paired item representations.
Our results show that accuracy in both analogy and zero-shot classification tasks correlates with the embedding similarity.
arXiv Detail & Related papers (2024-11-13T15:22:33Z)
- A general framework for distributed approximate similarity search with arbitrary distances [0.5030361857850012]
Similarity search is a central problem in domains such as information management and retrieval or data analysis.
Many similarity search algorithms are designed or specifically adapted to metric distances.
This paper presents GDASC, a general framework for distributed approximate similarity search that accepts arbitrary distances.
arXiv Detail & Related papers (2024-05-22T16:19:52Z)
- Synthetic Datasets for Program Similarity Research [39.91303506884272]
HELIX is a framework for generating large, synthetic program similarity datasets.
Blind HELIX is a tool built on top of HELIX for extracting HELIX components from library code automatically using program slicing.
arXiv Detail & Related papers (2024-05-06T13:52:02Z)
- Comparing Personalized Relevance Algorithms for Directed Graphs [0.34952465649465553]
We present an interactive Web platform that, given a directed graph, allows identifying the most relevant nodes related to a given query node.
We provide 50 pre-loaded datasets from Wikipedia, Twitter, and Amazon and seven algorithms.
Our tool helps to uncover hidden relationships within the data, which makes it a valuable addition to the repertoire of graph analysis algorithms.
arXiv Detail & Related papers (2024-05-03T17:24:08Z)
- Attributable Visual Similarity Learning [90.69718495533144]
This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images.
Motivated by the human semantic similarity cognition, we propose a generalized similarity learning paradigm to represent the similarity between two images with a graph.
Experiments on the CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate significant improvements over existing deep similarity learning methods.
arXiv Detail & Related papers (2022-03-28T17:35:31Z)
- Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking [76.00737707718795]
We propose a novel candidate retrieval paradigm based on entity profiling.
We use the profile to query the indexed search engine to retrieve candidate entities.
Our approach complements the traditional approach of using a Wikipedia anchor-text dictionary.
arXiv Detail & Related papers (2022-02-27T17:38:53Z)
- Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity [0.0]
We discuss a framework for calculating similarities between a given input query and a set of predefined questions, in order to retrieve the question that matches the query best.
We have used it for the financial domain, but the framework is generalized for any domain-specific search engine and can be used in other domains as well.
arXiv Detail & Related papers (2021-01-07T03:27:32Z)
- Graph Structured Network for Image-Text Matching [127.68148793548116]
We present a novel Graph Structured Matching Network to learn fine-grained correspondence.
The GSMN explicitly models object, relation and attribute as a structured phrase.
Experiments show that GSMN outperforms state-of-the-art methods on benchmarks.
arXiv Detail & Related papers (2020-04-01T08:20:42Z)
- LSF-Join: Locality Sensitive Filtering for Distributed All-Pairs Set Similarity Under Skew [58.21885402826496]
All-pairs set similarity is a widely used data mining task, even for large and high-dimensional datasets.
We present a new distributed algorithm, LSF-Join, for approximate all-pairs set similarity.
We show that LSF-Join efficiently finds most close pairs, even for small similarity thresholds and for skewed input sets.
arXiv Detail & Related papers (2020-03-06T00:06:20Z)
- Learning Attentive Pairwise Interaction for Fine-Grained Classification [53.66543841939087]
We propose a simple but effective Attentive Pairwise Interaction Network (API-Net) for fine-grained classification.
API-Net first learns a mutual feature vector to capture semantic differences in the input pair.
It then compares this mutual vector with individual vectors to generate gates for each input image.
We conduct extensive experiments on five popular benchmarks in fine-grained classification.
arXiv Detail & Related papers (2020-02-24T12:17:56Z)
- BasConv: Aggregating Heterogeneous Interactions for Basket Recommendation with Graph Convolutional Neural Network [64.73281115977576]
Within-basket recommendation reduces the exploration time of users, where the user's intention of the basket matters.
We propose a new framework named BasConv, which is based on the graph convolutional neural network.
Our BasConv model has three types of aggregators specifically designed for three types of nodes.
arXiv Detail & Related papers (2020-01-14T16:27:32Z)