Related papers: An Empirical Evaluation of Cost-based Federated SPARQL Query Processing Engines

URL: http://arxiv.org/abs/2104.00984v1
Date: Fri, 2 Apr 2021 11:01:25 GMT
Title: An Empirical Evaluation of Cost-based Federated SPARQL Query Processing Engines
Authors: Umair Qudus, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo, Young-koo Lee
Abstract summary: We present novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query engines. We evaluate five cost-based federated SPARQL query engines using existing as well as novel evaluation metrics by using LargeRDFBench queries.
Score: 4.760079434948197
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Finding a good query plan is key to the optimization of query runtime. This holds in particular for cost-based federation engines, which make use of cardinality estimations to achieve this goal. A number of studies compare SPARQL federation engines across different performance metrics, including query runtime, result set completeness and correctness, number of sources selected and number of requests sent. Albeit informative, these metrics are generic and unable to quantify and evaluate the accuracy of the cardinality estimators of cost-based federation engines. To thoroughly evaluate cost-based federation engines, the effect of estimated cardinality errors on the overall query runtime performance must be measured. In this paper, we address this challenge by presenting novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query engines. We evaluate five cost-based federated SPARQL query engines using existing as well as novel evaluation metrics by using LargeRDFBench queries. Our results provide a detailed analysis of the experimental outcomes that reveal novel insights, useful for the development of future cost-based federated SPARQL query processing engines.

Related papers

ACE: A Cardinality Estimator for Set-Valued Queries [35.31790118566289]
We propose an Attention-based Cardinality Estimator for estimating the cardinality of queries over set-valued data. To handle variable-sized queries, a pooling module is introduced, followed by a regression model (MLP) to generate final cardinality estimates.
arXiv Detail & Related papers (2025-03-19T06:29:15Z)
CONCERTO: Complex Query Execution Mechanism-Aware Learned Cost Estimation [8.024724736461328]
This paper proposes CONCERTO, a Complex query executiON meChanism-awarE leaRned cosT estimatiOn method. CONCERTO first establishes independent resource cost models for each physical operator. It then constructs a Directed Acyclic Graph (DAG) consisting of a dataflow tree backbone and resource competition relationships among concurrent operators.
arXiv Detail & Related papers (2024-12-01T09:58:54Z)
CROSS-JEM: Accurate and Efficient Cross-encoders for Short-text Ranking Tasks [12.045202648316678]
Transformer-based ranking models are the state-of-the-art approaches for short-text ranking tasks. We propose Cross-encoders with Joint Efficient Modeling (CROSS-JEM). CROSS-JEM enables transformer-based models to jointly score multiple items for a query. It achieves state-of-the-art accuracy and over 4x lower ranking latency than standard cross-encoders.
arXiv Detail & Related papers (2024-09-15T17:05:35Z)
UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
Budget-aware Query Tuning: An AutoML Perspective [14.561951257365953]
Modern database systems rely on cost-based query optimizers to come up with good execution plans for input queries. We show that by varying the cost-unit values one can obtain query plans that significantly outperform the default query plans.
arXiv Detail & Related papers (2024-03-29T20:19:36Z)
Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model [3.0784574277021406]
We propose a holistic framework that enables robust query optimization based on a risk-aware learning approach. Roq includes a novel formalization of the notion of robustness in the context of query optimization. We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
arXiv Detail & Related papers (2024-01-26T21:16:37Z)
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost. We present JoinGym, a query optimization environment for bushy reinforcement learning (RL). Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM). The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy. We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z)
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing. LLMs are extremely computationally expensive, even at inference time. We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
NeuralSearchX: Serving a Multi-billion-parameter Reranker for Multilingual Metasearch at a Low Cost [4.186775801993103]
We describe NeuralSearchX, a metasearch engine based on a multi-purpose large reranking model to merge results and highlight sentences. We show that our design choices led to a much more cost-effective system with competitive QPS while having close to state-of-the-art results on a wide range of public benchmarks.
arXiv Detail & Related papers (2022-10-26T16:36:53Z)
Learning GraphQL Query Costs (Extended Version) [7.899264246319001]
We propose a machine-learning approach to efficiently and accurately estimate the query cost. Our framework is efficient and predicts query costs with high accuracy, consistently outperforming the static analysis by a large margin.
arXiv Detail & Related papers (2021-08-25T09:18:31Z)
Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts. Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
Query Focused Multi-Document Summarization with Distant Supervision [88.39032981994535]
Existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments. We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query. We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks.
arXiv Detail & Related papers (2020-04-06T22:35:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.