Gradient-Based Join Ordering
- URL: http://arxiv.org/abs/2511.14482v1
- Date: Tue, 18 Nov 2025 13:24:28 GMT
- Title: Gradient-Based Join Ordering
- Authors: Tim Schwabe, Maribel Acosta,
- Abstract summary: Join ordering is the problem of selecting the most efficient sequence in which to evaluate joins in a database query.<n>Traditional approaches cast this problem as a discrete runtime search over binary trees guided by a cost model.<n>We show that, when the cost model is differentiable, the query plans can be continuously relaxed into a soft adjacency matrix.<n>Using a learned Graph Neural Network as the cost model, we demonstrate that this gradient-based approach can find comparable and even lower-cost plans.
- Score: 0.532836690371986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Join ordering is the NP-hard problem of selecting the most efficient sequence in which to evaluate joins (conjunctive, binary operators) in a database query. As the performance of query execution critically depends on this choice, join ordering lies at the core of query optimization. Traditional approaches cast this problem as a discrete combinatorial search over binary trees guided by a cost model, but they often suffer from high computational complexity and limited scalability. We show that, when the cost model is differentiable, the query plans can be continuously relaxed into a soft adjacency matrix representing a superposition of plans. This continuous relaxation, together with a Gumbel-Softmax parameterization of the adjacency matrix and differentiable constraints enforcing plan validity, enables gradient-based search for plans within this relaxed space. Using a learned Graph Neural Network as the cost model, we demonstrate that this gradient-based approach can find comparable and even lower-cost plans compared to traditional discrete local search methods on two different graph datasets. Furthermore, we empirically show that the runtime of this approach scales linearly with query size, in contrast to quadratic or exponential runtimes of classical approaches. We believe this first step towards gradient-based join ordering can lead to more effective and efficient query optimizers in the future.
Related papers
- Sketched Sum-Product Networks for Joins [0.8999666725996978]
We propose Sum-Product Networks to dynamically approximate sketches on-the-fly.<n>We implement the Fast-AGMS and Bound Sketch methods, which have successfully been used in prior work.
arXiv Detail & Related papers (2025-06-16T22:13:48Z) - GenJoin: Conditional Generative Plan-to-Plan Query Optimizer that Learns from Subplan Hints [1.3108652488669732]
We present GenJoin, a novel learned query that considers the query optimization problem as a symbiotic generative task.<n>GenJoin is the first learned query that significantly and consistently outperforms as well as state-of-the-art methods on two well-known real-world benchmarks.
arXiv Detail & Related papers (2024-11-07T08:31:01Z) - Towards Fast Algorithms for the Preference Consistency Problem Based on Hierarchical Models [4.007697401483925]
We construct and compare algorithmic approaches to solve the Consistency Problem for preference statements based on hierarchical models.
An instance is consistent if there exists an hierarchical model on the evaluation functions that induces an order relation on the alternatives.
We develop three approaches to solve this decision problem.
arXiv Detail & Related papers (2024-10-31T13:48:46Z) - A Benchmark for Maximum Cut: Towards Standardization of the Evaluation of Learned Heuristics for Combinatorial Optimization [12.016449555335976]
We propose an open-source benchmark suite MaxCut-Bench dedicated to the NP-hard Maximum Cut problem in both its weighted and unweighted variants.
We use the benchmark in an attempt to systematically corroborate or reproduce the results of several, popular learning-based approaches.
Our results show that several of the learneds fail to outperform a naive greedy algorithm, and that only one of them consistently outperforms Tabu Search.
arXiv Detail & Related papers (2024-06-14T19:44:23Z) - JoinGym: An Efficient Query Optimization Environment for Reinforcement
Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for bushy reinforcement learning (RL)
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z) - Decision Diagram-Based Branch-and-Bound with Caching for Dominance and
Suboptimality Detection [9.175779296469194]
This paper presents new ingredients to speed up the search by exploiting the structure of dynamic programming models.
The key idea is to prevent the repeated expansion of nodes corresponding to the same dynamic programming states by querying expansion thresholds cached throughout the search.
Experiments show that the pruning brought by this caching mechanism allows significantly reducing the number of nodes expanded by the algorithm.
arXiv Detail & Related papers (2022-11-22T10:18:33Z) - Reinforcement Learning Based Query Vertex Ordering Model for Subgraph
Matching [58.39970828272366]
Subgraph matching algorithms enumerate all is embeddings of a query graph in a data graph G.
matching order plays a critical role in time efficiency of these backtracking based subgraph matching algorithms.
In this paper, for the first time we apply the Reinforcement Learning (RL) and Graph Neural Networks (GNNs) techniques to generate the high-quality matching order for subgraph matching algorithms.
arXiv Detail & Related papers (2022-01-25T00:10:03Z) - Deep Probabilistic Graph Matching [72.6690550634166]
We propose a deep learning-based graph matching framework that works for the original QAP without compromising on the matching constraints.
The proposed method is evaluated on three popularly tested benchmarks (Pascal VOC, Willow Object and SPair-71k) and it outperforms all previous state-of-the-arts on all benchmarks.
arXiv Detail & Related papers (2022-01-05T13:37:27Z) - Multidimensional Assignment Problem for multipartite entity resolution [69.48568967931608]
Multipartite entity resolution aims at integrating records from multiple datasets into one entity.
We apply two procedures, a Greedy algorithm and a large scale neighborhood search, to solve the assignment problem.
We find evidence that design-based multi-start can be more efficient as the size of databases grow large.
arXiv Detail & Related papers (2021-12-06T20:34:55Z) - A Bi-Level Framework for Learning to Solve Combinatorial Optimization on
Graphs [91.07247251502564]
We propose a hybrid approach to combine the best of the two worlds, in which a bi-level framework is developed with an upper-level learning method to optimize the graph.
Such a bi-level approach simplifies the learning on the original hard CO and can effectively mitigate the demand for model capacity.
arXiv Detail & Related papers (2021-06-09T09:18:18Z) - Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.