Optimal Oblivious Algorithms for Multi-way Joins
- URL: http://arxiv.org/abs/2501.04216v2
- Date: Thu, 09 Jan 2025 03:02:31 GMT
- Title: Optimal Oblivious Algorithms for Multi-way Joins
- Authors: Xiao Hu, Zhiang Wu,
- Abstract summary: We propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions.
Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor.
- Score: 2.8151472703172398
- License:
- Abstract: In cloud databases, cloud computation over sensitive data uploaded by clients inevitably causes concern about data security and privacy. Even when encryption primitives and trusted computing environments are integrated into query processing to safeguard the actual contents of the data, access patterns of algorithms can still leak private information about the data. Oblivious Random Access Memory (ORAM) and circuits are two generic approaches to address this issue, ensuring that access patterns of algorithms remain oblivious to the data. However, deploying these methods on insecure algorithms, particularly for multi-way join processing, is computationally expensive and inherently challenging. In this paper, we propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions. Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor. Furthermore, it is cache-agnostic, with cache complexity matching the insecure lower bound, also up to a logarithmic factor. This clean and straightforward approach has the potential to be extended to other security settings and implemented in practical database systems.
Related papers
- Sublinear-Overhead Secure Linear Algebra on a Dishonest Server [3.8105803634609483]
We state the natural efficiency and security desiderata for fast, remote, and data-oblivious linear algebra.
We conjecture the existence of matrix and vector families implying satisfactory algorithms, and provide such an algorithm contingent on common cryptographic assumptions.
arXiv Detail & Related papers (2025-02-18T17:05:17Z) - Scalable Private Partition Selection via Adaptive Weighting [66.09199304818928]
In a private set union, users hold subsets of items from an unbounded universe.
The goal is to output as many items as possible from the union of the users' sets while maintaining user-level differential privacy.
We propose an algorithm for this problem, MaximumDegree (MAD), which adaptively reroutes weight from items with weight far above the threshold needed for privacy to items with smaller weight.
arXiv Detail & Related papers (2025-02-13T01:27:11Z) - Encrypted system identification as-a-service via reliable encrypted matrix inversion [0.0]
Encrypted computation opens up promising avenues across a plethora of application domains.
In particular, Arithmetic homomorphic encryption is a natural fit for cloud-based computational services.
This paper presents an encrypted system identification service enabled by a reliable encrypted solution to at least squares problems.
arXiv Detail & Related papers (2024-10-27T20:00:04Z) - Encoding of data sets and algorithms [0.0]
In many high-impact applications, it is important to ensure the quality of output of a machine learning algorithm.
We have initiated a mathematically rigorous theory to decide which models are close to each other in terms of certain metrics.
A given threshold metric acting on this grid will express the nearness (or statistical distance) from each algorithm and data set of interest to any given application.
arXiv Detail & Related papers (2023-03-02T05:29:27Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - New Oracle-Efficient Algorithms for Private Synthetic Data Release [52.33506193761153]
We present three new algorithms for constructing differentially private synthetic data.
The algorithms satisfy differential privacy even in the worst case.
Compared to the state-of-the-art method High-Dimensional Matrix Mechanism citeMcKennaMHM18, our algorithms provide better accuracy in the large workload.
arXiv Detail & Related papers (2020-07-10T15:46:05Z) - Run2Survive: A Decision-theoretic Approach to Algorithm Selection based
on Survival Analysis [75.64261155172856]
survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z) - Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) is receiving more and more attention recently for its capability to do computations over the encrypted field.
We propose a novel general distributed HE-based data mining framework towards one step of solving the scaling problem.
We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z) - Safeguarded Learned Convex Optimization [106.81731132086851]
Analytic optimization algorithms can be hand-designed to provably solve problems in an iterative fashion.
Data-driven algorithms can "learn to optimize" (L2O) with much fewer iterations and similar cost per iteration as general-purpose optimization algorithms.
We present a Safe-L2O framework to fuse the advantages of these approaches.
arXiv Detail & Related papers (2020-03-04T04:01:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.