Private Aggregate Queries to Untrusted Databases
- URL: http://arxiv.org/abs/2403.13296v1
- Date: Wed, 20 Mar 2024 04:35:21 GMT
- Title: Private Aggregate Queries to Untrusted Databases
- Authors: Syed Mahbub Hafiz, Chitrabhanu Gupta, Warren Wnuck, Brijesh Vora, Chen-Nee Chuah,
- Abstract summary: Private information retrieval (PIR) is a privacy-preserving cryptographic tool.
Most PIR protocols require the client to know the exact row index of the intended database item.
We build a novel information-theoretic PIR framework that permits a user to fetch the aggregated result.
- Score: 3.6209090009720155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Private information retrieval (PIR), a privacy-preserving cryptographic tool, solves a simplified version of this problem by hiding the database item that a client accesses. Most PIR protocols require the client to know the exact row index of the intended database item, which cannot support the complicated aggregation-based statistical query in a similar setting. Some works in the PIR space contain keyword searching and SQL-like queries, but most need multiple interactions between the PIR client and PIR servers. Some schemes support searching SQL-like expressive queries in a single round but fail to enable aggregate queries. These schemes are the main focus of this paper. To bridge the gap, we have built a general-purpose novel information-theoretic PIR (IT-PIR) framework that permits a user to fetch the aggregated result, hiding all sensitive sections of the complex query from the hosting PIR server in a single round of interaction. In other words, the server will not know which records contribute to the aggregation. We then evaluate the feasibility of our protocol for both benchmarking and real-world application settings. For instance, in a complex aggregate query to the Twitter microblogging database of 1 million tweets, our protocol takes 0.014 seconds for a PIR server to generate the result when the user is interested in one of 3K user handles. In contrast, for a much-simplified task, not an aggregate but a positional query, Goldberg's regular IT-PIR (Oakland 2007) takes 1.13 seconds. For all possible user handles, 300K, it takes equal time compared to the regular IT-PIR. This example shows that complicated aggregate queries through our framework do not incur additional overhead if not less, compared to the conventional query.
Related papers
- Compliance Brain Assistant: Conversational Agentic AI for Assisting Compliance Tasks in Enterprise Environments [2.8724171056550256]
Compliance Brain Assistant (CBA) is a conversational, agentic AI assistant designed to boost the efficiency of daily compliance tasks for personnel in enterprise environments.<n>To strike a good balance between response quality and latency, we design a user query router that can intelligently choose between FastTrack mode and FullAgentic mode.
arXiv Detail & Related papers (2025-07-23T07:51:10Z) - QueryBuilder: Human-in-the-Loop Query Development for Information Retrieval [12.543590253664492]
We present a novel, interactive system called $textitQueryBuilder$.
It allows a novice, English-speaking user to create queries with a small amount of effort.
It rapidly develops cross-lingual information retrieval queries corresponding to the user's information needs.
arXiv Detail & Related papers (2024-09-07T00:46:58Z) - Scalable Private Search with Wally [27.387418704955994]
Wally is a private search system that supports efficient semantic and keyword search queries against large databases.
In previous private search systems, for each client query, the server must perform at least one expensive cryptographic operation per database entry.
We achieve these results by requiring each client to add a few fake queries and send each query via an anonymous network to the server.
arXiv Detail & Related papers (2024-06-10T19:41:25Z) - Context-Aware Query Rewriting for Improving Users' Search Experience on
E-commerce Websites [47.04727122209316]
E-commerce queries are often short and ambiguous.
Users tend to enter multiple searches, which we call context, before purchasing.
We propose an end-to-end context-aware query rewriting model.
arXiv Detail & Related papers (2022-09-15T19:46:01Z) - Query2Particles: Knowledge Graph Reasoning with Particle Embeddings [49.64006979045662]
We propose a query embedding method to answer complex logical queries on knowledge graphs with missing edges.
The answer entities are selected according to the similarities between the entity embeddings and the query embedding.
A complex KG query answering method, Q2P, is proposed to retrieve diverse answers from different areas over the embedding space.
arXiv Detail & Related papers (2022-04-27T11:16:08Z) - Graph Enhanced BERT for Query Understanding [55.90334539898102]
query understanding plays a key role in exploring users' search intents and facilitating users to locate their most desired information.
In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks.
We propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph.
arXiv Detail & Related papers (2022-04-03T16:50:30Z) - Parallel Instance Query Network for Named Entity Recognition [73.30174490672647]
Named entity recognition (NER) is a fundamental task in natural language processing.
Recent works treat named entity recognition as a reading comprehension task, constructing type-specific queries manually to extract entities.
We propose Parallel Instance Query Network (PIQN), which sets up global and learnable instance queries to extract entities in a parallel manner.
arXiv Detail & Related papers (2022-03-20T13:01:25Z) - How to Query An Oracle? Efficient Strategies to Label Data [59.89900843097016]
We consider the basic problem of querying an expert oracle for labeling a dataset in machine learning.
We present a randomized batch algorithm that operates on a round-by-round basis to label the samples and achieves a query rate of $O(fracNk2)$.
In addition, we present an adaptive greedy query scheme, which achieves an average rate of $approx 0.2N$ queries per sample with triplet queries.
arXiv Detail & Related papers (2021-10-05T20:15:35Z) - Counting Query Answers over a DL-Lite Knowledge Base (extended version) [14.504450881786214]
We study the data complexity of query answering over a Knowledge Base (KB)
We improve upon existing results by providing a PTIME and coNP lower bounds, and upper bounds in PTIME and LOGSPACE.
arXiv Detail & Related papers (2020-05-12T16:01:09Z) - Query2box: Reasoning over Knowledge Graphs in Vector Space using Box
Embeddings [84.0206612938464]
query2box is an embedding-based framework for reasoning over arbitrary queries on incomplete knowledge graphs.
We show that query2box is capable of handling arbitrary logical queries with $wedge$, $vee$, $exists$ in a scalable manner.
We demonstrate the effectiveness of query2box on three large KGs and show that query2box achieves up to 25% relative improvement over the state of the art.
arXiv Detail & Related papers (2020-02-14T11:20:10Z) - Don't Parse, Generate! A Sequence to Sequence Architecture for
Task-Oriented Semantic Parsing [0.0]
Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by its users.
We propose a unified architecture based on Sequence to Sequence models and Pointer Generator Network to handle both simple and complex queries.
arXiv Detail & Related papers (2020-01-30T17:11:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.