VeriFuzzy: A Dynamic Verifiable Fuzzy Search Service for Encrypted Cloud Data
- URL: http://arxiv.org/abs/2507.10927v2
- Date: Sun, 28 Sep 2025 15:45:47 GMT
- Title: VeriFuzzy: A Dynamic Verifiable Fuzzy Search Service for Encrypted Cloud Data
- Authors: Jie Zhang, Xiaohong Li, Man Zheng, Ruitao Feng, Shanshan Xu, Zhe Hou, Guangdong Bai
- Abstract summary: Designing a service that supports dynamic, verifiable fuzzy search (DVFS) over encrypted cloud data remains a fundamental challenge. This paper presents VeriFuzzy, a novel DVFS service framework that cohesively integrates three innovations. Our code and dataset are now open source, in the hope of inspiring future DVFS research.
- Score: 13.863905835870836
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Enabling search over encrypted cloud data is essential for privacy-preserving data outsourcing. While searchable encryption has evolved to support individual requirements like fuzzy matching, dynamic updates, and result verification, designing a service that supports dynamic, verifiable fuzzy search (DVFS) over encrypted cloud data remains a fundamental challenge due to inherent conflicts between underlying technologies. Existing approaches struggle with simultaneously achieving efficiency, functionality, and security, often forcing impractical trade-offs. This paper presents VeriFuzzy, a novel DVFS service framework that cohesively integrates three innovations: an Enhanced Virtual Binary Tree (EVBTree) that decouples fuzzy semantics from index logic to support $O(\log n)$ search/updates; a blockchain-reconstructed verification mechanism that ensures result integrity with logarithmic complexity; and a dual-repository state management scheme that achieves IND-CKA2 security by neutralizing branch leakage. Extensive evaluation on 3,500+ documents shows VeriFuzzy achieves 41% faster search, $5\times$ more efficient verification, and constant-time index updates compared to state-of-the-art alternatives. Our code and dataset are now open source, in the hope of inspiring future DVFS research.
Related papers
- FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents [53.03492387564392]
We introduce FS-Researcher, a file-system-based framework that scales deep research beyond the context window via a persistent workspace.
A Context Builder agent browses the internet, writes structured notes, and archives raw sources into a hierarchical knowledge base that can grow far beyond context length.
A Report Writer agent then composes the final report section by section, treating the knowledge base as the source of facts.
arXiv Detail & Related papers (2026-02-02T03:00:19Z)
- Towards Privacy-Preserving Range Queries with Secure Learned Spatial Index over Encrypted Data [8.495233108444202]
We propose a novel privacy-preserving range query scheme over encrypted datasets.
SLS-INDEX integrates the Paillier cryptosystem with a hierarchical prediction architecture and noise-injected buckets.
SLRQ significantly outperforms existing solutions in query efficiency while ensuring dataset, query, result, and access pattern privacy.
arXiv Detail & Related papers (2025-12-03T10:59:40Z)
- ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search [69.60882125603133]
We present ReliabilityRAG, a framework for adversarial robustness that explicitly leverages reliability information of retrieved documents.
Our work is a significant step towards more effective, provably robust defenses against retrieved corpus corruption in RAG.
arXiv Detail & Related papers (2025-09-27T22:36:42Z)
- Privacy-Preserving Anonymization of System and Network Event Logs Using Salt-Based Hashing and Temporal Noise [5.85293491327449]
Event logs contain Personally Identifiable Information (PII).
Overly aggressive anonymization can destroy contextual integrity, while weak techniques risk re-identification through linkage or inference attacks.
This paper introduces novel field-specific anonymization methods that address this trade-off.
arXiv Detail & Related papers (2025-07-29T15:16:42Z)
- Threshold-Protected Searchable Sharing: Privacy Preserving Aggregated-ANN Search for Collaborative RAG [0.0]
Two key bottlenecks are private data repositories' locality constraints and the need to maintain compatibility with mainstream search techniques.
We develop a secure and privacy-preserving aggregated approximate nearest neighbor search (SP-A$^2$NN) with HNSW compatibility.
We also explore a novel security analytical framework that incorporates privacy analysis via reductions.
arXiv Detail & Related papers (2025-07-23T04:45:01Z)
- AI-Based Vulnerability Analysis of NFT Smart Contracts [6.378351117969227]
This study proposes an AI-driven approach to detect vulnerabilities in NFT smart contracts.
We collected 16,527 public smart contract codes, classifying them into five vulnerability categories: Risky Mutable Proxy, ERC-721 Reentrancy, Unlimited Minting, Missing Requirements, and Public Burn.
A random forest model was implemented to improve robustness through random data/feature sampling and multitree integration.
arXiv Detail & Related papers (2025-04-18T08:55:31Z)
- Verifiable, Efficient and Confidentiality-Preserving Graph Search with Transparency [16.64649629947436]
PeGraph is the latest scheme achieving encrypted search over social graphs to address privacy leakage.
However, it does not provide transparent search capabilities, and it suffers from expensive computation and result pattern leakage.
We propose SecGraph, which adopts a novel system architecture, to address the first two limitations.
arXiv Detail & Related papers (2025-03-13T08:53:53Z)
- Optimal Oblivious Algorithms for Multi-way Joins [2.8151472703172398]
We propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions.
Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor.
arXiv Detail & Related papers (2025-01-08T01:23:29Z)
- InputSnatch: Stealing Input in LLM Services via Timing Side-Channel Attacks [9.748438507132207]
Large language models (LLMs) possess extensive knowledge and question-answering capabilities.
Cache-sharing methods are commonly employed to enhance efficiency by reusing cached states or responses for the same or similar inference requests.
We propose a novel timing-based side-channel attack to execute input theft in LLM inference.
arXiv Detail & Related papers (2024-11-27T10:14:38Z)
- HOPE: Homomorphic Order-Preserving Encryption for Outsourced Databases -- A Stateless Approach [1.1701842638497677]
Homomorphic OPE (HOPE) is a new OPE scheme that eliminates client-side storage and avoids additional client-server interaction during query execution.
We provide a formal cryptographic analysis of HOPE, proving its security under the widely accepted IND-OCPA model.
arXiv Detail & Related papers (2024-11-26T00:38:46Z)
- FRAG: Toward Federated Vector Database Management for Collaborative and Secure Retrieval-Augmented Generation [1.3824176915623292]
This paper introduces Federated Retrieval-Augmented Generation (FRAG), a novel database management paradigm tailored for the growing needs of retrieval-augmented generation (RAG) systems.
FRAG enables mutually-distrusted parties to collaboratively perform Approximate $k$-Nearest Neighbor (ANN) searches on encrypted query vectors and encrypted data stored in distributed vector databases.
arXiv Detail & Related papers (2024-10-17T06:57:29Z)
- PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data.
The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates.
We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
- Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks [60.54852710216738]
We introduce a novel digital twin-assisted optimization framework, called D-REC, to ensure reliable caching in nextG wireless networks.
By incorporating reliability modules into a constrained decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints.
arXiv Detail & Related papers (2024-06-29T02:40:28Z)
- d-DSE: Distinct Dynamic Searchable Encryption Resisting Volume Leakage in Encrypted Databases [24.259108931623203]
Dynamic Searchable Encryption (DSE) has emerged as a solution to efficiently handle and protect large-scale data storage in encrypted databases (EDBs).
Volume leakage poses a significant threat, as it enables adversaries to reconstruct search queries and potentially compromise the security and privacy of data.
Padding strategies are common countermeasures for the leakage, but they significantly increase storage and communication costs.
arXiv Detail & Related papers (2024-03-02T11:42:17Z)
- TernaryVote: Differentially Private, Communication Efficient, and Byzantine Resilient Distributed Optimization on Heterogeneous Data [50.797729676285876]
We propose TernaryVote, which combines a ternary compressor and the majority vote mechanism to realize differential privacy, gradient compression, and Byzantine resilience simultaneously.
We theoretically quantify the privacy guarantee through the lens of the emerging f-differential privacy (DP) and the Byzantine resilience of the proposed algorithm.
arXiv Detail & Related papers (2024-02-16T16:41:14Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- $\beta$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search [96.99525100285084]
A regularization method, Beta-Decay, is proposed to regularize the DARTS-based NAS searching process (i.e., $\beta$-DARTS).
In-depth theoretical analyses on how it works and why it works are provided.
arXiv Detail & Related papers (2023-01-16T12:30:32Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple but efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS searching process.
Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and make the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
- Secure Bilevel Asynchronous Vertical Federated Learning with Backward Updating [159.48259714642447]
Vertical federated learning (VFL) attracts increasing attention due to the demands of multi-party collaborative modeling and concerns of privacy leakage.
We propose a novel bilevel parallel architecture (VF$\mathbf{B}^2$), under which three new algorithms, including VF$\mathbf{B}^2$, are proposed.
arXiv Detail & Related papers (2021-03-01T12:34:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.