Leaking Queries On Secure Stream Processing Systems
- URL: http://arxiv.org/abs/2510.12172v2
- Date: Wed, 15 Oct 2025 02:48:29 GMT
- Title: Leaking Queries On Secure Stream Processing Systems
- Authors: Hung Pham, Viet Vo, Tien Tuan Anh Dinh, Duc Tran, Shuhao Zhang,
- Abstract summary: In stream processing systems, queries are as sensitive as the data because they contain the application logics.<n>We demonstrate that it is practical to extract the queries from stream processing systems that use Intel SGX for securing the execution engine.<n>We implement the attack based on popular data stream benchmarks using SecureStream and NEXMark, and demonstrate attack success rates of up to 92%.
- Score: 4.237462173012957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stream processing systems are important in modern applications in which data arrive continuously and need to be processed in real time. Because of their resource and scalability requirements, many of these systems run on the cloud, which is considered untrusted. Existing works on securing databases on the cloud focus on protecting the data, and most systems leverage trusted hardware for high performance. However, in stream processing systems, queries are as sensitive as the data because they contain the application logics. We demonstrate that it is practical to extract the queries from stream processing systems that use Intel SGX for securing the execution engine. The attack performed by a malicious cloud provider is based on timing side channels, and it works in two phases. In the offline phase, the attacker profiles the execution time of individual stream operators, based on synthetic data. This phase outputs a model that identifies individual stream operators. In the online phase, the attacker isolates the operators that make up the query, monitors its execution, and recovers the operators using the model in the previous phase. We implement the attack based on popular data stream benchmarks using SecureStream and NEXMark, and demonstrate attack success rates of up to 92%. We further discuss approaches that can harden streaming processing systems against our attacks without incurring high overhead.
Related papers
- Detecting Object Tracking Failure via Sequential Hypothesis Testing [80.7891291021747]
Real-time online object tracking in videos constitutes a core task in computer vision.<n>We propose interpreting object tracking as a sequential hypothesis test, wherein evidence for or against tracking failures is gradually accumulated over time.<n>We propose both supervised and unsupervised variants by leveraging either ground-truth or solely internal tracking information.
arXiv Detail & Related papers (2026-02-13T14:57:15Z) - ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack [52.17935054046577]
We present ReasAlign, a model-level solution to improve safety alignment against indirect prompt injection attacks.<n>ReasAlign incorporates structured reasoning steps to analyze user queries, detect conflicting instructions, and preserve the continuity of the user's intended tasks.
arXiv Detail & Related papers (2026-01-15T08:23:38Z) - Operational Runtime Behavior Mining for Open-Source Supply Chain Security [2.9552238960255255]
HeteroGAT-Rank is an industry-oriented runtime behavior mining system.<n>It surfaces actionable runtime signals to guide manual investigation and threat hunting.<n>An evaluation on a large-scale OSS execution dataset shows that HeteroGAT-Rank effectively highlights meaningful and interpretable behavioral indicators.
arXiv Detail & Related papers (2026-01-11T15:14:18Z) - SafeLoad: Efficient Admission Control Framework for Identifying Memory-Overloading Queries in Cloud Data Warehouses [59.68732483257323]
Memory overload is a common form of resource exhaustion in cloud data warehouses.<n>We propose SafeLoad, the first query admission control framework specifically designed to identify memory-overloading (MO) queries.<n>We show that SafeLoad achieves state-of-the-art prediction performance with low online and offline time overhead.
arXiv Detail & Related papers (2026-01-05T08:29:51Z) - CaveAgent: Transforming LLMs into Stateful Runtime Operators [31.548422546991915]
We present CaveAgent, a framework that transforms the paradigm from "LLM-as-Text-Generator" to "LLM-as-As-Runtime-Runtime"<n>CaveAgent achieves a 10.5% success rate improvement on retail tasks and reduces total token consumption by 28.4% in multi-turn scenarios.
arXiv Detail & Related papers (2026-01-04T15:32:47Z) - Just-in-time Episodic Feedback Hinter: Leveraging Offline Knowledge to Improve LLM Agents Adaptation [77.90555621662345]
We present JEF Hinter, an agentic system that distills offline traces into compact, context-aware hints.<n>A zooming mechanism highlights decisive steps in long trajectories, capturing both strategies and pitfalls.<n>Experiments on MiniWoB++, WorkArena-L1, and WebArena-Lite show that JEF Hinter consistently outperforms strong baselines.
arXiv Detail & Related papers (2025-10-05T21:34:42Z) - DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z) - How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities [62.474732677086855]
Large language model (LLM) routing has emerged as a crucial strategy for balancing computational costs with performance.<n>We propose the DSC benchmark: Diverse, Simple, and Categorized, an evaluation framework that categorizes router performance across a broad spectrum of query types.
arXiv Detail & Related papers (2025-03-20T19:52:30Z) - Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams [78.72965584414368]
We present Flash-VStream, a video-language model that simulates the memory mechanism of human.
Compared to existing models, Flash-VStream achieves significant reductions in latency inference and VRAM consumption.
We propose VStream-QA, a novel question answering benchmark specifically designed for online video streaming understanding.
arXiv Detail & Related papers (2024-06-12T11:07:55Z) - Fight Hardware with Hardware: System-wide Detection and Mitigation of Side-Channel Attacks using Performance Counters [45.493130647468675]
We present a kernel-level infrastructure that allows system-wide detection of malicious applications attempting to exploit cache-based side-channel attacks.
This infrastructure relies on hardware performance counters to collect information at runtime from all applications running on the machine.
High-level detection metrics are derived from these measurements to maximize the likelihood of promptly detecting a malicious application.
arXiv Detail & Related papers (2024-02-18T15:45:38Z) - Monitoring Auditable Claims in the Cloud [0.0]
We propose a flexible monitoring approach that is independent of the implementation of the observed system.
Our approach is based on combining distributed Datalog-based programs with tamper-proof storage based on Trillian.
We apply our approach to an industrial use case that uses a cloud infrastructure for orchestrating unmanned air vehicles.
arXiv Detail & Related papers (2023-12-19T11:21:18Z) - Toward a real-time TCP SYN Flood DDoS mitigation using Adaptive Neuro-Fuzzy classifier and SDN Assistance in Fog Computing [0.31318403497744784]
We propose mitigation of Fog computing-based SYN Flood DDoS attacks using an Adaptive Neuro-Fuzzy Inference System (ANFIS) and Software Defined Networking (SDN) Assistance (FASA)
The simulation results show that FASA system outperforms other algorithms in terms of accuracy, precision, recall, and F1-score.
arXiv Detail & Related papers (2023-11-27T08:54:00Z) - Pathway: a fast and flexible unified stream data processing framework
for analytical and Machine Learning applications [7.850979932441607]
Pathway is a new unified data processing framework that can run workloads on both bounded and unbounded data streams.
We describe the system and present benchmarking results which demonstrate its capabilities in both batch and streaming contexts.
arXiv Detail & Related papers (2023-07-12T08:27:37Z) - Networked Online Learning for Control of Safety-Critical
Resource-Constrained Systems based on Gaussian Processes [9.544146562919792]
We propose a novel networked online learning approach based on Gaussian process regression.
We propose an effective data transmission scheme between the local system and the cloud taking bandwidth limitations and time delay of the transmission channel into account.
arXiv Detail & Related papers (2022-02-23T13:12:12Z) - LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak
Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts.
Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect.
Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.