Flash-Fusion: Enabling Expressive, Low-Latency Queries on IoT Sensor Streams with LLMs
- URL: http://arxiv.org/abs/2511.11885v1
- Date: Fri, 14 Nov 2025 21:34:28 GMT
- Title: Flash-Fusion: Enabling Expressive, Low-Latency Queries on IoT Sensor Streams with LLMs
- Authors: Kausar Patherya, Ashutosh Dhekne, Francisco Romero
- Abstract summary: Flash-Fusion is an end-to-end edge-cloud system that reduces the IoT data collection and analysis burden on users. It achieves a 95% latency reduction and a 98% decrease in token usage and cost while maintaining high-quality responses.
- Score: 2.580765958706854
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Smart cities and pervasive IoT deployments have generated interest in IoT data analysis across transportation and urban planning. At the same time, Large Language Models (LLMs) offer a new interface for exploring IoT data - particularly through natural language. Users today face two key challenges when working with IoT data using LLMs: (1) data collection infrastructure is expensive, producing terabytes of low-level sensor readings that are too granular for direct use, and (2) data analysis is slow, requiring iterative effort and technical expertise. Directly feeding all IoT telemetry to LLMs is impractical due to finite context windows, prohibitive token costs at scale, and non-interactive latencies. What is missing is a system that first parses a user's query to identify the analytical task, then selects the relevant data slices, and finally chooses the right representation before invoking an LLM. We present Flash-Fusion, an end-to-end edge-cloud system that reduces the IoT data collection and analysis burden on users. Two principles guide its design: (1) edge-based statistical summarization (achieving 73.5% data reduction) to address data volume, and (2) cloud-based query planning that clusters behavioral data and assembles context-rich prompts to address data interpretation. We deploy Flash-Fusion on a university bus fleet and evaluate it against a baseline that feeds raw data to a state-of-the-art LLM. Flash-Fusion achieves a 95% latency reduction and a 98% decrease in token usage and cost while maintaining high-quality responses. It enables personas across disciplines - safety officers, urban planners, fleet managers, and data scientists - to efficiently iterate over IoT data without the burden of manual query authoring or preprocessing.
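To make the described query flow concrete, the sketch below walks the abstract's four steps (parse the query into an analytical task, select the relevant data slices, choose a compact representation, invoke an LLM) over edge-computed statistical summaries. It is a minimal illustration under assumed names (EdgeSummary, plan_query, build_prompt, call_llm, the bus_12/speed stream); none of these are the Flash-Fusion API.

```python
# Minimal sketch of an edge-summary + query-planning pipeline of the kind the
# abstract describes. All names below are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EdgeSummary:
    """Statistical summary computed at the edge instead of raw telemetry."""
    sensor_id: str
    window_start: float   # unix seconds
    window_end: float
    mean: float
    minimum: float
    maximum: float
    count: int

@dataclass
class QueryPlan:
    task: str                        # e.g. "trend", "anomaly", "comparison"
    sensors: list[str]               # which streams the question refers to
    time_range: tuple[float, float]  # (start, end) in unix seconds

def plan_query(question: str) -> QueryPlan:
    # Placeholder planner: a real system might use rules or an LLM call here.
    return QueryPlan(task="trend", sensors=["bus_12/speed"], time_range=(0.0, 86_400.0))

def select_slices(plan: QueryPlan, summaries: list[EdgeSummary]) -> list[EdgeSummary]:
    # Keep only the summaries that match the planned sensors and time window.
    lo, hi = plan.time_range
    return [s for s in summaries
            if s.sensor_id in plan.sensors and lo <= s.window_start <= hi]

def build_prompt(plan: QueryPlan, slices: list[EdgeSummary], question: str) -> str:
    # Render the selected slices in a compact, token-frugal representation.
    rows = "\n".join(
        f"{s.sensor_id} [{s.window_start:.0f}-{s.window_end:.0f}] "
        f"mean={s.mean:.2f} min={s.minimum:.2f} max={s.maximum:.2f} n={s.count}"
        for s in slices)
    return f"Task: {plan.task}\nData:\n{rows}\n\nQuestion: {question}"

def answer(question: str, summaries: list[EdgeSummary],
           call_llm: Callable[[str], str]) -> str:
    plan = plan_query(question)
    slices = select_slices(plan, summaries)
    return call_llm(build_prompt(plan, slices, question))
```

The point of the sketch is the ordering: the LLM is only invoked after the query has been planned and the data reduced to summaries, which is where the latency and token savings would come from.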
Related papers
- Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs [66.63911043019294]
Data preparation aims to denoise raw datasets, uncover cross-dataset relationships, and extract valuable insights from them. This paper focuses on the use of LLM techniques to prepare data for diverse downstream tasks. We introduce a task-centric taxonomy that organizes the field into three major tasks: data cleaning (standardization, error processing, and imputation), data integration, and data enrichment.
arXiv Detail & Related papers (2026-01-22T12:02:45Z) - Towards Compositional Generalization in LLMs for Smart Contract Security: A Case Study on Reentrancy Vulnerabilities [35.39583123277091]
This paper proposes a post-training algorithm based on atomic task decomposition and fusion. We decompose the reentrancy vulnerability detection task into four linearly independent atomic tasks and generate three compiler-verified synthetic datasets for training. We then employ the Slither tool to extract structural information from the control flow graph and data flow graph.
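For orientation, the canonical reentrancy shape such detectors target is an external call that precedes a state update on the same control-flow path. The toy check below illustrates that shape over a hand-built graph; it is a hypothetical simplification (Node and has_reentrancy_shape are made-up names), not the paper's Slither-based pipeline.

```python
# Toy check for the canonical reentrancy shape: an external call that appears
# before a write to contract state on some control-flow path.
from dataclasses import dataclass, field

@dataclass
class Node:
    makes_external_call: bool = False
    writes_state: bool = False
    successors: list["Node"] = field(default_factory=list)

def has_reentrancy_shape(entry: Node) -> bool:
    """Return True if some path performs an external call before a state write."""
    def walk(node: Node, call_seen: bool, visited: frozenset) -> bool:
        if id(node) in visited:          # avoid looping on cycles
            return False
        call_seen = call_seen or node.makes_external_call
        if call_seen and node.writes_state:
            return True
        visited = visited | {id(node)}
        return any(walk(s, call_seen, visited) for s in node.successors)
    return walk(entry, False, frozenset())

# Example: a withdraw() that sends ether first and decrements the balance afterwards.
update = Node(writes_state=True)
send = Node(makes_external_call=True, successors=[update])
assert has_reentrancy_shape(send)
```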
arXiv Detail & Related papers (2026-01-11T13:52:07Z) - Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale [70.23466957404891]
We introduce a new reasoning data generation framework spanning diverse skills and levels of complexity with over 1M high-quality synthetic vision-centric questions. We show that finetuning Qwen2.5-VL-7B on our data outperforms all open-data baselines across all evaluated vision-centric benchmarks.
arXiv Detail & Related papers (2025-11-07T20:50:54Z) - TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch [18.661974399115007]
Recent LLM-based TTS works typically employ complex data processing pipelines to obtain high-quality training data. In this work, we leverage a noise-robust audio tokenizer (S3Tokenizer) to design a simplified yet effective TTS data processing pipeline. This pipeline maintains data quality while substantially reducing data acquisition costs, achieving a data retention rate of over 50%.
arXiv Detail & Related papers (2024-12-11T09:38:50Z) - Federated In-Context LLM Agent Learning [3.4757641432843487]
Large Language Models (LLMs) have revolutionized intelligent services by enabling logical reasoning, tool use, and interaction with external systems as agents. In this paper, we propose a novel privacy-preserving Federated In-context LLM Agent Learning (FICAL) algorithm. The results show that FICAL has competitive performance compared to other SOTA baselines with a significant communication cost decrease of $\mathbf{3.33\times 10^{5}}$ times.
arXiv Detail & Related papers (2024-12-11T03:00:24Z) - Leveraging Foundation Models for Zero-Shot IoT Sensing [5.319176383069102]
Deep learning models are increasingly deployed on edge Internet of Things (IoT) devices.
Zero-shot learning (ZSL) aims to classify data of unseen classes with the help of semantic information.
In this work, we align the IoT data embeddings with the semantic embeddings generated by an FM's text encoder for zero-shot IoT sensing.
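As a rough illustration of that alignment idea, zero-shot classification reduces to encoding each candidate class name with the foundation model's text encoder and choosing the class whose text embedding is most cosine-similar to the (projected) IoT data embedding. The sketch below uses stand-in encoders (encode_text and the random embeddings) that are assumptions for illustration, not the paper's models.

```python
# Minimal zero-shot classification by embedding alignment: pick the class whose
# semantic (text) embedding is closest, by cosine similarity, to the sensor embedding.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def zero_shot_classify(window_embedding: np.ndarray,
                       class_names: list[str],
                       encode_text) -> str:
    """Return the unseen class whose text embedding best matches the IoT embedding."""
    scores = {name: cosine(window_embedding, encode_text(name)) for name in class_names}
    return max(scores, key=scores.get)

# Toy usage with random stand-in encoders (no real foundation model involved).
rng = np.random.default_rng(0)
fake_text_embeddings = {"walking": rng.normal(size=64), "driving": rng.normal(size=64)}
encode_text = lambda name: fake_text_embeddings[name]
window_embedding = fake_text_embeddings["driving"] + 0.1 * rng.normal(size=64)
print(zero_shot_classify(window_embedding, ["walking", "driving"], encode_text))  # "driving"
```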
arXiv Detail & Related papers (2024-07-29T11:16:48Z) - Energy-Efficient Edge Learning via Joint Data Deepening-and-Prefetching [9.468399367975984]
We propose a novel offloading architecture called joint data deepening-and-prefetching (JD2P).
JD2P performs feature-by-feature offloading and comprises two key techniques.
We evaluate the effectiveness of JD2P through experiments using the MNIST dataset.
arXiv Detail & Related papers (2024-02-19T08:12:47Z) - Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems.
We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
arXiv Detail & Related papers (2023-10-04T20:27:20Z) - Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on the performance of FL has not yet been explored.
In this work, we take the first step to consider the online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z) - Deep Reinforcement Learning Assisted Federated Learning Algorithm for Data Management of IIoT [82.33080550378068]
The continuously expanding scale of the industrial Internet of Things (IIoT) leads to IIoT equipment generating massive amounts of user data every moment.
How to manage these time series data in an efficient and safe way in the field of IIoT is still an open issue.
This paper studies the application of FL technology to manage IIoT equipment data in wireless network environments.
arXiv Detail & Related papers (2022-02-03T07:12:36Z) - Optimizing Resource-Efficiency for Federated Edge Intelligence in IoT Networks [96.24723959137218]
We study an edge intelligence-based IoT network in which a set of edge servers learn a shared model using federated learning (FL).
We propose a novel framework, called federated edge intelligence (FEI), that allows edge servers to evaluate the required number of data samples according to the energy cost of the IoT network.
We prove that our proposed algorithm does not cause any data leakage nor disclose any topological information of the IoT network.
arXiv Detail & Related papers (2020-11-25T12:51:59Z)