Related papers: Adaptive Stream Processing on Edge Devices through Active Inference

Adaptive Stream Processing on Edge Devices through Active Inference

URL: http://arxiv.org/abs/2409.17937v1
Date: Thu, 26 Sep 2024 15:12:41 GMT
Title: Adaptive Stream Processing on Edge Devices through Active Inference
Authors: Boris Sedlak, Victor Casamayor Pujol, Andrea Morichetta, Praveen Kumar Donta, Schahram Dustdar,
Abstract summary: We present a novel Machine Learning paradigm based on Active Inference (AIF) AIF describes how the brain constantly predicts and evaluates sensory information to decrease long-term surprise. Our method guarantees full transparency on the decision making, making the interpretation of the results and the troubleshooting effortless.
Score: 5.5676731834895765
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The current scenario of IoT is witnessing a constant increase on the volume of data, which is generated in constant stream, calling for novel architectural and logical solutions for processing it. Moving the data handling towards the edge of the computing spectrum guarantees better distribution of load and, in principle, lower latency and better privacy. However, managing such a structure is complex, especially when requirements, also referred to Service Level Objectives (SLOs), specified by applications' owners and infrastructure managers need to be ensured. Despite the rich number of proposals of Machine Learning (ML) based management solutions, researchers and practitioners yet struggle to guarantee long-term prediction and control, and accurate troubleshooting. Therefore, we present a novel ML paradigm based on Active Inference (AIF) -- a concept from neuroscience that describes how the brain constantly predicts and evaluates sensory information to decrease long-term surprise. We implement it and evaluate it in a heterogeneous real stream processing use case, where an AIF-based agent continuously optimizes the fulfillment of three SLOs for three autonomous driving services running on multiple devices. The agent used causal knowledge to gradually develop an understanding of how its actions are related to requirements fulfillment, and which configurations to favor. Through this approach, our agent requires up to thirty iterations to converge to the optimal solution, showing the capability of offering accurate results in a short amount of time. Furthermore, thanks to AIF and its causal structures, our method guarantees full transparency on the decision making, making the interpretation of the results and the troubleshooting effortless.

Related papers

AI/ML Life Cycle Management for Interoperable AI Native RAN [50.61227317567369]
Artificial intelligence (AI) and machine learning (ML) models are rapidly permeating the 5G Radio Access Network (RAN)<n>These developments lay the foundation for AI-native transceivers as a key enabler for 6G.
arXiv Detail & Related papers (2025-07-24T16:04:59Z)
The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective [3.0868637098088403]
Large-language-model (LLM)-based AI agents have recently showcased impressive versatility by employing dynamic reasoning.<n>This paper presents the first comprehensive system-level analysis of AI agents, quantifying their resource usage, latency behavior, energy consumption, and test-time scaling strategies.<n>Our findings reveal that while agents improve accuracy with increased compute, they suffer from rapidly diminishing returns, widening latency variance, and unsustainable infrastructure costs.
arXiv Detail & Related papers (2025-06-04T14:37:54Z)
ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research [53.736407871322314]
We introduce ORMind, a cognitive-inspired framework that enhances optimization through counterfactual reasoning.<n>Our approach emulates human cognition, implementing an end-to-end workflow that transforms requirements into mathematical models and executable code.<n>It is currently being tested internally in Lenovo's AI Assistant, with plans to enhance optimization capabilities for both business and consumer customers.
arXiv Detail & Related papers (2025-06-02T05:11:21Z)
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods.<n>We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development. We present Dynamic Action Re-Sampling (DARS), a novel inference time compute scaling approach for coding agents. We evaluate our approach on SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
arXiv Detail & Related papers (2025-03-18T14:02:59Z)
An Innovative Data-Driven and Adaptive Reinforcement Learning Approach for Context-Aware Prescriptive Process Monitoring [3.4437362489150254]
We present a novel framework called Fine-Tuned Offline Reinforcement Learning Augmented Process Sequence Optimization.<n>FORLAPS aims to identify optimal execution paths in business processes by leveraging reinforcement learning enhanced with a state-dependent reward shaping mechanism.<n>We show that FORLAPS achieves 31% savings in resource time spent and a 23% reduction in process time span.
arXiv Detail & Related papers (2025-01-17T20:31:35Z)
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG [0.8463972278020965]
Large Language Models (LLMs) have revolutionized artificial intelligence (AI) by enabling human like text generation and natural language understanding. Retrieval Augmented Generation (RAG) has emerged as a solution, enhancing LLMs by integrating real time data retrieval to provide contextually relevant responses. Agentic Retrieval-Augmented Generation (RAG) transcends these limitations by embedding autonomous AI agents into the RAG pipeline.
arXiv Detail & Related papers (2025-01-15T20:40:25Z)
Towards Human-Level Understanding of Complex Process Engineering Schematics: A Pedagogical, Introspective Multi-Agent Framework for Open-Domain Question Answering [0.0]
In the chemical and process industries, Process Flow Diagrams (PFDs) and Piping and Instrumentation Diagrams (P&IDs) are critical for design, construction, and maintenance. Recent advancements in Generative AI have shown promise in understanding and interpreting process diagrams for Visual Question Answering (VQA) We propose a secure, on-premises enterprise solution using a hierarchical, multi-agent Retrieval Augmented Generation (RAG) framework.
arXiv Detail & Related papers (2024-08-24T19:34:04Z)
Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations. Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach [58.911515417156174]
We propose a new definition of Age of Information (AoI) and, based on the redefined AoI, we formulate an online AoI problem for MEC systems. We introduce Post-Decision States (PDSs) to exploit the partial knowledge of the system's dynamics. We also combine PDSs with deep RL to further improve the algorithm's applicability, scalability, and robustness.
arXiv Detail & Related papers (2023-12-01T01:30:49Z)
Active Inference on the Edge: A Design Study [5.815300670677979]
Active Inference (ACI) is a concept from neuroscience that describes how the brain constantly predicts and evaluates sensory information to decrease long-term surprise. We show how our ACI agent was able to quickly and traceably solve an optimization problem while fulfilling requirements.
arXiv Detail & Related papers (2023-11-17T16:03:04Z)
Intelligent Proactive Fault Tolerance at the Edge through Resource Usage Prediction [0.7046417074932255]
We propose an Intelligent Proactive Fault Tolerance (IPFT) method that leverages the edge resource usage predictions through Recurrent Neural Networks (RNN) In this paper, we focus on the process-faults, which are related with the inability of the infrastructure to provide Quality of Service (QoS) in acceptable ranges due to the lack of processing power.
arXiv Detail & Related papers (2023-02-09T00:42:34Z)
Multi-Agent Reinforcement Learning for Long-Term Network Resource Allocation through Auction: a V2X Application [7.326507804995567]
We formulate offloading of computational tasks from a dynamic group of mobile agents (e.g., cars) as decentralized decision making among autonomous agents. We design an interaction mechanism that incentivizes such agents to align private and system goals by balancing between competition and cooperation. We propose a novel multi-agent online learning algorithm that learns with partial, delayed and noisy state information.
arXiv Detail & Related papers (2022-07-29T10:29:06Z)
Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent edge artificial-latency (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC) We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
On Efficient Uncertainty Estimation for Resource-Constrained Mobile Applications [0.0]
Predictive uncertainty supplements model predictions and enables improved functionality of downstream tasks. We tackle this problem by building upon Monte Carlo Dropout (MCDO) models using the Axolotl framework. We conduct experiments on (1) a multi-class classification task using the CIFAR10 dataset, and (2) a more complex human body segmentation task.
arXiv Detail & Related papers (2021-11-11T22:24:15Z)
Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments. It is feasible to collect metrics with a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices.
arXiv Detail & Related papers (2021-02-12T09:33:00Z)
Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications. We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS) Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
arXiv Detail & Related papers (2020-12-25T07:08:50Z)
Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments. We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data. Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.