INSIGHT: A Survey of In-Network Systems for Intelligent, High-Efficiency AI and Topology Optimization
- URL: http://arxiv.org/abs/2505.24269v1
- Date: Fri, 30 May 2025 06:47:55 GMT
- Title: INSIGHT: A Survey of In-Network Systems for Intelligent, High-Efficiency AI and Topology Optimization
- Authors: Aleksandr Algazinov, Joydeep Chandra, Matt Laing,
- Abstract summary: In-network AI is a transformative approach to addressing the escalating demands of Artificial Intelligence (AI) on network infrastructure. This paper provides a comprehensive analysis of optimizing in-network computation for AI. It examines methodologies for mapping AI models onto resource-constrained network devices, addressing challenges like limited memory and computational capabilities.
- Score: 43.37351326629751
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-network computation represents a transformative approach to addressing the escalating demands of Artificial Intelligence (AI) workloads on network infrastructure. By leveraging the processing capabilities of network devices such as switches, routers, and Network Interface Cards (NICs), this paradigm enables AI computations to be performed directly within the network fabric, significantly reducing latency, enhancing throughput, and optimizing resource utilization. This paper provides a comprehensive analysis of optimizing in-network computation for AI, exploring the evolution of programmable network architectures, such as Software-Defined Networking (SDN) and Programmable Data Planes (PDPs), and their convergence with AI. It examines methodologies for mapping AI models onto resource-constrained network devices, addressing challenges like limited memory and computational capabilities through efficient algorithm design and model compression techniques. The paper also highlights advancements in distributed learning, particularly in-network aggregation, and the potential of federated learning to enhance privacy and scalability. Frameworks like Planter and Quark are discussed for simplifying development, alongside key applications such as intelligent network monitoring, intrusion detection, traffic management, and Edge AI. Future research directions, including runtime programmability, standardized benchmarks, and new application paradigms, are proposed to advance this rapidly evolving field. This survey underscores the potential of in-network AI to create intelligent, efficient, and responsive networks capable of meeting the demands of next-generation AI applications.
Related papers
- AI Flow: Perspectives, Scenarios, and Approaches [51.38621621775711]
We introduce AI Flow, a framework that integrates cutting-edge IT and CT advancements. First, a device-edge-cloud framework serves as the foundation, integrating end devices, edge servers, and cloud clusters. Second, we introduce the concept of familial models, which refers to a series of different-sized models with aligned hidden features. Third, connectivity- and interaction-based intelligence emergence is a novel paradigm of AI Flow.
arXiv Detail & Related papers (2025-06-14T12:43:07Z)
- Edge-Cloud Collaborative Computing on Distributed Intelligence and Model Optimization: A Survey [59.52058740470727]
Edge-cloud collaborative computing (ECCC) has emerged as a pivotal paradigm for addressing the computational demands of modern intelligent applications. Recent advancements in AI, particularly deep learning and large language models (LLMs), have dramatically enhanced the capabilities of these distributed systems. This survey provides a structured tutorial on fundamental architectures, enabling technologies, and emerging applications.
arXiv Detail & Related papers (2025-05-03T13:55:38Z)
- Toward Agentic AI: Generative Information Retrieval Inspired Intelligent Communications and Networking [87.82985288731489]
Agentic AI has emerged as a key paradigm for intelligent communications and networking. This article emphasizes the role of knowledge acquisition, processing, and retrieval in agentic AI for telecom systems.
arXiv Detail & Related papers (2025-02-24T06:02:25Z)
- Optimal In-Network Distribution of Learning Functions for a Secure-by-Design Programmable Data Plane of Next-Generation Networks [2.563180814294141]
This paper focuses on the deployment of in-network learning models with the aim of implementing fully distributed intrusion detection systems (IDS) or intrusion prevention systems (IPS). A model is proposed for the optimal distribution of the IDS/IPS workload among data plane devices with the aim of ensuring complete network security without excessively burdening the normal operations of the devices.
arXiv Detail & Related papers (2024-11-27T14:29:53Z)
- Towards Scalable Wireless Federated Learning: Challenges and Solutions [40.68297639420033]
Federated learning (FL) emerges as an effective distributed machine learning framework.
We discuss the challenges and solutions of achieving scalable wireless FL from the perspectives of both network design and resource orchestration.
arXiv Detail & Related papers (2023-10-08T08:55:03Z)
- Design Principles for Model Generalization and Scalable AI Integration in Radio Access Networks [2.846642778157227]
This paper emphasizes the pivotal role of achieving model generalization in enhancing performance and enabling scalable AI integration within radio communications.
We outline design principles for model generalization in three key domains: environment for robustness, intents for adaptability to system objectives, and control tasks for reducing AI-driven control loops.
We propose a learning architecture that leverages centralization of training and data management functionalities, combined with distributed data generation.
arXiv Detail & Related papers (2023-06-09T20:46:31Z)
- AI in 6G: Energy-Efficient Distributed Machine Learning for Multilayer Heterogeneous Networks [7.318997639507269]
We propose a novel layer-based HetNet architecture which distributes tasks associated with different machine learning approaches across network layers and entities.
Such a HetNet boasts multiple access schemes as well as device-to-device (D2D) communications to enhance energy efficiency.
arXiv Detail & Related papers (2022-06-04T22:03:19Z)
- Deep Reinforcement Learning-Aided RAN Slicing Enforcement for B5G Latency Sensitive Services [10.718353079920007]
This paper presents a novel architecture that leverages Deep Reinforcement Learning at the edge of the network in order to address Radio Access Network Slicing and Radio Resource Management.
The effectiveness of our proposal against baseline methodologies is investigated through computer simulation, by considering an autonomous-driving use-case.
arXiv Detail & Related papers (2021-03-18T14:18:34Z)
- Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics with a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices.
arXiv Detail & Related papers (2021-02-12T09:33:00Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.