Dynamic and Distributed Routing in IoT Networks based on Multi-Objective Q-Learning
- URL: http://arxiv.org/abs/2505.00918v1
- Date: Thu, 01 May 2025 23:34:35 GMT
- Title: Dynamic and Distributed Routing in IoT Networks based on Multi-Objective Q-Learning
- Authors: Shubham Vaishnav, Praveen Kumar Donta, Sindri Magnússon
- Abstract summary: A critical task in IoT networks is sensing and transmitting information over the network. We propose a novel dynamic and distributed routing based on multi-objective Q-learning. We also propose a novel greedy policy scheme to take near-optimal decisions for unexpected preference changes.
- Score: 5.694070924765916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The last few decades have witnessed a rapid increase in IoT devices owing to their wide range of applications, such as smart healthcare monitoring systems, smart cities, and environmental monitoring. A critical task in IoT networks is sensing and transmitting information over the network. The IoT nodes gather data by sensing the environment and then transmit this data to a destination node via multi-hop communication, following some routing protocol. These protocols are usually designed to optimize possibly contradictory objectives, such as maximizing packet delivery ratio and energy efficiency. While most literature has focused on optimizing a static objective that remains unchanged, many real-world IoT applications require adapting to rapidly shifting priorities. For example, in monitoring systems, some transmissions are time-critical and require a high priority on low latency, while other transmissions are less urgent and instead prioritize energy efficiency. To meet such dynamic demands, we propose a novel dynamic and distributed routing scheme based on multi-objective Q-learning that can adapt to changes in preferences in real time. Our algorithm builds on ideas from both multi-objective optimization and Q-learning. We also propose a novel greedy interpolation policy scheme to take near-optimal decisions for unexpected preference changes. The proposed scheme can approximate and utilize the Pareto-efficient solutions for dynamic preferences, thus utilizing past knowledge to adapt to unpredictable preferences quickly during runtime. Simulation results show that the proposed scheme outperforms state-of-the-art algorithms for various exploration strategies, preference variation patterns, and important metrics like overall reward, energy efficiency, and packet delivery ratio.
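The idea in the abstract, learning vector-valued Q-values per objective and reusing them when the preference shifts, can be illustrated with a minimal sketch. This is not the authors' algorithm: the 4-node topology, the link costs, the two "anchor" preferences, the hyperparameters, and the linear blend used in `interpolated_hop` are all assumptions made for illustration, and the function names are hypothetical.

```python
import random

# Hypothetical topology: node -> {next_hop: (latency, energy)}; node 3 is the sink.
LINKS = {
    0: {1: (1.0, 4.0), 2: (4.0, 1.0)},
    1: {3: (1.0, 4.0), 2: (1.0, 1.0)},
    2: {3: (4.0, 1.0)},
}
SINK = 3


def scalarize(vec, w):
    """Weighted sum of a per-objective value vector under preference w."""
    return sum(wi * vi for wi, vi in zip(w, vec))


def greedy_hop(q, node, w):
    """Next hop whose Q-vector has the highest scalarized value under w."""
    return max(q[node], key=lambda hop: scalarize(q[node][hop], w))


def train(w, episodes=2000, alpha=0.5, gamma=1.0, eps=0.2, seed=0):
    """Tabular Q-learning with one Q-vector (latency, energy) per (node, hop)."""
    rng = random.Random(seed)
    q = {n: {h: [0.0, 0.0] for h in hops} for n, hops in LINKS.items()}
    for _ in range(episodes):
        node = 0
        while node != SINK:
            # Epsilon-greedy exploration over the available next hops.
            if rng.random() < eps:
                hop = rng.choice(list(LINKS[node]))
            else:
                hop = greedy_hop(q, node, w)
            latency, energy = LINKS[node][hop]
            reward = (-latency, -energy)  # both objectives are costs
            if hop == SINK:
                future = (0.0, 0.0)
            else:
                future = q[hop][greedy_hop(q, hop, w)]
            # Update each objective separately, bootstrapping from the hop
            # that is greedy under the scalarized preference.
            for i in range(2):
                target = reward[i] + gamma * future[i]
                q[node][hop][i] += alpha * (target - q[node][hop][i])
            node = hop
    return q


def interpolated_hop(anchors, node, w_lat):
    """Pick a hop for an unseen preference by linearly blending the Q-vectors
    of the two nearest anchor preferences (an assumed interpolation rule)."""
    keys = sorted(anchors)
    lo = max((k for k in keys if k <= w_lat), default=keys[0])
    hi = min((k for k in keys if k >= w_lat), default=keys[-1])
    t = 0.0 if hi == lo else (w_lat - lo) / (hi - lo)
    w = (w_lat, 1.0 - w_lat)

    def value(hop):
        blended = [
            (1 - t) * a + t * b
            for a, b in zip(anchors[lo][node][hop], anchors[hi][node][hop])
        ]
        return scalarize(blended, w)

    return max(LINKS[node], key=value)


# Anchor preferences: pure energy focus (0.0) and pure latency focus (1.0).
anchors = {w_lat: train((w_lat, 1.0 - w_lat)) for w_lat in (0.0, 1.0)}
urgent_hop = interpolated_hop(anchors, 0, 0.8)   # latency-leaning packet
relaxed_hop = interpolated_hop(anchors, 0, 0.2)  # energy-leaning packet
print(urgent_hop, relaxed_hop)
```

In this toy setup, no retraining happens when the preference changes: the router only blends Q-vectors that were already learned, which is the kind of knowledge reuse the abstract describes. A real deployment would differ in how anchors are chosen and how Q-values are shared between nodes.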
Related papers
- Adaptive routing protocols for determining optimal paths in AI multi-agent systems: a priority- and learning-enhanced approach [0.0]
This paper introduces an enhanced, adaptive routing tailored for AI multi-agent networks. We incorporate multi-faceted parameters such as task complexity, user request priority, agent capabilities, bandwidth, latency, load, model sophistication, and reliability.
arXiv Detail & Related papers (2025-03-10T13:16:54Z) - DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation [58.62766376631344]
We propose a customized wireless network intent (WNI-G) model to address different state variations of wireless communication networks.
Extensive simulation achieves greater stability in spectral efficiency and variations of traditional DRL models in dynamic communication systems.
arXiv Detail & Related papers (2024-10-18T14:04:38Z) - Improved Q-learning based Multi-hop Routing for UAV-Assisted Communication [4.799822253865053]
This paper proposes a novel, Improved Q-learning-based Multi-hop Routing (IQMR) algorithm for optimal UAV-assisted communication systems.
Using Q(lambda) learning for routing decisions, IQMR substantially enhances energy efficiency and network data throughput.
arXiv Detail & Related papers (2024-08-17T06:24:31Z) - Generalized Multi-Objective Reinforcement Learning with Envelope Updates in URLLC-enabled Vehicular Networks [12.323383132739195]
We develop a novel multi-objective reinforcement learning framework to jointly optimize wireless network selection and autonomous driving policies. The proposed framework is designed to maximize the traffic flow and minimize collisions by controlling the vehicle's motion dynamics. The proposed policies enable autonomous vehicles to adopt safe driving behaviors with improved connectivity.
arXiv Detail & Related papers (2024-05-18T16:31:32Z) - Multi-Objective Optimization for UAV Swarm-Assisted IoT with Virtual Antenna Arrays [55.736718475856726]
Unmanned aerial vehicle (UAV) networks are a promising technology for assisting the Internet-of-Things (IoT).
Existing UAV-assisted data harvesting and dissemination schemes require UAVs to frequently fly between the IoTs and access points.
We introduce collaborative beamforming into IoTs and UAVs simultaneously to achieve energy and time-efficient data harvesting and dissemination.
arXiv Detail & Related papers (2023-08-03T02:49:50Z) - Multi-objective Deep Reinforcement Learning for Mobile Edge Computing [11.966938107719903]
Mobile edge computing (MEC) is essential for next-generation mobile network applications that prioritize various performance metrics, including delays and energy consumption.
In this study, we formulate a multi-objective offloading problem for MEC with multiple edges to minimize expected long-term energy consumption and transmission delay.
We introduce a well-designed state encoding method for constructing features for multiple edges in MEC systems, and a sophisticated reward function for accurately computing the utilities of delay and energy consumption.
arXiv Detail & Related papers (2023-07-05T16:36:42Z) - Low Complexity Adaptive Machine Learning Approaches for End-to-End Latency Prediction [0.0]
This work concerns the design of efficient, low-cost adaptive algorithms for estimation, monitoring and prediction.
We focus on end-to-end latency prediction, for which we illustrate our approaches and results on data obtained from a public generator provided after the recent international challenge on GNN.
arXiv Detail & Related papers (2023-01-31T10:29:11Z) - AI-aided Traffic Control Scheme for M2M Communications in the Internet of Vehicles [61.21359293642559]
The dynamics of traffic and the heterogeneous requirements of different IoV applications are not considered in most existing studies.
We consider a hybrid traffic control scheme and use proximal policy optimization (PPO) method to tackle it.
arXiv Detail & Related papers (2022-03-05T10:54:05Z) - Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z) - Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z) - Data-Driven Random Access Optimization in Multi-Cell IoT Networks with NOMA [78.60275748518589]
Non-orthogonal multiple access (NOMA) is a key technology to enable massive machine type communications (mMTC) in 5G networks and beyond.
In this paper, NOMA is applied to improve the random access efficiency in high-density spatially-distributed multi-cell wireless IoT networks.
A novel formulation of random channel access management is proposed, in which the transmission probability of each IoT device is tuned to maximize the geometric mean of users' expected capacity.
arXiv Detail & Related papers (2021-01-02T15:21:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.