Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints
- URL: http://arxiv.org/abs/2505.02640v1
- Date: Mon, 05 May 2025 13:33:39 GMT
- Title: Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints
- Authors: Shubham Vaishnav, Praveen Kumar Donta, Sindri Magnússon
- Abstract summary: Internet of Things systems increasingly operate in environments where devices must respond in real time while managing fluctuating resource constraints. We propose a novel Budgeted Multi-Armed Bandit framework tailored for IoT applications with dynamic operational limits. Our model introduces a decaying violation budget, which permits limited constraint violations early in the learning process and gradually enforces stricter compliance over time.
- Score: 5.694070924765916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Internet of Things (IoT) systems increasingly operate in environments where devices must respond in real time while managing fluctuating resource constraints, including energy and bandwidth. Yet, current approaches often fall short in addressing scenarios where operational constraints evolve over time. To address these limitations, we propose a novel Budgeted Multi-Armed Bandit framework tailored for IoT applications with dynamic operational limits. Our model introduces a decaying violation budget, which permits limited constraint violations early in the learning process and gradually enforces stricter compliance over time. We present the Budgeted Upper Confidence Bound (UCB) algorithm, which adaptively balances performance optimization and compliance with time-varying constraints. We provide theoretical guarantees showing that Budgeted UCB achieves sublinear regret and logarithmic constraint violations over the learning horizon. Extensive simulations in a wireless communication setting show that our approach achieves faster adaptation and better constraint satisfaction than standard online learning methods. These results highlight the framework's potential for building adaptive, resource-aware IoT systems.
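The core mechanism described in the abstract — a UCB-style arm selection filtered by a violation budget that tightens over time — can be sketched as follows. This is an illustrative reconstruction, not the paper's algorithm: the function name, the `1/sqrt(t)` budget schedule, and the simulated reward/cost feedback are all assumptions for demonstration.

```python
# Hypothetical sketch of a Budgeted UCB-style loop with a decaying
# violation budget. Names, the budget schedule, and the simulated
# environment are illustrative assumptions, not taken from the paper.
import math
import random

def budgeted_ucb(n_arms, horizon, cost_limit, seed=0):
    rng = random.Random(seed)
    counts = [0] * n_arms
    reward_sums = [0.0] * n_arms
    cost_sums = [0.0] * n_arms
    total_violations = 0

    for t in range(1, horizon + 1):
        # Decaying violation budget: generous early, stricter over time.
        budget = cost_limit * (1.0 + 1.0 / math.sqrt(t))

        if t <= n_arms:
            # Initialization: play each arm once.
            arm = t - 1
        else:
            def reward_ucb(i):
                bonus = math.sqrt(2.0 * math.log(t) / counts[i])
                return reward_sums[i] / counts[i] + bonus

            def cost_lcb(i):
                bonus = math.sqrt(2.0 * math.log(t) / counts[i])
                return cost_sums[i] / counts[i] - bonus

            # Among arms whose optimistic (lower-bound) cost fits the
            # current budget, pick the one with the highest reward UCB;
            # fall back to all arms if none appears feasible.
            feasible = [i for i in range(n_arms) if cost_lcb(i) <= budget]
            pool = feasible if feasible else list(range(n_arms))
            arm = max(pool, key=reward_ucb)

        # Simulated bandit feedback (stand-in for the IoT environment).
        reward = rng.random() * (arm + 1) / n_arms
        cost = rng.random()
        counts[arm] += 1
        reward_sums[arm] += reward
        cost_sums[arm] += cost
        if cost > budget:
            total_violations += 1

    return counts, total_violations
```

Under this schedule the budget converges to `cost_limit` as `t` grows, so early exploration tolerates violations that later rounds would reject, mirroring the decaying-budget idea in the abstract.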
Related papers
- RCCDA: Adaptive Model Updates in the Presence of Concept Drift under a Constrained Resource Budget [19.391900930310253]
Real-time machine learning algorithms are often faced with the challenge of adapting models to concept drift. Existing solutions often depend on drift-detection methods that produce high computational overhead for resource-constrained environments. We propose RCCDA: a dynamic model update policy that optimizes ML training dynamics while ensuring strict compliance with predefined resource constraints.
arXiv Detail & Related papers (2025-05-30T02:49:42Z) - LeanTTA: A Backpropagation-Free and Stateless Approach to Quantized Test-Time Adaptation on Edge Devices [13.355021314836852]
We present LeanTTA, a novel backpropagation-free and stateless framework for quantized test-time adaptation tailored to edge devices. Our approach minimizes computational costs by dynamically updating normalization statistics without backpropagation. We validate our framework across sensor modalities, demonstrating significant improvements over state-of-the-art TTA methods.
arXiv Detail & Related papers (2025-03-20T06:27:09Z) - Offline Critic-Guided Diffusion Policy for Multi-User Delay-Constrained Scheduling [29.431945795881976]
We propose a novel offline reinforcement learning-based algorithm, SOCD. It learns efficient scheduling policies purely from pre-collected offline data. We show that SOCD is resilient to various system dynamics, including partially observable and large-scale environments.
arXiv Detail & Related papers (2025-01-22T15:13:21Z) - Reinforcement Learning Constrained Beam Search for Parameter Optimization of Paper Drying Under Flexible Constraints [7.014163329716659]
We propose Reinforcement Learning Constrained Beam Search (RLCBS) for inference-time refinement in optimization problems. Our results demonstrate that RLCBS outperforms NSGA-II under complex design constraints on drying module configurations at inference time.
arXiv Detail & Related papers (2025-01-21T23:16:19Z) - Resilient Constrained Reinforcement Learning [87.4374430686956]
We study a class of constrained reinforcement learning (RL) problems in which the constraint specifications are not fully identified before training.
It is challenging to identify appropriate constraint specifications because the trade-off between the reward training objective and constraint satisfaction is not defined in advance.
We propose a new constrained RL approach that searches for policy and constraint specifications together.
arXiv Detail & Related papers (2023-12-28T18:28:23Z) - Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe for converting static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z) - A Constraint Enforcement Deep Reinforcement Learning Framework for Optimal Energy Storage Systems Dispatch [0.0]
The optimal dispatch of energy storage systems (ESSs) presents formidable challenges due to fluctuations in dynamic prices, demand consumption, and renewable-based energy generation.
By exploiting the generalization capabilities of deep neural networks (DNNs), deep reinforcement learning (DRL) algorithms can learn good-quality control models that adaptively respond to distribution networks' nature.
We propose a DRL framework that effectively handles continuous action spaces while strictly enforcing the environment's and action space's operational constraints during online operation.
arXiv Detail & Related papers (2023-07-26T17:12:04Z) - Neural Fields with Hard Constraints of Arbitrary Differential Order [61.49418682745144]
We develop a series of approaches for enforcing hard constraints on neural fields.
The constraints can be specified as a linear operator applied to the neural field and its derivatives.
Our approaches are demonstrated in a wide range of real-world applications.
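The constraint form described above can be illustrated in notation of my own choosing (the symbols below are illustrative, not the paper's):

```latex
% A neural field f_\theta subject to m hard linear constraints,
% where B is a linear operator acting on the field and its derivatives:
B[f_\theta](x_i) = b_i, \quad i = 1, \dots, m,
\qquad B[f] = a_0\, f + a_1\, \nabla f + a_2\, \Delta f
```

For example, a zeroth-order constraint fixes a field value, `f_θ(x₀) = c₀`, while a first-order constraint fixes a derivative, as in a Neumann boundary condition.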
arXiv Detail & Related papers (2023-06-15T08:33:52Z) - Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints.
Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z) - Coordinated Online Learning for Multi-Agent Systems with Coupled Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
arXiv Detail & Related papers (2020-10-21T10:11:17Z) - Green Offloading in Fog-Assisted IoT Systems: An Online Perspective Integrating Learning and Control [20.68436820937947]
In fog-assisted IoT systems, it is common practice to offload tasks from IoT devices to nearby fog nodes to reduce task processing latency and energy consumption.
In this paper, we formulate such a task offloading problem with unknown system dynamics as a constrained multi-armed bandit (CMAB) problem with long-term constraints on time-averaged energy consumption.
Through an effective integration of online learning and online control, we propose a Learning-Aided Green Offloading (LAGO) scheme.
arXiv Detail & Related papers (2020-08-01T07:27:24Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in optimal yet physically feasible robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.