Learning-based Scheduling for Information Accuracy and Freshness in
Wireless Networks
- URL: http://arxiv.org/abs/2310.15705v1
- Date: Tue, 24 Oct 2023 10:31:34 GMT
- Title: Learning-based Scheduling for Information Accuracy and Freshness in
Wireless Networks
- Authors: Hitesh Gudwani
- Abstract summary: We consider a system of multiple sources, a single communication channel, and a single monitoring station.
The probability of correct measurement and the probability of successful transmission of all the sources are unknown to the scheduler.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a system of multiple sources, a single communication channel, and
a single monitoring station. Each source measures a time-varying quantity with
varying levels of accuracy and one of them sends its update to the monitoring
station via the channel. The probability of success of each attempted
communication is a function of the source scheduled for transmitting its
update. Both the probability of correct measurement and the probability of
successful transmission of all the sources are unknown to the scheduler. The
metric of interest is the reward received by the system which depends on the
accuracy of the last update received by the destination and the
Age-of-Information (AoI) of the system. We model our scheduling problem as a
variant of the multi-arm bandit problem with sources as different arms. We
compare, via simulations, four standard bandit policies adapted to our system
model: Explore-Then-Commit (ETC), $\epsilon$-greedy, Upper Confidence Bound
(UCB), and Thompson Sampling (TS). In addition, we provide analytical
guarantees for two of these policies, ETC and $\epsilon$-greedy. Finally, we
characterize a lower bound on the cumulative regret achievable by any policy.
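To make the bandit formulation concrete, here is a minimal Python sketch of an $\epsilon$-greedy scheduler with sources as arms. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the reward function, so the sketch assumes a per-slot reward of 1 when the scheduled source's update is both delivered and correct, and tracks AoI only as a diagnostic. The function name `epsilon_greedy_schedule` and all parameter names are hypothetical.

```python
import random


def epsilon_greedy_schedule(accuracy, success, horizon, epsilon=0.1, seed=0):
    """Schedule one source per slot; return (pulls, mean_aoi).

    accuracy[i]: probability that source i measures correctly.
    success[i]:  probability that source i's transmission succeeds.
    Both are used only to simulate outcomes; the scheduler never reads them.
    """
    rng = random.Random(seed)
    n = len(accuracy)
    pulls = [0] * n          # number of slots each source was scheduled
    mean_reward = [0.0] * n  # empirical mean of the ASSUMED per-slot reward
    aoi, aoi_sum = 1, 0      # age of the last delivered update, in slots

    for t in range(horizon):
        if t < n:
            arm = t                      # try every source once
        elif rng.random() < epsilon:
            arm = rng.randrange(n)       # explore uniformly at random
        else:
            arm = max(range(n), key=lambda i: mean_reward[i])  # exploit

        delivered = rng.random() < success[arm]
        correct = rng.random() < accuracy[arm]
        # Assumed reward: 1 only for a delivered AND correct update.
        reward = 1.0 if (delivered and correct) else 0.0

        pulls[arm] += 1
        mean_reward[arm] += (reward - mean_reward[arm]) / pulls[arm]

        # AoI resets to 1 on any delivery, otherwise grows by one slot.
        aoi = 1 if delivered else aoi + 1
        aoi_sum += aoi

    return pulls, aoi_sum / horizon


if __name__ == "__main__":
    pulls, mean_aoi = epsilon_greedy_schedule(
        accuracy=[0.1, 0.95, 0.3], success=[0.2, 0.95, 0.4], horizon=5000
    )
    print("pulls per source:", pulls)
    print("mean AoI:", mean_aoi)
```

With a fixed $\epsilon$ the scheduler keeps exploring at a constant rate, so its cumulative regret grows linearly in the horizon; a decaying exploration rate is the usual way to avoid that, and it would replace the constant `epsilon` in this sketch.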
Related papers
- Generalized Differentiable RANSAC [95.95627475224231]
$\nabla$-RANSAC is a differentiable RANSAC that allows learning the entire randomized robust estimation pipeline.
$\nabla$-RANSAC is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives.
arXiv Detail & Related papers (2022-12-26T15:13:13Z)
- Learning a Discrete Set of Optimal Allocation Rules in a Queueing System
with Unknown Service Rate [1.4094389874355762]
We study admission control for a system with unknown arrival and service rates.
In our model, at every job arrival, a dispatcher decides to assign the job to an available server or block it.
Our goal is to design a dispatching policy that maximizes the long-term average reward for the dispatcher.
arXiv Detail & Related papers (2022-02-04T22:39:03Z)
- Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates [110.92598350897192]
Q-Learning has proven effective at learning a policy to perform control tasks.
Estimation noise becomes a bias after the max operator in the policy improvement step.
We present Unbiased Soft Q-Learning (UQL), which extends the work of EQL from two action, finite state spaces to multi-action, infinite state Markov Decision Processes.
arXiv Detail & Related papers (2021-10-28T00:07:19Z)
- Sampling-Based Robust Control of Autonomous Systems with Non-Gaussian
Noise [59.47042225257565]
We present a novel planning method that does not rely on any explicit representation of the noise distributions.
First, we abstract the continuous system into a discrete-state model that captures noise by probabilistic transitions between states.
We capture these bounds in the transition probability intervals of a so-called interval Markov decision process (iMDP).
arXiv Detail & Related papers (2021-10-25T06:18:55Z)
- Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret Analysis of Distributed Online LQR Control for Unknown LTI
Systems [8.832969171530056]
We study the distributed online linear quadratic regulator (LQR) problem for linear time-invariant (LTI) systems with unknown dynamics.
We propose a distributed variant of the online LQR algorithm where each agent computes its system estimate during an exploration stage.
We prove that our proposed algorithm scales $\tilde{O}(T^{2/3})$, implying the consensus of the network over time.
arXiv Detail & Related papers (2021-05-15T23:02:58Z)
- A Reinforcement Learning Approach to Age of Information in Multi-User
Networks with HARQ [1.5469452301122177]
Scheduling the transmission of time-sensitive information from a source node to multiple users over error-prone communication channels is studied.
A long-term average resource constraint is imposed on the source, which limits the average number of transmissions.
arXiv Detail & Related papers (2021-02-19T07:30:44Z)
- Distributed Q-Learning with State Tracking for Multi-agent Networked
Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z)
- Distributed Bandits: Probabilistic Communication on $d$-regular Graphs [5.33024001730262]
We study the decentralized multi-agent multi-armed bandit problem for agents that communicate with probability over a network defined by a $d$-regular graph.
We propose a new Upper Confidence Bound (UCB) based algorithm and analyze how agent-based strategies contribute to minimizing group regret.
arXiv Detail & Related papers (2020-11-16T04:53:54Z)
- Superiority of Simplicity: A Lightweight Model for Network Device
Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations.
It consists of a heterogeneous ensemble method composed of two models - a neural network and a mean predictor.
It achieves an overall $R^2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z)
- Learning Algorithms for Minimizing Queue Length Regret [5.8010446129208155]
Packets randomly arrive to a transmitter's queue and wait to be successfully sent to the receiver.
The transmitter's objective is to quickly identify the best channel to minimize the number of packets in the queue over $T$ time slots.
We show that there exists a set of queue-length based policies that can obtain order optimal $O(1)$ queue length regret.
arXiv Detail & Related papers (2020-05-11T15:50:56Z)