Age Optimal Sampling for Unreliable Channels under Unknown Channel Statistics
- URL: http://arxiv.org/abs/2412.18119v1
- Date: Tue, 24 Dec 2024 03:06:22 GMT
- Title: Age Optimal Sampling for Unreliable Channels under Unknown Channel Statistics
- Authors: Hongyi He, Haoyue Tang, Jiayu Pan, Jintao Wang, Jian Song, Leandros Tassiulas,
- Abstract summary: We study a system in which a sensor forwards status updates to a receiver through an errorprone channel, while the receiver sends the transmission results back to the sensor via a reliable channel.
To evaluate the timeliness of the status information at the receiver, we use the Age of Information metric.
We propose a Robbins-Monro algorithm to solve this problem and demonstrate that the optimal threshold can be approximated almost surely.
- Score: 25.04993246830622
- License:
- Abstract: In this paper, we study a system in which a sensor forwards status updates to a receiver through an error-prone channel, while the receiver sends the transmission results back to the sensor via a reliable channel. Both channels are subject to random delays. To evaluate the timeliness of the status information at the receiver, we use the Age of Information (AoI) metric. The objective is to design a sampling policy that minimizes the expected time-average AoI, even when the channel statistics (e.g., delay distributions) are unknown. We first review the threshold structure of the optimal offline policy under known channel statistics and then reformulate the design of the online algorithm as a stochastic approximation problem. We propose a Robbins-Monro algorithm to solve this problem and demonstrate that the optimal threshold can be approximated almost surely. Moreover, we prove that the cumulative AoI regret of the online algorithm increases with rate $\mathcal{O}(\ln K)$, where $K$ is the number of successful transmissions. In addition, our algorithm is shown to be minimax order optimal, in the sense that for any online learning algorithm, the cumulative AoI regret up to the $K$-th successful transmissions grows with the rate at least $\Omega(\ln K)$ in the worst case delay distribution. Finally, we improve the stability of the proposed online learning algorithm through a momentum-based stochastic gradient descent algorithm. Simulation results validate the performance of our proposed algorithm.
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Learning-augmented Online Minimization of Age of Information and
Transmission Costs [24.873041306990288]
We develop an online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee.
While online algorithms are robust, they are usually overly conservative and may have a poor average performance in typical scenarios.
We show that our learning-augmented algorithm achieves both consistency and robustness.
arXiv Detail & Related papers (2024-03-05T01:06:25Z) - Posterior Sampling with Delayed Feedback for Reinforcement Learning with
Linear Function Approximation [62.969796245827006]
Delayed-PSVI is an optimistic value-based algorithm that explores the value function space via noise perturbation with posterior sampling.
We show our algorithm achieves $widetildeO(sqrtd3H3 T + d2H2 E[tau]$ worst-case regret in the presence of unknown delays.
We incorporate a gradient-based approximate sampling scheme via Langevin dynamics for Delayed-LPSVI.
arXiv Detail & Related papers (2023-10-29T06:12:43Z) - Provably Efficient Neural Offline Reinforcement Learning via Perturbed
Rewards [33.88533898709351]
VIPeR amalgamates the randomized value function idea with the pessimism principle.
It implicitly obtains pessimism by simply perturbing the offline data multiple times.
It is both provably and computationally efficient in general Markov decision processes (MDPs) with neural network function approximation.
arXiv Detail & Related papers (2023-02-24T17:52:12Z) - Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem.
We identify in particular the existence of a statistical-to-computational gap where known algorithms require a signal-to-noise ratio bigger than the information-theoretic threshold to perform better than random.
arXiv Detail & Related papers (2023-02-13T19:57:17Z) - Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z) - Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z) - Channel Estimation via Successive Denoising in MIMO OFDM Systems: A Reinforcement Learning Approach [23.57305243608369]
We present a frequency-domain denoising method based on a reinforcement learning framework.
Our algorithm provides a significant improvement over the practical least squares (LS) estimation method.
arXiv Detail & Related papers (2021-01-25T18:33:54Z) - Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for
Wireless Networks with Stochastic Arrivals [24.897224064639406]
We consider a single-hop wireless network with sources transmitting time-sensitive information over unreliable channels.
At every time slot, the learning algorithm selects a single pair (source, channel) and the selected source attempts to transmit its packet via the selected channel.
The goal of the learning algorithm is to minimize the Age-of-Information (AoI) in the network over $T$ time slots.
arXiv Detail & Related papers (2020-12-16T00:58:26Z) - Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z) - Channel Assignment in Uplink Wireless Communication using Machine
Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems.
Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints.
Due to high computational complexity, machine learning approaches are employed to obtain computational efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.