Related papers: Age Optimal Sampling for Unreliable Channels under Unknown Channel Statistics

Age Optimal Sampling for Unreliable Channels under Unknown Channel Statistics

URL: http://arxiv.org/abs/2412.18119v1
Date: Tue, 24 Dec 2024 03:06:22 GMT
Title: Age Optimal Sampling for Unreliable Channels under Unknown Channel Statistics
Authors: Hongyi He, Haoyue Tang, Jiayu Pan, Jintao Wang, Jian Song, Leandros Tassiulas,
Abstract summary: We study a system in which a sensor forwards status updates to a receiver through an errorprone channel, while the receiver sends the transmission results back to the sensor via a reliable channel.<n>To evaluate the timeliness of the status information at the receiver, we use the Age of Information metric.<n>We propose a Robbins-Monro algorithm to solve this problem and demonstrate that the optimal threshold can be approximated almost surely.
Score: 25.04993246830622
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In this paper, we study a system in which a sensor forwards status updates to a receiver through an error-prone channel, while the receiver sends the transmission results back to the sensor via a reliable channel. Both channels are subject to random delays. To evaluate the timeliness of the status information at the receiver, we use the Age of Information (AoI) metric. The objective is to design a sampling policy that minimizes the expected time-average AoI, even when the channel statistics (e.g., delay distributions) are unknown. We first review the threshold structure of the optimal offline policy under known channel statistics and then reformulate the design of the online algorithm as a stochastic approximation problem. We propose a Robbins-Monro algorithm to solve this problem and demonstrate that the optimal threshold can be approximated almost surely. Moreover, we prove that the cumulative AoI regret of the online algorithm increases with rate $\mathcal{O}(\ln K)$, where $K$ is the number of successful transmissions. In addition, our algorithm is shown to be minimax order optimal, in the sense that for any online learning algorithm, the cumulative AoI regret up to the $K$-th successful transmissions grows with the rate at least $\Omega(\ln K)$ in the worst case delay distribution. Finally, we improve the stability of the proposed online learning algorithm through a momentum-based stochastic gradient descent algorithm. Simulation results validate the performance of our proposed algorithm.

Related papers

Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
Semantic-Aware Remote Estimation of Multiple Markov Sources Under Constraints [9.514904359788156]
We exploit the emphsemantics of information and consider that the remote actuator has different tolerances for the estimation errors.<n>We find an optimal scheduling policy that minimizes the long-term textitstate-dependent costs of estimation errors under a transmission frequency constraint.
arXiv Detail & Related papers (2024-03-25T15:18:23Z)
Learning-augmented Online Minimization of Age of Information and Transmission Costs [24.873041306990288]
We develop an online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee. While online algorithms are robust, they are usually overly conservative and may have a poor average performance in typical scenarios. We show that our learning-augmented algorithm achieves both consistency and robustness.
arXiv Detail & Related papers (2024-03-05T01:06:25Z)
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation [62.969796245827006]
Delayed-PSVI is an optimistic value-based algorithm that explores the value function space via noise perturbation with posterior sampling. We show our algorithm achieves $widetildeO(sqrtd3H3 T + d2H2 E[tau]$ worst-case regret in the presence of unknown delays. We incorporate a gradient-based approximate sampling scheme via Langevin dynamics for Delayed-LPSVI.
arXiv Detail & Related papers (2023-10-29T06:12:43Z)
Non-stationary Delayed Online Convex Optimization: From Full-information to Bandit Setting [71.82716109461967]
We propose an algorithm called Mild-OGD for the full-information case, where delayed gradients are available.<n>We show that the dynamic regret of Mild-OGD can be automatically bounded by $O(sqrtbardT(P_T+1))$ under the in-order assumption.<n>We also develop a bandit variant of Mild-OGD for a more challenging case with only delayed loss values.
arXiv Detail & Related papers (2023-05-20T07:54:07Z)
Provably Efficient Neural Offline Reinforcement Learning via Perturbed Rewards [33.88533898709351]
VIPeR amalgamates the randomized value function idea with the pessimism principle. It implicitly obtains pessimism by simply perturbing the offline data multiple times. It is both provably and computationally efficient in general Markov decision processes (MDPs) with neural network function approximation.
arXiv Detail & Related papers (2023-02-24T17:52:12Z)
Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem. We identify in particular the existence of a statistical-to-computational gap where known algorithms require a signal-to-noise ratio bigger than the information-theoretic threshold to perform better than random.
arXiv Detail & Related papers (2023-02-13T19:57:17Z)
Event-Triggered Time-Varying Bayesian Optimization [47.30677525394649]
We propose an event-triggered algorithm that treats the optimization problem as static until it detects changes in the objective function and then resets the dataset. This allows the algorithm to adapt online to realized temporal changes without the need for exact prior knowledge. We derive regret bounds of adaptive resets without exact prior knowledge on the temporal changes, and show in numerical experiments that ET-GP-UCB outperforms state-of-the-art algorithms on both synthetic and real-world data.
arXiv Detail & Related papers (2022-08-23T07:50:52Z)
Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence. Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z)
Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
Channel Estimation via Successive Denoising in MIMO OFDM Systems: A Reinforcement Learning Approach [23.57305243608369]
We present a frequency-domain denoising method based on a reinforcement learning framework. Our algorithm provides a significant improvement over the practical least squares (LS) estimation method.
arXiv Detail & Related papers (2021-01-25T18:33:54Z)
Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for Wireless Networks with Stochastic Arrivals [24.897224064639406]
We consider a single-hop wireless network with sources transmitting time-sensitive information over unreliable channels. At every time slot, the learning algorithm selects a single pair (source, channel) and the selected source attempts to transmit its packet via the selected channel. The goal of the learning algorithm is to minimize the Age-of-Information (AoI) in the network over $T$ time slots.
arXiv Detail & Related papers (2020-12-16T00:58:26Z)
Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure. We develop an alternative method, which predicts the performance of algorithms given historical data. Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z)
Channel Assignment in Uplink Wireless Communication using Machine Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems. Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints. Due to high computational complexity, machine learning approaches are employed to obtain computational efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.