Related papers: Learning in Strategic Queuing Systems with Small Buffers

URL: http://arxiv.org/abs/2502.08898v1
Date: Thu, 13 Feb 2025 02:23:23 GMT
Title: Learning in Strategic Queuing Systems with Small Buffers
Authors: Ariana Abel, Yoav Kolumbus, Jeronimo Martin Duque, Eva Tardos
Abstract summary: We show that when queues are learning, a small constant factor increase in server capacity, compared to what would be needed if centrally coordinating, suffices to keep the system stable. This work contributes to the growing literature on the impact of selfish learning in systems with carryover effects between rounds.
Score: 3.6480791907166306
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Routers in networking use simple learning algorithms to find the best way to deliver packets to their desired destination. This simple, myopic and distributed decision system makes large queuing systems simple to operate, but at the same time, the system needs more capacity than would be required if all traffic were centrally coordinated. In a recent paper, Gaitonde and Tardos (EC 2020 and JACM 2023) initiate the study of such systems, modeling them as an infinitely repeated game in which routers compete for servers and the system maintains a state (number of packets held by each queue) resulting from outcomes of previous rounds. Queues get to send a packet at each step to one of the servers, and servers attempt to process only one of the arriving packets, modeling routers. However, their model assumes that servers have no buffers at all, so queues have to resend all packets that were not served successfully. They show that, even with hugely increased server capacity relative to what is needed in the centrally-coordinated case, ensuring that the system is stable requires using timestamps and priority for older packets. We consider a system with two important changes, which make the model more realistic: first we add a very small buffer to each server, allowing it to hold on to a single packet to be served later (even if it fails to serve it); and second, we do not require timestamps or priority for older packets. Our main result is to show that when queues are learning, a small constant factor increase in server capacity, compared to what would be needed if centrally coordinating, suffices to keep the system stable, even if servers select randomly among packets arriving simultaneously. This work contributes to the growing literature on the impact of selfish learning in systems with carryover effects between rounds: when outcomes in the present round affect the game in the future.
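
As a rough illustration of the round-by-round dynamics the abstract describes, the following Python sketch simulates queues that each send one packet per round to servers holding a single-packet buffer. It is a sketch under stated assumptions, not the paper's model: the per-round arrival and service probabilities, the uniformly random server choice (a placeholder for the queues' learning rule), the rule that a server with a full buffer rejects all arriving packets, and the function name `simulate` are all hypothetical choices made for illustration.

```python
import random


def simulate(arrival, service, rounds, seed=0):
    """Simulate queues sending packets to servers with one-packet buffers.

    arrival[i]: assumed probability that queue i receives a packet each round.
    service[j]: assumed probability that server j serves its buffered packet.
    Returns the list of queue lengths after `rounds` rounds.
    """
    rng = random.Random(seed)
    n, m = len(arrival), len(service)
    queue_len = [0] * n          # packets waiting at each queue
    buffer_full = [False] * m    # whether each server's single slot is occupied

    for _ in range(rounds):
        # New packets arrive at the queues.
        for i in range(n):
            if rng.random() < arrival[i]:
                queue_len[i] += 1

        # Each non-empty queue sends one packet to a server.
        # Uniform random choice is a placeholder for a learning rule.
        senders = [[] for _ in range(m)]
        for i in range(n):
            if queue_len[i] > 0:
                senders[rng.randrange(m)].append(i)

        for j in range(m):
            # An empty buffer accepts one arriving packet chosen at random;
            # every other packet stays with its queue and is resent later.
            if senders[j] and not buffer_full[j]:
                winner = rng.choice(senders[j])
                queue_len[winner] -= 1
                buffer_full[j] = True
            # The server tries to process the buffered packet; on failure
            # the packet stays in the buffer for a later round.
            if buffer_full[j] and rng.random() < service[j]:
                buffer_full[j] = False

    return queue_len
```

Calling `simulate([0.3, 0.3], [0.5, 0.5], 10_000)` with different capacity settings gives a quick, informal way to see whether queue lengths stay bounded or grow over time under these assumed dynamics.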

Related papers

CycleSL: Server-Client Cyclical Update Driven Scalable Split Learning [60.59553507555341]
We introduce CycleSL, a novel aggregation-free split learning framework. Inspired by alternating block coordinate descent, CycleSL treats server-side training as an independent higher-level machine learning task. Our empirical findings highlight the effectiveness of CycleSL in enhancing model performance.
arXiv Detail & Related papers (2025-11-23T21:00:21Z)
SeSeMI: Secure Serverless Model Inference on Sensitive Data [14.820151992047089]
Existing cloud-based model inference systems are costly, not easy to scale, and must be trusted in handling the models and user request data. Our goal is to design a serverless model inference system that protects models and user request data from untrusted cloud providers. We present SeSeMI, a secure, efficient, and cost-effective serverless model inference system.
arXiv Detail & Related papers (2024-12-16T10:37:30Z)
Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients [4.6792910030704515]
Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively through synchronously exchanging the intermediate model weights with a single server. The scalability of such FL systems can be limited by two factors: server idle time due to synchronous communication and the risk of a single server becoming the bottleneck. We propose a new FL architecture that is entirely asynchronous, and therefore addresses these two limitations simultaneously.
arXiv Detail & Related papers (2024-06-03T15:29:46Z)
Communication Efficient ConFederated Learning: An Event-Triggered SAGA Approach [67.27031215756121]
Federated learning (FL) is a machine learning paradigm that targets model training without gathering the local data over various data sources. Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability. In this work, we consider a multi-server FL framework, referred to as Confederated Learning (CFL), in order to accommodate a larger number of users.
arXiv Detail & Related papers (2024-02-28T03:27:10Z)
RelayAttention for Efficient Large Language Model Serving with Long System Prompts [59.50256661158862]
This paper aims to improve the efficiency of LLM services that involve long system prompts. Handling these system prompts requires heavily redundant memory accesses in existing causal attention algorithms. We propose RelayAttention, an attention algorithm that allows reading hidden states from DRAM exactly once for a batch of input tokens.
arXiv Detail & Related papers (2024-02-22T18:58:28Z)
Efficient Reinforcement Learning for Routing Jobs in Heterogeneous Queueing Systems [21.944723061337267]
We consider the problem of efficiently routing jobs that arrive into a central queue to a system of heterogeneous servers. Unlike homogeneous systems, a threshold policy, that routes jobs to the slow server(s) when the queue length exceeds a certain threshold, is known to be optimal for the one-fast-one-slow two-server system. We propose ACHQ, an efficient policy gradient based algorithm with a low dimensional soft threshold policy parameterization.
arXiv Detail & Related papers (2024-02-02T05:22:41Z)
Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems. We propose SmartFL, a generic approach that optimizes the server-side aggregation process. We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
arXiv Detail & Related papers (2022-11-10T13:20:56Z)
Learning While Scheduling in Multi-Server Systems with Unknown Statistics: MaxWeight with Discounted UCB [18.898514227870926]
This paper considers a multi-server system with multiple servers and multiple types of jobs, where different job types require different amounts of processing time at different servers. The goal is to schedule jobs on servers without knowing the statistics of the processing times. We propose a new algorithm, which combines the MaxWeight scheduling policy with discounted upper confidence bound (UCB) to simultaneously learn statistics and schedule jobs to servers.
arXiv Detail & Related papers (2022-09-02T15:37:02Z)
Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
Tailored Learning-Based Scheduling for Kubernetes-Oriented Edge-Cloud System [54.588242387136376]
We introduce KaiS, a learning-based scheduling framework for edge-cloud systems. First, we design a coordinated multi-agent actor-critic algorithm to cater to decentralized request dispatch. Second, for diverse system scales and structures, we use graph neural networks to embed system state information. Third, we adopt a two-time-scale scheduling mechanism to harmonize request dispatch and service orchestration.
arXiv Detail & Related papers (2021-01-17T03:45:25Z)
Domain-specific Communication Optimization for Distributed DNN Training [10.781867496460837]
We present DLCP, a novel solution exploiting the domain-specific properties of deep learning to optimize communication overhead of DNN training in a fine-grained manner. It exploits the bounded loss tolerance of SGD-based training to improve tail communication latency, which cannot be avoided purely through gradient compression. It then performs fine-grained packet-level prioritization and dropping, as opposed to flow-level scheduling, based on layers and magnitudes of gradients to further speed up model convergence without affecting accuracy.
arXiv Detail & Related papers (2020-08-16T09:53:21Z)
Superiority of Simplicity: A Lightweight Model for Network Device Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations. It consists of a heterogeneous ensemble method composed of two models: a neural network and a mean predictor. It achieves an overall $R^2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z)
Stability and Learning in Strategic Queuing Systems [0.0]
We study the phenomenon in the context of a game modeling queuing systems. Routers compete for servers, where packets that do not get service will be resent in future rounds. This paper is the first to study the effect of selfish learning in a queuing system.
arXiv Detail & Related papers (2020-03-16T03:59:00Z)
Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models. This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
Taurus: A Data Plane Architecture for Per-Packet ML [59.1343317736213]
We present the design and implementation of Taurus, a data plane for line-rate inference. Our evaluation of a Taurus switch ASIC shows that Taurus operates orders of magnitude faster than a server-based control plane.
arXiv Detail & Related papers (2020-02-12T09:18:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.