Computation Resource Allocation Solution in Recommender Systems
- URL: http://arxiv.org/abs/2103.02259v1
- Date: Wed, 3 Mar 2021 08:41:43 GMT
- Title: Computation Resource Allocation Solution in Recommender Systems
- Authors: Xun Yang, Yunli Wang, Cheng Chen, Qing Tan, Chuan Yu, Jian Xu, Xiaoqiang Zhu
- Abstract summary: We propose a computation resource allocation solution (CRAS) that maximizes the business goal with limited computation resources and response time.
The effectiveness of our method is verified by extensive experiments based on the real dataset from Taobao.com.
- Score: 19.456109814747048
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recommender systems rely heavily on increasing computation resources to
improve their business goal. By deploying computation-intensive models and
algorithms, these systems are able to infer user interests and exhibit
certain ads or commodities from the candidate set to maximize their business
goals. However, such systems are facing two challenges in achieving their
goals. On the one hand, facing massive online requests, computation-intensive
models and algorithms are pushing their computation resources to the limit. On
the other hand, the response time of these systems is strictly limited to a
short period, e.g. 300 milliseconds in our real system, which is also being
exhausted by the increasingly complex models and algorithms.
In this paper, we propose a computation resource allocation solution (CRAS)
that maximizes the business goal with limited computation resources and
response time. We illustrate the problem comprehensively and formulate it as
an optimization problem with multiple constraints, which can be
broken down into independent sub-problems. To solve the sub-problems, we
propose a revenue function to facilitate the theoretical analysis, and obtain
the optimal computation resource allocation strategy. To address
applicability issues, we devise a feedback control system to help our
strategy constantly adapt to the changing online environment. The effectiveness
of our method is verified by extensive experiments based on the real dataset
from Taobao.com. We also deploy our method in the display advertising system of
Alibaba. The online results show that our computation resource allocation
solution achieves significant business goal improvement without any increase
in computation cost, which demonstrates the efficacy of our method in real
industrial practice.
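The decomposition and feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation, which this page does not reproduce. It assumes a standard Lagrangian relaxation: each request picks the action that maximizes revenue minus a price `lam` times its computation cost, so requests can be solved independently, and a simple controller raises `lam` while total cost exceeds the budget. The names `allocate`, `total_cost`, `tune_lambda`, and all parameters are hypothetical.

```python
from typing import List, Sequence


def allocate(revenue: Sequence[Sequence[float]],
             cost: Sequence[Sequence[float]],
             lam: float) -> List[int]:
    """Solve each request independently: pick the action index j that
    maximizes revenue[i][j] - lam * cost[i][j] (assumed Lagrangian form)."""
    choices = []
    for rev_i, cost_i in zip(revenue, cost):
        best = max(range(len(rev_i)),
                   key=lambda j: rev_i[j] - lam * cost_i[j])
        choices.append(best)
    return choices


def total_cost(cost: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Computation cost consumed by the chosen actions."""
    return sum(c[j] for c, j in zip(cost, choices))


def tune_lambda(revenue: Sequence[Sequence[float]],
                cost: Sequence[Sequence[float]],
                budget: float,
                lam: float = 0.0,
                step: float = 0.05,
                iters: int = 200) -> float:
    """Hypothetical feedback loop: raise the price when the allocation is
    over budget, lower it when under, and never let it go negative."""
    for _ in range(iters):
        choices = allocate(revenue, cost, lam)
        error = total_cost(cost, choices) - budget
        lam = max(0.0, lam + step * error / budget)
    return lam
```

With `lam = 0` every request takes its highest-revenue action regardless of cost; as `lam` grows, requests fall back to cheaper actions. In a live system the controller would presumably react to measured load rather than to a fixed batch of requests.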
Related papers
- Reinforcement Learning for Adaptive Resource Scheduling in Complex System Environments [8.315191578007857]
This study presents a novel computer system performance optimization and adaptive workload management scheduling algorithm based on Q-learning.
Q-learning, a reinforcement learning algorithm, continuously learns from system state changes, enabling dynamic scheduling and resource optimization.
This research provides a foundation for the integration of AI-driven adaptive scheduling in future large-scale systems, offering a scalable, intelligent solution to enhance system performance, reduce operating costs, and support sustainable energy consumption.
arXiv Detail & Related papers (2024-11-08T05:58:09Z)
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Multi-Resource Allocation for On-Device Distributed Federated Learning Systems [79.02994855744848]
This work poses a distributed multi-resource allocation scheme for minimizing the weighted sum of latency and energy consumption in the on-device distributed federated learning (FL) system.
Each mobile device in the system engages the model training process within the specified area and allocates its computation and communication resources for deriving and uploading parameters, respectively.
arXiv Detail & Related papers (2022-11-01T14:16:05Z)
- Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
- Evolutionary Optimization for Proactive and Dynamic Computing Resource Allocation in Open Radio Access Network [4.9711284100869815]
Intelligent techniques are needed to automate the allocation of computing resources in Open Radio Access Network (O-RAN).
The existing formulation of this resource allocation problem is unsuitable because it defines the capacity utility of resources inappropriately.
A new formulation that better describes the problem is proposed.
arXiv Detail & Related papers (2022-01-12T08:52:04Z)
- Dynamic neighbourhood optimisation for task allocation using multi-agent [0.0]
In large-scale systems there are challenges when centralised techniques are used for task allocation.
This paper presents four algorithms to solve these problems.
It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted.
arXiv Detail & Related papers (2021-02-16T17:49:14Z)
- The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems [7.433931244705934]
We consider a data-driven setting in which the reward and resource consumption of each request are generated using an input model unknown to the decision maker.
We design a general class of algorithms that attain good performance in various input models without knowing which type of input they are facing.
Our algorithms operate in the Lagrangian dual space: they maintain a dual multiplier for each resource that is updated using online mirror descent.
arXiv Detail & Related papers (2020-11-18T18:39:17Z)
- Resource Allocation via Model-Free Deep Learning in Free Space Optical Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Regularized Online Allocation Problems: Fairness and Beyond [7.433931244705934]
We introduce the regularized online allocation problem, a variant that includes a non-linear regularizer acting on the total resource consumption.
In this problem, requests repeatedly arrive over time and, for each request, a decision maker needs to take an action that generates a reward and consumes resources.
The objective is to simultaneously maximize additively separable rewards and the value of a non-separable regularizer subject to the resource constraints.
arXiv Detail & Related papers (2020-07-01T14:24:58Z)
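The dual-multiplier update mentioned in the dual mirror descent entry above can be sketched for a single resource. This is a simplified illustration, not that paper's algorithm: it assumes the Euclidean mirror map, under which online mirror descent reduces to a projected subgradient step, and it assumes action 0 is a zero-cost "skip" used once the budget runs out. The function name `online_allocate` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def online_allocate(rewards: Sequence[Sequence[float]],
                    costs: Sequence[Sequence[float]],
                    budget: float,
                    eta: float = 0.1) -> Tuple[List[int], float]:
    """Process requests one at a time; rewards[t][a] and costs[t][a] are the
    reward and resource consumption of action a on request t."""
    rho = budget / len(rewards)   # target spend per request
    mu = 0.0                      # dual multiplier, i.e. the resource price
    remaining = budget
    decisions: List[int] = []
    for r_t, c_t in zip(rewards, costs):
        # Primal step: best action at the current price.
        a = max(range(len(r_t)), key=lambda j: r_t[j] - mu * c_t[j])
        if c_t[a] > remaining:
            a = 0                 # assumed zero-cost fallback action
        remaining -= c_t[a]
        decisions.append(a)
        # Dual step: price rises when spend exceeds the per-request target,
        # falls otherwise, and is projected back onto mu >= 0.
        mu = max(0.0, mu - eta * (rho - c_t[a]))
    return decisions, mu
```

Unlike a batch solve, the price here is updated after every request, so the allocation adapts without knowing future requests in advance.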