DreamShard: Generalizable Embedding Table Placement for Recommender
Systems
- URL: http://arxiv.org/abs/2210.02023v1
- Date: Wed, 5 Oct 2022 05:12:02 GMT
- Title: DreamShard: Generalizable Embedding Table Placement for Recommender
Systems
- Authors: Daochen Zha, Louis Feng, Qiaoyu Tan, Zirui Liu, Kwei-Herng Lai,
Bhargav Bhushanam, Yuandong Tian, Arun Kejariwal, Xia Hu
- Abstract summary: We present a reinforcement learning (RL) approach for embedding table placement.
DreamShard reasons about operation fusion and generalizes to unseen placement tasks.
Experiments show that DreamShard substantially outperforms the existing human expert and RNN-based strategies.
- Score: 62.444159500899566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study embedding table placement for distributed recommender systems, which
aims to partition and place the tables on multiple hardware devices (e.g.,
GPUs) to balance the computation and communication costs. Although prior work
has explored learning-based approaches for the device placement of
computational graphs, embedding table placement remains a challenging
problem because of 1) the operation fusion of embedding tables, and 2) the
generalizability requirement on unseen placement tasks with different numbers
of tables and/or devices. To this end, we present DreamShard, a reinforcement
learning (RL) approach for embedding table placement. DreamShard achieves the
reasoning of operation fusion and generalizability with 1) a cost network to
directly predict the costs of the fused operation, and 2) a policy network that
is efficiently trained on an estimated Markov decision process (MDP) without
real GPU execution, where the states and the rewards are estimated with the
cost network. Equipped with sum and max representation reductions, the two
networks can directly generalize to any unseen tasks with different numbers of
tables and/or devices without fine-tuning. Extensive experiments show that
DreamShard substantially outperforms the existing human expert and RNN-based
strategies with up to 19% speedup over the strongest baseline on large-scale
synthetic tables and our production tables. The code is available at
https://github.com/daochenzha/dreamshard
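The abstract describes two learned components: a cost network that predicts the cost of fused embedding operations, and a policy network trained on an estimated MDP whose states and rewards come from the cost network, with sum and max reductions making both networks size-invariant. The sketch below illustrates only the *shape* of that idea, not DreamShard's actual implementation: `device_state`, `estimated_cost`, and the greedy `place_tables` are hypothetical stand-ins (the real system learns both networks and trains an RL policy; see the repository above).

```python
def device_state(table_feats):
    """Permutation-invariant device state via sum and max reductions.

    Pooling over per-table feature vectors yields a fixed-length
    representation regardless of how many tables a device holds,
    which is what lets the networks generalize to unseen numbers
    of tables and devices.
    """
    dim = len(table_feats[0]) if table_feats else 4
    if not table_feats:
        return [0.0] * (2 * dim)
    sums = [sum(f[i] for f in table_feats) for i in range(dim)]
    maxs = [max(f[i] for f in table_feats) for i in range(dim)]
    return sums + maxs


def estimated_cost(table_feats):
    """Stand-in for the learned cost network.

    Here: total of a hypothetical compute-cost feature (index 0),
    plus the max of a communication-cost feature (index 1), as a
    crude proxy for the cost of the fused operation on one device.
    """
    compute = sum(f[0] for f in table_feats)
    comm = max(f[1] for f in table_feats)
    return compute + comm


def place_tables(tables, num_devices):
    """Greedy placement on the estimated MDP: no GPU execution,
    every decision is scored with the cost model alone."""
    placement = [[] for _ in range(num_devices)]
    for t in tables:
        # Put the table on the device whose estimated cost after
        # the assignment stays lowest, balancing the devices.
        best = min(range(num_devices),
                   key=lambda d: estimated_cost(placement[d] + [t]))
        placement[best].append(t)
    return placement


# Four tables with made-up [compute, comm, size, lookup-rate] features.
tables = [[3.0, 1.0, 0.5, 0.2], [2.0, 2.0, 0.1, 0.3],
          [1.0, 0.5, 0.7, 0.1], [4.0, 1.5, 0.2, 0.4]]
placement = place_tables(tables, 2)
```

Because every placement decision is scored by the cost model rather than by running on real GPUs, the expensive part of training (hardware execution) is taken off the critical path, which is the point of the estimated MDP.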
Related papers
- GTR: Graph-Table-RAG for Cross-Table Question Answering [53.11230952572134]
We propose the first Graph-Table-RAG framework, namely GTR, which reorganizes table corpora into a heterogeneous graph.
GTR exhibits superior cross-table question-answering performance while maintaining high deployment efficiency, demonstrating its real-world practical applicability.
arXiv Detail & Related papers (2025-04-02T04:24:41Z)
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve a 1.45-9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
- TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to the new state-of-the-art performance on large-scale table understanding.
arXiv Detail & Related papers (2024-10-07T04:15:02Z)
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability [59.72892401927283]
We evaluate the planning capabilities of OpenAI's o1 models across a variety of benchmark tasks.
Our results reveal that o1-preview outperforms GPT-4 in adhering to task constraints.
arXiv Detail & Related papers (2024-09-30T03:58:43Z)
- TablePuppet: A Generic Framework for Relational Federated Learning [27.274856376963356]
Current federated learning (FL) approaches view decentralized training data as a single table, divided among participants either horizontally (by rows) or vertically (by columns).
This scenario requires intricate operations like joins and unions to obtain the training data, which are either costly or restricted by privacy concerns.
We propose TablePuppet, a generic framework for relational federated learning (RFL) that decomposes the learning process into two steps: (1) learning over join (LoJ) followed by (2) learning over union (LoU).
arXiv Detail & Related papers (2024-03-23T13:28:37Z)
- Stochastic Configuration Machines: FPGA Implementation [4.57421617811378]
Stochastic configuration networks (SCNs) are a prime choice in industrial applications due to their merits and feasibility for data modelling.
This paper aims to implement SCM models on a field programmable gate array (FPGA) and introduce binary-coded inputs to improve learning performance.
arXiv Detail & Related papers (2023-10-30T02:04:20Z)
- Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models [56.65200574282804]
We propose a "pre-train, and search" paradigm for efficient sharding.
NeuroShard pre-trains neural cost models on augmented tables to cover various sharding scenarios.
NeuroShard significantly and consistently outperforms the state-of-the-art on the benchmark sharding dataset.
arXiv Detail & Related papers (2023-05-03T02:52:03Z)
- AutoShard: Automated Embedding Table Sharding for Recommender Systems [54.82606459574231]
We introduce our novel practice in Meta, namely AutoShard, which uses a neural cost model to directly predict the multi-table costs.
AutoShard can efficiently shard hundreds of tables in seconds.
Our algorithms have been deployed in Meta production environment.
arXiv Detail & Related papers (2022-08-12T17:48:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.