Fiber: A Platform for Efficient Development and Distributed Training for
Reinforcement Learning and Population-Based Methods
- URL: http://arxiv.org/abs/2003.11164v1
- Date: Wed, 25 Mar 2020 00:28:48 GMT
- Title: Fiber: A Platform for Efficient Development and Distributed Training for
Reinforcement Learning and Population-Based Methods
- Authors: Jiale Zhi, Rui Wang, Jeff Clune, Kenneth O. Stanley
- Abstract summary: Reinforcement learning (RL) and population-based methods pose unique challenges for efficiency and flexibility to the underlying distributed computing frameworks.
We introduce Fiber, a scalable distributed computing framework for RL and population-based methods.
- Score: 15.263746839710715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in machine learning are consistently enabled by increasing
amounts of computation. Reinforcement learning (RL) and population-based
methods in particular pose unique challenges for efficiency and flexibility to
the underlying distributed computing frameworks. These challenges include
frequent interaction with simulations, the need for dynamic scaling, and the
need for a user interface with low adoption cost and consistency across
different backends. In this paper we address these challenges while still
retaining development efficiency and flexibility for both research and
practical applications by introducing Fiber, a scalable distributed computing
framework for RL and population-based methods. Fiber aims to significantly
expand the accessibility of large-scale parallel computation to users of
otherwise complicated RL and population-based approaches without the need
for specialized computational expertise.
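The workload Fiber targets, namely many short parallel simulation rollouts driven by RL and population-based methods, is naturally expressed as a pool-style map. Below is a minimal sketch of that pattern, assuming Fiber exposes a multiprocessing-compatible Pool (fiber.Pool); the fallback import, the toy fitness function, and all names and parameters here are illustrative assumptions, not code from the paper.

```python
# Minimal sketch: evaluate a population of candidate parameter vectors in
# parallel with a pool-style map. `fiber.Pool` is ASSUMED to mirror the
# standard multiprocessing.Pool interface; the rollout below is a placeholder.
import numpy as np

try:
    from fiber import Pool  # assumed Fiber API (multiprocessing-style)
except ImportError:
    from multiprocessing import Pool  # local fallback for development

def evaluate(params):
    """Placeholder for one simulation rollout; returns a scalar fitness."""
    rng = np.random.default_rng()
    return float(np.dot(params, rng.standard_normal(params.shape)))

if __name__ == "__main__":
    population = [np.random.randn(16) for _ in range(64)]  # candidate solutions
    pool = Pool(processes=8)
    # Each worker evaluates one candidate; a population-based method
    # repeats this map once per generation.
    fitness = pool.map(evaluate, population)
    pool.close()
    pool.join()
    best = int(np.argmax(fitness))
    print(f"best candidate {best} with fitness {fitness[best]:.3f}")
```

Because the interface matches the standard library's, the same script can be developed on a laptop and then pointed at a cluster backend, which is the low-adoption-cost, backend-consistent user interface the abstract emphasizes.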
Related papers
- Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies [66.83950068218033]
Scaling Laws demonstrate that scaling model parameters and training data enhances learning performance. Despite its potential to improve performance, the integration of scaling laws into deep reinforcement learning has not been fully realized. This review addresses this gap by systematically analyzing scaling strategies in three dimensions: data, network, and training budget.
arXiv Detail & Related papers (2025-08-05T08:03:12Z) - Deploying Large AI Models on Resource-Limited Devices with Split Federated Learning [39.73152182572741]
This paper proposes a novel framework, named Quantized Split Federated Fine-Tuning Large AI Model (SFLAM)
By partitioning the training load between edge devices and servers, SFLAM can facilitate the operation of large models on devices.
SFLAM incorporates quantization management, power control, and bandwidth allocation strategies to enhance training efficiency.
arXiv Detail & Related papers (2025-04-12T07:55:11Z) - eFedLLM: Efficient LLM Inference Based on Federated Learning [1.6179784294541053]
Large Language Models (LLMs) herald a transformative era in artificial intelligence (AI)
This paper introduces an effective approach that enhances the operational efficiency and affordability of LLM inference.
arXiv Detail & Related papers (2024-11-24T22:50:02Z) - Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey [29.275928499337734]
Deep reinforcement learning has led to dramatic breakthroughs in the field of artificial intelligence.
As the amount of rollout experience data and the size of neural networks for deep reinforcement learning have grown, handling the training process and reducing the time consumption using parallel and distributed computing has become an urgent and essential need.
We perform a broad and thorough investigation on training acceleration methodologies for deep reinforcement learning based on parallel and distributed computing.
arXiv Detail & Related papers (2024-11-08T14:55:32Z) - CollaFuse: Collaborative Diffusion Models [5.331052581441263]
We introduce a novel approach for distributed collaborative diffusion models inspired by split learning.
Our approach facilitates collaborative training of diffusion models while alleviating client computational burdens during image synthesis.
arXiv Detail & Related papers (2024-06-20T15:54:21Z) - PeersimGym: An Environment for Solving the Task Offloading Problem with Reinforcement Learning [2.0249250133493195]
We introduce PeersimGym, an open-source, customizable simulation environment tailored for developing and optimizing task offloading strategies within computational networks.
PeersimGym supports a wide range of network topologies and computational constraints and integrates a PettingZoo-based interface for RL agent deployment in both solo and multi-agent setups.
We demonstrate the utility of the environment through experiments with Deep Reinforcement Learning agents, showcasing the potential of RL-based approaches to significantly enhance offloading strategies in distributed computing settings.
arXiv Detail & Related papers (2024-03-26T12:12:44Z) - Machine Learning Insides OptVerse AI Solver: Design Principles and
Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - On Efficient Training of Large-Scale Deep Learning Models: A Literature
Review [90.87691246153612]
The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.
The use of large-scale models trained on vast amounts of data holds immense promise for practical applications.
Given the increasing demands on computational capacity, a comprehensive summary of acceleration techniques for training deep learning models is still much anticipated.
arXiv Detail & Related papers (2023-04-07T11:13:23Z) - Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner.
Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server.
This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z) - Privacy-Preserving Serverless Edge Learning with Decentralized Small
Data [13.254530176359182]
Distributed training strategies have recently become a promising approach to ensure data privacy when training deep models.
This paper extends conventional serverless platforms with serverless edge learning architectures and provides an efficient distributed training framework from the networking perspective.
arXiv Detail & Related papers (2021-11-29T21:04:49Z) - Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z) - Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.