Learning Computational Efficient Bots with Costly Features
- URL: http://arxiv.org/abs/2308.09629v1
- Date: Fri, 18 Aug 2023 15:43:31 GMT
- Title: Learning Computational Efficient Bots with Costly Features
- Authors: Anthony Kobanda, Valliappan C.A., Joshua Romoff, Ludovic Denoyer
- Abstract summary: We propose a generic offline learning approach where the computation cost of the input features is taken into account.
We demonstrate the effectiveness of our method on several tasks, including D4RL benchmarks and complex 3D environments similar to those found in video games.
- Score: 9.39143793228343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) techniques have become increasingly used in
various fields for decision-making processes. However, a challenge that often
arises is the trade-off between both the computational efficiency of the
decision-making process and the ability of the learned agent to solve a
particular task. This is particularly critical in real-time settings such as
video games where the agent needs to take relevant decisions at a very high
frequency, with a very limited inference time.
In this work, we propose a generic offline learning approach where the
computation cost of the input features is taken into account. We derive the
Budgeted Decision Transformer as an extension of the Decision Transformer that
incorporates cost constraints to limit its cost at inference. As a result, the
model can dynamically choose the best input features at each timestep. We
demonstrate the effectiveness of our method on several tasks, including D4RL
benchmarks and complex 3D environments similar to those found in video games,
and show that it can achieve similar performance while using significantly
fewer computational resources compared to classical approaches.
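The idea of limiting feature cost at each timestep can be illustrated with a minimal sketch. This is not the Budgeted Decision Transformer itself: in the paper the model learns which features to use, whereas here a simple greedy rule stands in for that learned choice. The function names, the per-feature `costs` and `scores`, and the `budget` value are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def greedy_mask(costs: Sequence[float], scores: Sequence[float], budget: float) -> List[int]:
    """Hypothetical stand-in for a learned selector: pick features by
    score-per-cost ratio until the budget would be exceeded.
    Assumes every cost is strictly positive."""
    order = sorted(range(len(costs)), key=lambda i: scores[i] / costs[i], reverse=True)
    mask = [0] * len(costs)
    spent = 0.0
    for i in order:
        if spent + costs[i] <= budget:
            mask[i] = 1
            spent += costs[i]
    return mask


def step_cost(costs: Sequence[float], mask: Sequence[int]) -> float:
    """Total computation cost of the features selected at one timestep."""
    return sum(c for c, m in zip(costs, mask) if m)


def masked_observation(features: Sequence[float], mask: Sequence[int], fill: float = 0.0) -> List[float]:
    """Keep selected features; replace unselected ones with a fill value
    (an assumption here: they are never computed, so a placeholder is used)."""
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    return [f if m else fill for f, m in zip(features, mask)]


# Illustrative values only (not taken from the paper).
costs = [1.0, 4.0, 2.0]
scores = [0.5, 1.0, 0.9]
mask = greedy_mask(costs, scores, budget=3.0)
obs = masked_observation([0.2, 0.7, -1.0], mask)
```

With these illustrative numbers, the expensive second feature is skipped and the policy would act on the masked observation while staying within the per-step budget.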
Related papers
- Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks.
We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model.
Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
- Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization [6.713974813995327]
We present MEMENTO, an approach that leverages memory to improve the adaptation of neural solvers at inference time.
We successfully train all RL auto-regressive solvers on large instances, and show that MEMENTO can scale and is data-efficient.
Overall, MEMENTO pushes the state of the art on 11 out of 12 evaluated tasks.
arXiv Detail & Related papers (2024-06-24T08:18:19Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Multi-Resolution Active Learning of Fourier Neural Operators [33.63483360957646]
We propose Multi-Resolution Active learning of FNO (MRA-FNO), which can dynamically select the input functions and resolutions to lower the data cost as much as possible.
Specifically, we propose a probabilistic multi-resolution FNO and use ensemble Monte-Carlo to develop an effective posterior inference algorithm.
We have shown the advantage of our method in several benchmark operator learning tasks.
arXiv Detail & Related papers (2023-09-29T04:41:27Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- The Statistical Complexity of Interactive Decision Making [126.04974881555094]
We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
arXiv Detail & Related papers (2021-12-27T02:53:44Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- Cost-Effective Federated Learning Design [37.16466118235272]
Federated learning (FL) is a distributed learning paradigm that enables a large number of devices to collaboratively learn a model without sharing their raw data.
Despite its efficiency and effectiveness, the iterative on-device learning process incurs a considerable cost in terms of learning time and energy consumption.
We analyze how to design adaptive FL that optimally chooses essential control variables to minimize the total cost while ensuring convergence.
arXiv Detail & Related papers (2020-12-15T14:45:11Z)
- Deep Multi-Fidelity Active Learning of High-dimensional Outputs [17.370056935194786]
We develop a deep neural network-based multi-fidelity model for learning with high-dimensional outputs.
We then propose a mutual information-based acquisition function that extends the predictive entropy principle.
We show the advantage of our method in several applications of computational physics and engineering design.
arXiv Detail & Related papers (2020-12-02T00:02:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.