Hyperparameters in Contextual RL are Highly Situational
- URL: http://arxiv.org/abs/2212.10876v1
- Date: Wed, 21 Dec 2022 09:38:18 GMT
- Title: Hyperparameters in Contextual RL are Highly Situational
- Authors: Theresa Eimer, Carolin Benjamins, Marius Lindauer
- Abstract summary: Reinforcement Learning (RL) has shown impressive results in games and simulation, but real-world application suffers from its instability under changing environment conditions.
We show that the hyperparameters found by HPO methods depend not only on the problem at hand, but also on how well the state describes the environment dynamics.
- Score: 16.328866317851183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Although Reinforcement Learning (RL) has shown impressive results in games
and simulation, real-world application of RL suffers from its instability under
changing environment conditions and hyperparameters. We give a first impression
of the extent of this instability by showing that the hyperparameters found by
automatic hyperparameter optimization (HPO) methods are not only dependent on
the problem at hand, but even on how well the state describes the environment
dynamics. Specifically, we show that agents in contextual RL require different
hyperparameters if they are shown how environmental factors change. In
addition, finding adequate hyperparameter configurations is not equally easy
for both settings, further highlighting the need for research into how
hyperparameters influence learning and generalization in RL.
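
The two settings contrasted in the abstract, an agent that is shown the environmental factors versus one that is not, can be illustrated with a minimal sketch. This is not the authors' code: the `Context` class, the `gravity` feature, and the `observe` helper are hypothetical names chosen for illustration, and concatenating the context onto the state is only one assumed way of making it visible.

```python
from dataclasses import dataclass


@dataclass
class Context:
    # Hypothetical environmental factor that changes between episodes.
    gravity: float


def observe(state, context, context_visible):
    """Build the agent's observation.

    With context_visible=True the context feature is appended to the state;
    with context_visible=False the agent only sees the raw state.
    """
    obs = list(state)
    if context_visible:
        obs.append(context.gravity)
    return obs


state = [0.1, -0.3]
ctx = Context(gravity=9.81)

hidden_obs = observe(state, ctx, context_visible=False)
visible_obs = observe(state, ctx, context_visible=True)
```

Under this assumption the two agents receive observations of different dimensionality for the same underlying state, which is one plausible reason a hyperparameter configuration tuned for one setting need not transfer to the other.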

Related papers
- Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios [54.58186816693791]
 Environments constantly change over time and space, posing significant challenges for object detectors trained based on a closed-set assumption.
We propose a new mechanism, converting the fine-tuning process to a specific-parameter generation.
In particular, we first design a dual-path LoRA-based domain-aware adapter that disentangles features into domain-invariant and domain-specific components.
 arXiv  Detail & Related papers  (2025-06-30T17:14:12Z)
- Hyperparameter Optimisation with Practical Interpretability and Explanation Methods in Probabilistic Curriculum Learning [2.5352713493505785]
 Probabilistic Curriculum Learning (PCL) is a curriculum learning strategy designed to improve RL performance by structuring the agent's learning process.
We provide an empirical analysis of hyperparameter interactions and their effects on the performance of a PCL algorithm within standard RL tasks.
 arXiv  Detail & Related papers  (2025-04-09T08:41:27Z)
- Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining [56.58170370127227]
 We show that the optimal learning rate follows a power-law relationship with both model parameters and data sizes, while the optimal batch size scales primarily with data sizes.
This is the first work that unifies different model shapes and structures, such as Mixture-of-Experts models and dense transformers.
 arXiv  Detail & Related papers  (2025-03-06T18:58:29Z)
- Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning [48.81121647322492]
 Hyper is provably efficient under the function approximation setting and empirically demonstrates appealing performance and robustness in various environments.
Hyper extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training.
 arXiv  Detail & Related papers  (2024-12-04T23:12:41Z)
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
 We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of threes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
 arXiv  Detail & Related papers  (2024-07-08T12:32:51Z)
- ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections [59.839926875976225]
 We propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections.
In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters.
 arXiv  Detail & Related papers  (2024-05-30T17:26:02Z)
- AutoRL Hyperparameter Landscapes [69.15927869840918]
 Reinforcement Learning (RL) has been shown to be capable of producing impressive results, but its use is limited by the impact of its hyperparameters on performance.
We propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training.
This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses.
 arXiv  Detail & Related papers  (2023-04-05T12:14:41Z)
- A Framework for History-Aware Hyperparameter Optimisation in Reinforcement Learning [8.659973888018781]
 A Reinforcement Learning (RL) system depends on a set of initial conditions that affect the system's performance.
We propose a framework based on integrating complex event processing and temporal models, to alleviate these trade-offs.
We tested the proposed approach in a 5G mobile communications case study that uses DQN, a variant of RL, for its decision-making.
 arXiv  Detail & Related papers  (2023-03-09T11:30:40Z)
- No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL [28.31529154045046]
 We propose a new approach to tune hyperparameters from offline logs of data.
We first learn a model of the environment from the offline data, which we call a calibration model, and then simulate learning in the calibration model.
We empirically investigate the method in a variety of settings to identify when it is effective and when it fails.
 arXiv  Detail & Related papers  (2022-05-18T04:26:23Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
 We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
 arXiv  Detail & Related papers  (2022-03-15T19:25:01Z)
- Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
 Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
 arXiv  Detail & Related papers  (2021-02-15T19:36:18Z)
- Sample-Efficient Automated Deep Reinforcement Learning [33.53903358611521]
 We propose a population-based automated RL framework to meta-optimize arbitrary off-policy RL algorithms.
By sharing the collected experience across the population, we substantially increase the sample efficiency of the meta-optimization.
We demonstrate the capabilities of our sample-efficient AutoRL approach in a case study with the popular TD3 algorithm in the MuJoCo benchmark suite.
 arXiv  Detail & Related papers  (2020-09-03T10:04:06Z)
- Hyperparameter Selection for Offline Reinforcement Learning [61.92834684647419]
 Offline reinforcement learning (RL purely from logged data) is an important avenue for deploying RL techniques in real-world scenarios.
Existing hyperparameter selection methods for offline RL break the offline assumption.
 arXiv  Detail & Related papers  (2020-07-17T15:30:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.