Evolving choice hysteresis in reinforcement learning: comparing the adaptive value of positivity bias and gradual perseveration
- URL: http://arxiv.org/abs/2410.19434v1
- Date: Fri, 25 Oct 2024 09:47:31 GMT
- Authors: Isabelle Hoxha, Leo Sperber, Stefano Palminteri
- Abstract summary: We show that positivity bias is evolutionarily stable in many situations, while the emergence of gradual perseveration is less systematic and robust.
Our results illustrate that biases can be adaptive and selected by evolution, in an environment-specific manner.
- Abstract: The tendency to repeat past choices more often than expected from the history of outcomes has been observed repeatedly in reinforcement learning experiments. It can be explained by at least two computational processes: asymmetric update and (gradual) choice perseveration. A recent meta-analysis showed that both mechanisms are detectable in human reinforcement learning. However, while their descriptive value seems to be well established, they have not been compared regarding their possible adaptive value. In this study, we address this gap by simulating reinforcement learning agents in a variety of environments with a new variant of an evolutionary algorithm. Our results show that positivity bias (in the form of asymmetric update) is evolutionarily stable in many situations, while the emergence of gradual perseveration is less systematic and robust. Overall, our results illustrate that biases can be adaptive and selected by evolution, in an environment-specific manner.
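The two mechanisms contrasted in the abstract can be sketched as a simple two-armed bandit learner. This is an illustrative reconstruction, not the paper's actual simulation code: the parameter names (`alpha_pos`, `alpha_neg`, `alpha_persev`, `w_persev`, `beta`) and the bandit setup are assumptions. Asymmetric update applies a larger learning rate to positive prediction errors (positivity bias), and gradual perseveration maintains a decaying choice trace that biases the softmax toward recently chosen actions.

```python
import math
import random

def simulate(alpha_pos=0.3, alpha_neg=0.1, alpha_persev=0.2, w_persev=0.5,
             beta=5.0, n_trials=200, p_reward=(0.7, 0.3), seed=0):
    """Two-armed bandit agent with asymmetric update and gradual perseveration."""
    rng = random.Random(seed)
    q = [0.0, 0.0]  # learned action values
    c = [0.0, 0.0]  # choice trace driving perseveration
    choices = []
    for _ in range(n_trials):
        # Softmax over value plus weighted choice trace (max-shifted for stability).
        logits = [beta * (q[a] + w_persev * c[a]) for a in range(2)]
        m = max(logits)
        weights = [math.exp(l - m) for l in logits]
        z = sum(weights)
        a = 0 if rng.random() < weights[0] / z else 1
        r = 1.0 if rng.random() < p_reward[a] else 0.0
        # Asymmetric update: larger learning rate for positive prediction errors.
        delta = r - q[a]
        q[a] += (alpha_pos if delta > 0 else alpha_neg) * delta
        # Gradual perseveration: the trace decays toward the most recent choice.
        for b in range(2):
            c[b] += alpha_persev * ((1.0 if b == a else 0.0) - c[b])
        choices.append(a)
    return q, c, choices
```

Setting `alpha_pos == alpha_neg` and `w_persev = 0` recovers a standard unbiased softmax Q-learner, which is the natural baseline against which the evolutionary stability of the two biases can be compared.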
Related papers
- Toward Understanding In-context vs. In-weight Learning [50.24035812301655]
We identify simplified distributional properties that give rise to the emergence and disappearance of in-context learning.
We then extend the study to a full large language model, showing how fine-tuning on various collections of natural language prompts can elicit similar in-context and in-weight learning behaviour.
arXiv Detail & Related papers (2024-10-30T14:09:00Z)
- Ask Your Distribution Shift if Pre-Training is Right for You [74.18516460467019]
In practice, fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others.
We focus on two possible failure modes of models under distribution shift: poor extrapolation and biases in the training data.
Our study suggests that, as a rule of thumb, pre-training can help mitigate poor extrapolation but not dataset biases.
arXiv Detail & Related papers (2024-02-29T23:46:28Z)
- Towards Fair Disentangled Online Learning for Changing Environments [28.207499975916324]
We argue that changing environments in online learning can be attributed to partial changes in learned parameters that are specific to environments.
We propose a novel algorithm under the assumption that data collected at each time can be disentangled with two representations.
A novel regret measure is proposed that takes a mixed form of dynamic and static regret metrics, combined with a fairness-aware long-term constraint.
arXiv Detail & Related papers (2023-05-31T19:04:16Z)
- Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies [50.10277748405355]
Noise-Reuse Evolution Strategies (NRES) is a general class of unbiased online evolution strategies methods.
We show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of steps across a variety of applications.
arXiv Detail & Related papers (2023-04-21T17:53:05Z)
- Meta-Auxiliary Learning for Adaptive Human Pose Prediction [26.877194503491072]
Predicting high-fidelity future human poses is crucial for intelligent robots interacting with humans.
Deep end-to-end learning approaches, which typically train a generic pre-trained model on external datasets and then directly apply it to all test samples, remain non-optimal.
We propose a novel test-time adaptation framework that leverages two self-supervised auxiliary tasks to help the primary forecasting network adapt to the test sequence.
arXiv Detail & Related papers (2023-04-13T11:17:09Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- When to be critical? Performance and evolvability in different regimes of neural Ising agents [18.536813548129878]
It has long been hypothesized that operating close to the critical state is beneficial for natural and artificial systems, as well as for their evolution.
We put this hypothesis to test in a system of evolving foraging agents controlled by neural networks.
Surprisingly, we find that all populations that discover solutions, evolve to be subcritical.
arXiv Detail & Related papers (2023-03-28T17:57:57Z)
- Blessings and Curses of Covariate Shifts: Adversarial Learning Dynamics, Directional Convergence, and Equilibria [6.738946307589742]
Covariate distribution shifts and adversarial perturbations present challenges to the conventional statistical learning framework.
This paper precisely characterizes the extrapolation region, examining both regression and classification in an infinite-dimensional setting.
arXiv Detail & Related papers (2022-12-05T18:00:31Z)
- Characterizing the robustness of Bayesian adaptive experimental designs to active learning bias [3.1351527202068445]
We show that active learning bias can afflict Bayesian adaptive experimental design, depending on model misspecification.
We develop an information-theoretic measure of misspecification, and show that worse misspecification implies more severe active learning bias.
arXiv Detail & Related papers (2022-05-27T01:23:11Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- The Introspective Agent: Interdependence of Strategy, Physiology, and Sensing for Embodied Agents [51.94554095091305]
We argue for an introspective agent, which considers its own abilities in the context of its environment.
Just as in nature, we hope to reframe strategy as one tool, among many, to succeed in an environment.
arXiv Detail & Related papers (2022-01-02T20:14:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.