The Value of Information When Deciding What to Learn
- URL: http://arxiv.org/abs/2110.13973v1
- Date: Tue, 26 Oct 2021 19:23:12 GMT
- Title: The Value of Information When Deciding What to Learn
- Authors: Dilip Arumugam and Benjamin Van Roy
- Abstract summary: This work builds upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014).
We offer new insights into learning targets from the literature on rate-distortion theory before turning to empirical results that confirm the value of information when deciding what to learn.
- Score: 21.945359614094503
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: All sequential decision-making agents explore so as to acquire knowledge
about a particular target. It is often the responsibility of the agent designer
to construct this target which, in rich and complex environments, constitutes an
onerous burden; without full knowledge of the environment itself, a designer
may forge a sub-optimal learning target that poorly balances the amount of
information an agent must acquire to identify the target against the target's
associated performance shortfall. While recent work has developed a connection
between learning targets and rate-distortion theory to address this challenge
and empower agents that decide what to learn in an automated fashion, the
proposed algorithm does not optimally tackle the equally important challenge of
efficient information acquisition. In this work, building upon the seminal
design principle of information-directed sampling (Russo & Van Roy, 2014), we
address this shortcoming directly to couple optimal information acquisition
with the optimal design of learning targets. Along the way, we offer new
insights into learning targets from the literature on rate-distortion theory
before turning to empirical results that confirm the value of information when
deciding what to learn.
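For intuition about the two ingredients being coupled, the sketch below pairs a textbook Blahut-Arimoto iteration (the rate-distortion machinery behind automated learning-target design) with the variance-based information ratio of Russo & Van Roy (2014) for a finite-armed bandit. This is a minimal illustration under assumed inputs (the environment prior, distortion matrix, and posterior samples are placeholders), not the paper's algorithm, which couples target design and information acquisition within a single action-selection rule.

```python
import numpy as np

def blahut_arimoto(p_env, distortion, beta, iters=200):
    """Standard Blahut-Arimoto iteration for a rate-distortion problem.

    p_env:      (E,) prior over candidate environments.
    distortion: (E, T) matrix d(env, target), e.g. the performance
                shortfall of committing to each candidate target.
    beta:       Lagrange multiplier trading rate (bits needed to
                identify the target) against distortion.
    Returns the (E, T) channel q(target | env).
    """
    E, T = distortion.shape
    q_t = np.full(T, 1.0 / T)                          # marginal over targets
    for _ in range(iters):
        q_te = q_t[None, :] * np.exp(-beta * distortion)
        q_te /= q_te.sum(axis=1, keepdims=True)        # q(t|e) ∝ q(t) e^{-β d(e,t)}
        q_t = p_env @ q_te                             # updated marginal q(t)
    return q_te

def vids_action(theta_samples):
    """Variance-based information ratio for a finite-armed bandit,
    with the optimal arm standing in as the learning target.

    theta_samples: (S, K) posterior samples of the K mean rewards.
    Returns the arm minimizing Delta(a)^2 / v(a) over deterministic
    actions; full IDS minimizes over two-point randomized actions.
    """
    best_arm = theta_samples.argmax(axis=1)            # sampled optimal arm A*
    mean_a = theta_samples.mean(axis=0)                # E[theta_a]
    delta = theta_samples.max(axis=1).mean() - mean_a  # expected regret Delta(a)
    v = np.zeros(theta_samples.shape[1])
    for a_star in np.unique(best_arm):
        mask = best_arm == a_star
        cond_mean = theta_samples[mask].mean(axis=0)
        v += mask.mean() * (cond_mean - mean_a) ** 2   # Var_{A*} E[theta_a | A*]
    return int((delta ** 2 / np.maximum(v, 1e-12)).argmin())
```

In the paper's setting, the information gain in the ratio would be measured with respect to the compressed target produced by the rate-distortion step rather than the exact optimal arm; the two functions are kept separate here only for readability.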
Related papers
- Unveiling Entity-Level Unlearning for Large Language Models: A Comprehensive Analysis [32.455702022397666]
Large language model unlearning has garnered increasing attention due to its potential to address security and privacy concerns.
Much of this research has concentrated on instance-level unlearning, specifically targeting the removal of predefined instances containing sensitive content.
We propose a novel task of entity-level unlearning, which aims to erase entity-related knowledge from the target model completely.
arXiv Detail & Related papers (2024-06-22T09:40:07Z)
- Collaborative Knowledge Infusion for Low-resource Stance Detection [83.88515573352795]
Target-related knowledge is often needed to assist stance detection models.
We propose a collaborative knowledge infusion approach for low-resource stance detection tasks.
arXiv Detail & Related papers (2024-03-28T08:32:14Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Generative multitask learning mitigates target-causing confounding [61.21582323566118]
We propose a simple and scalable approach to causal representation learning for multitask learning.
The improvement comes from mitigating unobserved confounders that cause the targets, but not the input.
Our results on the Attributes of People and Taskonomy datasets reflect the conceptual improvement in robustness to prior probability shift.
arXiv Detail & Related papers (2022-02-08T20:42:14Z)
- Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning [15.33496710690063]
We propose a goal-aware cross-entropy (GACE) loss that can be utilized in a self-supervised way.
We then devise goal-discriminative attention networks (GDAN) which utilize the goal-relevant information to focus on the given instruction.
arXiv Detail & Related papers (2021-10-25T14:24:39Z)
- Understanding the origin of information-seeking exploration in probabilistic objectives for control [62.997667081978825]
An exploration-exploitation trade-off is central to the description of adaptive behaviour.
One approach to resolving this trade-off has been to equip agents with, or to propose that they possess, an intrinsic 'exploratory drive'.
We show that this combination of utility-maximizing and information-seeking behaviour arises from the minimization of an entirely different class of objectives.
arXiv Detail & Related papers (2021-03-11T18:42:39Z)
- Reinforcement Learning, Bit by Bit [27.66567077899924]
Reinforcement learning agents have demonstrated remarkable achievements in simulated environments.
Data efficiency poses an impediment to carrying this success over to real environments.
We discuss concepts and regret analysis that together offer principled guidance.
arXiv Detail & Related papers (2021-03-06T06:37:46Z)
- Sequential Transfer in Reinforcement Learning with a Generative Model [48.40219742217783]
We show how to reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.
We derive PAC bounds on its sample complexity which clearly demonstrate the benefits of using this kind of prior knowledge.
We empirically verify our theoretical findings in simple simulated domains.
arXiv Detail & Related papers (2020-07-01T19:53:35Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.