Selective Credit Assignment
- URL: http://arxiv.org/abs/2202.09699v1
- Date: Sun, 20 Feb 2022 00:07:57 GMT
- Title: Selective Credit Assignment
- Authors: Veronica Chelu, Diana Borsa, Doina Precup, Hado van Hasselt
- Abstract summary: We describe a unified view on temporal-difference algorithms for selective credit assignment.
We present insights into applying weightings to value-based learning and planning algorithms.
- Score: 57.41789233550586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient credit assignment is essential for reinforcement learning
algorithms in both prediction and control settings. We describe a unified view
on temporal-difference algorithms for selective credit assignment. These
selective algorithms apply weightings to quantify the contribution of learning
updates. We present insights into applying weightings to value-based learning
and planning algorithms, and describe their role in mediating the backward
credit distribution in prediction and control. Within this space, we identify
some existing online learning algorithms that can assign credit selectively as
special cases, as well as add new algorithms that assign credit backward in
time counterfactually, allowing credit to be assigned off-trajectory and
off-policy.
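As a rough illustration of what "applying weightings to learning updates" can mean, the sketch below runs tabular TD(λ) in which a per-state weight scales each state's contribution to the eligibility trace. It is a generic toy, not the algorithms from the paper: the function name `selective_td_lambda`, the three-state chain, and the weight vector are all assumptions made up for this example.

```python
import numpy as np

def selective_td_lambda(episodes, n_states, weight, lam, alpha=0.1, gamma=1.0):
    """Tabular TD(lambda) where weight[s] scales how much state s enters the trace.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    v = np.zeros(n_states)
    for episode in episodes:
        trace = np.zeros(n_states)  # eligibility trace, reset each episode
        for s, r, s_next, done in episode:
            trace *= gamma * lam      # decay credit held by earlier states
            trace[s] += weight[s]     # selective: weight 0 means "do not credit s"
            bootstrap = 0.0 if done else gamma * v[s_next]
            delta = r + bootstrap - v[s]   # one-step TD error
            v += alpha * delta * trace     # distribute the error backward
    return v

# Toy chain 0 -> 1 -> 2 -> terminal, with reward 1 on the last transition.
episode = [(0, 0.0, 1, False), (1, 0.0, 2, False), (2, 1.0, 2, True)]
episodes = [episode] * 200

uniform = selective_td_lambda(episodes, 3, weight=np.ones(3), lam=0.0)
skip_mid_td0 = selective_td_lambda(episodes, 3, weight=np.array([1.0, 0.0, 1.0]), lam=0.0)
skip_mid_td1 = selective_td_lambda(episodes, 3, weight=np.array([1.0, 0.0, 1.0]), lam=1.0)
```

In this toy setting, giving state 1 a weight of zero means its value is never updated. With λ = 0, state 0 bootstraps from that stale value and receives no credit either, whereas with λ = 1 the trace carries the final reward back to state 0 directly. This is one simple way a weighting can mediate how credit flows backward in time.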
Related papers
- The Role of Learning Algorithms in Collective Action [8.955918346078935]
We show that the effective size and success of a collective are highly dependent on the properties of the learning algorithm.
This highlights the necessity of taking the learning algorithm into account when studying the impact of collective action in machine learning.
arXiv Detail & Related papers (2024-05-10T16:36:59Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Tree-Based Adaptive Model Learning [62.997667081978825]
We extend the Kearns-Vazirani learning algorithm to handle systems that change over time.
We present a new learning algorithm that can reuse and update previously learned behavior, implement it in the LearnLib library, and evaluate it on large examples.
arXiv Detail & Related papers (2022-08-31T21:24:22Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Learning Predictions for Algorithms with Predictions [49.341241064279714]
We introduce a general design approach for algorithms that learn predictors.
We apply techniques from online learning to learn against adversarial instances, tune robustness-consistency trade-offs, and obtain new statistical guarantees.
We demonstrate the effectiveness of our approach at deriving learning algorithms by analyzing methods for bipartite matching, page migration, ski-rental, and job scheduling.
arXiv Detail & Related papers (2022-02-18T17:25:43Z)
- Learning to Actively Learn: A Robust Approach [22.75298609290053]
This work proposes a procedure for designing algorithms for adaptive data collection tasks like active learning and pure-exploration multi-armed bandits.
Our adaptive algorithm is learned via adversarial training over equivalence classes of problems derived from information theoretic lower bounds.
We perform synthetic experiments to justify the stability and effectiveness of the training procedure, and then evaluate the method on tasks derived from real data.
arXiv Detail & Related papers (2020-10-29T06:48:22Z)
- Mastering Rate based Curriculum Learning [78.45222238426246]
We argue that the notion of learning progress itself has several shortcomings that lead to a low sample efficiency for the learner.
We propose a new algorithm, based on the notion of mastering rate, that significantly outperforms learning progress-based algorithms.
arXiv Detail & Related papers (2020-08-14T16:34:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.