Reinforcement Learning with Feedback Graphs
- URL: http://arxiv.org/abs/2005.03789v1
- Date: Thu, 7 May 2020 22:35:37 GMT
- Title: Reinforcement Learning with Feedback Graphs
- Authors: Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik
Sridharan
- Abstract summary: We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step.
We formalize this setting using a feedback graph over state-action pairs and show that model-based algorithms can leverage the additional feedback for more sample-efficient learning.
- Score: 69.1524391595912
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study episodic reinforcement learning in Markov decision processes when
the agent receives additional feedback per step in the form of several
transition observations. Such additional observations are available in a range
of tasks through extended sensors or prior knowledge about the environment
(e.g., when certain actions yield similar outcomes). We formalize this setting
using a feedback graph over state-action pairs and show that model-based
algorithms can leverage the additional feedback for more sample-efficient
learning. We give a regret bound that, ignoring logarithmic factors and
lower-order terms, depends only on the size of the maximum acyclic subgraph of
the feedback graph, in contrast with a polynomial dependency on the number of
states and actions in the absence of a feedback graph. Finally, we highlight
challenges when leveraging a small dominating set of the feedback graph as
compared to the bandit setting and propose a new algorithm that can use
knowledge of such a dominating set for more sample-efficient learning of a
near-optimal policy.
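To make the observation model concrete, the following is a minimal, self-contained sketch in Python. The toy MDP, the feedback graph, and all names are illustrative assumptions of this sketch, not the paper's algorithm; it shows only how a model-based learner could update its empirical transition counts for every state-action pair revealed through the feedback graph on a single step.

```python
import random
from collections import defaultdict

# Minimal sketch of the feedback-graph observation model. Everything here
# (the toy MDP, the graph, the function names) is an illustrative assumption,
# not the paper's algorithm: an edge v -> w in the feedback graph means that
# playing state-action pair v also yields a transition observation for w.

random.seed(0)

STATES = [0, 1, 2]
ACTIONS = ["left", "right"]

def sample_transition(state, action):
    """Toy stochastic dynamics: 'right' drifts up, 'left' drifts down."""
    drift = 1 if action == "right" else -1
    return max(0, min(len(STATES) - 1, state + random.choice([drift, 0])))

# Assumed feedback graph: both actions in a state have similar outcomes
# (one of the examples mentioned in the abstract), so playing either
# action in a state reveals a sample for the other action as well.
feedback_graph = {
    (s, a): {(s, b) for b in ACTIONS} for s in STATES for a in ACTIONS
}

# Empirical transition counts -- the statistic a model-based learner maintains.
counts = defaultdict(lambda: defaultdict(int))

def step_and_update(state, action):
    """Execute (state, action) and record a transition sample for every
    pair observed through the feedback graph on this step."""
    for (s_obs, a_obs) in feedback_graph[(state, action)]:
        counts[(s_obs, a_obs)][sample_transition(s_obs, a_obs)] += 1
    return sample_transition(state, action)  # the agent's actual next state

state = 0
for _ in range(100):
    action = random.choice(ACTIONS)  # stand-in for an optimistic policy
    state = step_and_update(state, action)

# Each pair accrues counts at the rate of its in-neighborhood coverage,
# which is what drives the improved sample efficiency.
print({pair: sum(c.values()) for pair, c in counts.items()})
```

In the same spirit, if the feedback graph has a small dominating set, playing only dominating pairs would reveal samples for every pair; the abstract notes that exploiting this in RL is harder than in the bandit setting, since the agent must also learn to reach those pairs, and proposes a new algorithm for that purpose.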
Related papers
- GPS: Graph Contrastive Learning via Multi-scale Augmented Views from
Adversarial Pooling [23.450755275125577]
Self-supervised graph representation learning has recently shown considerable promise in a range of fields, including bioinformatics and social networks.
We present a novel approach named Graph Pooling ContraSt (GPS) to address these issues.
Motivated by the fact that graph pooling can adaptively coarsen the graph with the removal of redundancy, we rethink graph pooling and leverage it to automatically generate multi-scale positive views.
arXiv Detail & Related papers (2024-01-29T10:00:53Z)
- Spectral Augmentations for Graph Contrastive Learning [50.149996923976836]
Contrastive learning has emerged as a premier method for learning representations with or without supervision.
Recent studies have shown its utility in graph representation learning for pre-training.
We propose a set of well-motivated graph transformation operations to provide a bank of candidates when constructing augmentations for a graph contrastive objective.
arXiv Detail & Related papers (2023-02-06T16:26:29Z)
- Coarse-to-Fine Contrastive Learning on Graphs [38.41992365090377]
A variety of graph augmentation strategies have been employed to learn node representations in a self-supervised manner.
We introduce a self-ranking paradigm to ensure that the discriminative information among different nodes can be maintained.
Experiment results on various benchmark datasets verify the effectiveness of our algorithm.
arXiv Detail & Related papers (2022-12-13T08:17:20Z)
- Joint graph learning from Gaussian observations in the presence of hidden nodes [26.133725549667734]
We propose a joint graph learning method that takes into account the presence of hidden (latent) variables.
We exploit the structure resulting from the previous considerations to propose a convex optimization problem.
We compare the proposed algorithm with different baselines and evaluate its performance over synthetic and real-world graphs.
arXiv Detail & Related papers (2022-12-04T13:03:41Z)
- Label-invariant Augmentation for Semi-Supervised Graph Classification [32.591130704357184]
Contrastive-learning-based augmentation has recently surged to prominence in the computer vision domain.
Unlike images, it is much more difficult to design reasonable augmentations without changing the nature of graphs.
We propose a label-invariant augmentation for graph-structured data to address this challenge.
arXiv Detail & Related papers (2022-05-19T18:44:02Z)
- Simulating Bandit Learning from User Feedback for Extractive Question Answering [51.97943858898579]
We study learning from user feedback for extractive question answering by simulating feedback using supervised data.
We show that systems initially trained on a small number of examples can dramatically improve given feedback from users on model-predicted answers.
arXiv Detail & Related papers (2022-03-18T17:47:58Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- Graphing else matters: exploiting aspect opinions and ratings in explainable graph-based recommendations [66.83527496838937]
We propose to exploit embeddings extracted from graphs that combine information from ratings and aspect-based opinions expressed in textual reviews.
We then adapt and evaluate state-of-the-art graph embedding techniques over graphs generated from Amazon and Yelp reviews on six domains.
Our approach has the advantage of providing explanations which leverage aspect-based opinions given by users about recommended items.
arXiv Detail & Related papers (2021-07-07T13:57:28Z)
- Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.