Online Learning with Uncertain Feedback Graphs
- URL: http://arxiv.org/abs/2106.08441v1
- Date: Tue, 15 Jun 2021 21:21:30 GMT
- Title: Online Learning with Uncertain Feedback Graphs
- Authors: Pouya M Ghari, Yanning Shen
- Abstract summary: The relationship among experts can be captured by a feedback graph, which can be used to assist the learner's decision making.
In practice, the nominal feedback graph often entails uncertainties, which makes it impossible for the graph to reveal the actual relationship among experts.
The present work studies various cases of potential uncertainties, and develops novel online learning algorithms to deal with them.
- Score: 12.805267089186533
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Online learning with expert advice is widely used in various machine learning
tasks. It considers the problem where a learner chooses one from a set of
experts to take advice and make a decision. In many learning problems, experts
may be related, so the learner can observe the losses associated with a
subset of experts that are related to the chosen one. In this context, the
relationship among experts can be captured by a feedback graph, which can be
used to assist the learner's decision making. However, in practice, the nominal
feedback graph often entails uncertainties, which makes it impossible for the
graph to reveal the actual relationship among experts. To cope with this challenge, the
present work studies various cases of potential uncertainties, and develops
novel online learning algorithms to deal with uncertainties while making use of
the uncertain feedback graph. The proposed algorithms are proved to enjoy
sublinear regret under mild conditions. Experiments on real datasets are
presented to demonstrate the effectiveness of the novel algorithms.
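To make the setting in the abstract concrete, the following is a minimal sketch of exponential weights with a known, certain feedback graph, in the style of the standard Exp3-SET approach. It is not the algorithm proposed in the paper, which addresses uncertain graphs. The function name `exp_weights_with_graph`, the `graph` dictionary layout, and the learning rate `eta` are assumptions made for illustration, and every expert is assumed to observe its own loss (a self-loop).

```python
import math
import random


def exp_weights_with_graph(losses, graph, eta=0.1, seed=0):
    """Illustrative Exp3-SET-style learner (assumed setup, not the paper's method).

    losses: list of T rounds, each a list of K losses in [0, 1].
    graph:  dict mapping expert i to the set of experts whose losses are
            revealed when i is chosen; assumed to contain i itself.
    Returns the learner's cumulative loss and the final sampling distribution.
    """
    rng = random.Random(seed)
    num_experts = len(graph)
    probs = [1.0 / num_experts] * num_experts
    total_loss = 0.0

    for round_losses in losses:
        # Sample one expert according to the current distribution.
        chosen = rng.choices(range(num_experts), weights=probs)[0]
        total_loss += round_losses[chosen]

        # Update only the experts whose losses the graph reveals.
        weights = list(probs)
        for j in graph[chosen]:
            # Probability that expert j's loss is observed in this round.
            observe_prob = sum(
                probs[i] for i in range(num_experts) if j in graph[i]
            )
            # Importance-weighted loss estimate, unbiased under a known graph.
            estimate = round_losses[j] / observe_prob
            weights[j] *= math.exp(-eta * estimate)

        # Renormalize to keep a valid distribution and avoid underflow.
        norm = sum(weights)
        probs = [w / norm for w in weights]

    return total_loss, probs


if __name__ == "__main__":
    # Hypothetical toy problem: expert 0 is always right, the others always wrong.
    toy_losses = [[0.0, 1.0, 1.0] for _ in range(200)]
    toy_graph = {0: {0, 1}, 1: {1, 2}, 2: {2, 0}}
    print(exp_weights_with_graph(toy_losses, toy_graph))
```

The importance weighting divides each observed loss by the probability of observing it, which is only computable when the graph is known exactly. When the graph is uncertain, as in the setting studied here, that probability cannot be computed directly, which is the difficulty the abstract describes.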
Related papers
- Learning More Generalized Experts by Merging Experts in Mixture-of-Experts [0.5221459608786241]
We show that incorporating a shared layer in a mixture-of-experts can lead to performance degradation.
We merge the two most frequently selected experts and update the least frequently selected expert using the combination of experts.
Our algorithm enhances transfer learning and mitigates catastrophic forgetting when applied to multi-domain task incremental learning.
arXiv Detail & Related papers (2024-05-19T11:55:48Z) - Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity [22.0059059325909]
We study the problem of online sequential decision-making given auxiliary demonstrations from experts who made their decisions based on unobserved contextual information.
This setting arises in many application domains, such as self-driving cars, healthcare, and finance.
We propose the Experts-as-Priors algorithm (ExPerior) to establish an informative prior distribution over the learner's decision-making problem.
arXiv Detail & Related papers (2024-04-10T18:00:17Z) - Causal Discovery with Language Models as Imperfect Experts [119.22928856942292]
We consider how expert knowledge can be used to improve the data-driven identification of causal graphs.
We propose strategies for amending such expert knowledge based on consistency properties.
We report a case study, on real data, where a large language model is used as an imperfect expert.
arXiv Detail & Related papers (2023-07-05T16:01:38Z) - Leveraging Skill-to-Skill Supervision for Knowledge Tracing [13.753990664747265]
Knowledge tracing plays a pivotal role in intelligent tutoring systems.
Recent advances in knowledge tracing models have enabled better exploitation of problem solving history.
Knowledge tracing algorithms that incorporate knowledge directly are important to settings with limited data or cold starts.
arXiv Detail & Related papers (2023-06-12T03:23:22Z) - Bayesian Q-learning With Imperfect Expert Demonstrations [56.55609745121237]
We propose a novel algorithm to speed up Q-learning with the help of a limited amount of imperfect expert demonstrations.
We evaluate our approach on a sparse-reward chain environment and six more complicated Atari games with delayed rewards.
arXiv Detail & Related papers (2022-10-01T17:38:19Z) - Continuous Prediction with Experts' Advice [10.98975673892221]
Prediction with experts' advice is one of the most fundamental problems in online learning.
Recent work has looked at online learning through the lens of differential equations and continuous-time analysis.
arXiv Detail & Related papers (2022-06-01T05:09:20Z) - On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning [69.48387059607387]
We consider the problem of using expert data with unobserved confounders for imitation and reinforcement learning.
We analyze the limitations of learning from confounded expert data with and without external reward.
We validate our claims empirically on challenging assistive healthcare and recommender system simulation tasks.
arXiv Detail & Related papers (2021-10-13T07:31:31Z) - Exploring Bayesian Deep Learning for Urgent Instructor Intervention Need in MOOC Forums [58.221459787471254]
Massive Open Online Courses (MOOCs) have become a popular choice for e-learning thanks to their great flexibility.
Due to large numbers of learners and their diverse backgrounds, it is taxing to offer real-time support.
With the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
This paper explores for the first time Bayesian deep learning on learner-based text posts with two methods: Monte Carlo Dropout and Variational Inference.
arXiv Detail & Related papers (2021-04-26T15:12:13Z) - Low-Regret Active learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z) - Decision Rule Elicitation for Domain Adaptation [93.02675868486932]
Human-in-the-loop machine learning is widely used in artificial intelligence (AI) to elicit labels from experts.
In this work, we allow experts to additionally produce decision rules describing their decision-making.
We show that decision rule elicitation improves domain adaptation of the algorithm and helps to propagate the expert's knowledge to the AI model.
arXiv Detail & Related papers (2021-02-23T08:07:22Z) - Consistent Estimators for Learning to Defer to an Expert [5.076419064097734]
We show how to learn predictors that can either predict or choose to defer the decision to a downstream expert.
We show the effectiveness of our approach on a variety of experimental tasks.
arXiv Detail & Related papers (2020-06-02T18:21:38Z)