Online Semi-Supervised Learning with Bandit Feedback
- URL: http://arxiv.org/abs/2010.12574v1
- Date: Fri, 23 Oct 2020 17:56:38 GMT
- Title: Online Semi-Supervised Learning with Bandit Feedback
- Authors: Sohini Upadhyay, Mikhail Yurochkin, Mayank Agarwal, Yasaman Khazaeni
and Djallel Bouneffouf
- Abstract summary: We formulate a new problem at the intersection of semi-supervised learning and contextual bandits.
We demonstrate how Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adjusted to the new problem formulation.
- Score: 45.899239661737795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formulate a new problem at the intersection of semi-supervised learning and
contextual bandits, motivated by several applications including clinical trials
and ad recommendations. We demonstrate how Graph Convolutional Network (GCN), a
semi-supervised learning approach, can be adjusted to the new problem
formulation. We also propose a variant of the linear contextual bandit with
semi-supervised missing rewards imputation. We then take the best of both
approaches to develop multi-GCN embedded contextual bandit. Our algorithms are
verified on several real world datasets.
Related papers
- Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information [57.287431079644705]
We study the problem of online learning in Stackelberg games with side information between a leader and a sequence of followers.
We provide learning algorithms for the leader which achieve $O(T^{1/2})$ regret under bandit feedback.
arXiv Detail & Related papers (2025-01-31T22:40:57Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Rank Flow Embedding for Unsupervised and Semi-Supervised Manifold
Learning [9.171175292808144]
We propose a novel manifold learning algorithm named Rank Flow Embedding (RFE) for unsupervised and semi-supervised scenarios.
RFE computes context-sensitive embeddings, which are refined following a rank-based processing flow.
The generated embeddings can be exploited for more effective unsupervised retrieval or semi-supervised classification.
arXiv Detail & Related papers (2023-04-24T21:02:12Z) - Interpolation-based Correlation Reduction Network for Semi-Supervised
Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN).
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Revisiting Deep Semi-supervised Learning: An Empirical Distribution
Alignment Framework and Its Generalization Bound [97.93945601881407]
We propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA).
We show the generalization error of semi-supervised learning can be effectively bounded by minimizing the training error on labeled data.
Building upon our new framework and the theoretical bound, we develop a simple and effective deep semi-supervised learning method called Augmented Distribution Alignment Network (ADA-Net).
arXiv Detail & Related papers (2022-03-13T11:59:52Z) - Communication Efficient Federated Learning for Generalized Linear
Bandits [39.1899551748345]
We study generalized linear bandit models under a federated learning setting.
We propose a communication-efficient solution framework that employs online regression for local update and offline regression for global update.
Our algorithm can attain a sub-linear rate in both regret and communication cost.
arXiv Detail & Related papers (2022-02-02T15:31:45Z) - Asynchronous Upper Confidence Bound Algorithms for Federated Linear
Bandits [35.47147821038291]
We propose a general framework with asynchronous model update and communication for a collection of homogeneous clients and heterogeneous clients.
Rigorous theoretical analysis is provided about the regret and communication cost under this distributed learning framework.
arXiv Detail & Related papers (2021-10-04T14:01:32Z) - Convolutional Neural Bandit: Provable Algorithm for Visual-aware
Advertising [41.30283330958433]
Contextual multi-armed bandits have shown success in advertising, where they address the exploration-exploitation dilemma that arises in the recommendation procedure.
Inspired by visual-aware advertising, we propose a contextual bandit algorithm.
arXiv Detail & Related papers (2021-07-02T03:02:29Z) - Learning by Fixing: Solving Math Word Problems with Weak Supervision [70.62896781438694]
Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions.
We introduce a weakly-supervised paradigm for learning MWPs.
Our method only requires the annotations of the final answers and can generate various solutions for a single problem.
arXiv Detail & Related papers (2020-12-19T03:10:21Z) - Reinforcement Learning as Iterative and Amortised Inference [62.997667081978825]
We use the control as inference framework to outline a novel classification scheme based on amortised and iterative inference.
We show that taking this perspective allows us to identify parts of the algorithmic design space which have been relatively unexplored.
arXiv Detail & Related papers (2020-06-13T16:10:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.