Related papers: Online Semi-Supervised Learning with Bandit Feedback

URL: http://arxiv.org/abs/2010.12574v1
Date: Fri, 23 Oct 2020 17:56:38 GMT
Title: Online Semi-Supervised Learning with Bandit Feedback
Authors: Sohini Upadhyay, Mikhail Yurochkin, Mayank Agarwal, Yasaman Khazaeni and Djallel Bouneffouf
Abstract summary: We formulate a new problem at the intersection of semi-supervised learning and contextual bandits. We demonstrate how Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adjusted to the new problem formulation.
Score: 45.899239661737795
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by several applications including clinical trials and ad recommendations. We demonstrate how Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adjusted to the new problem formulation. We also propose a variant of the linear contextual bandit with semi-supervised missing rewards imputation. We then take the best of both approaches to develop multi-GCN embedded contextual bandit. Our algorithms are verified on several real world datasets.

Related papers

Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information [57.287431079644705]
We study the problem of online learning in Stackelberg games with side information between a leader and a sequence of followers. We provide learning algorithms for the leader which achieve $O(T^{1/2})$ regret under bandit feedback.
arXiv Detail & Related papers (2025-01-31T22:40:57Z)
Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning. Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
Rank Flow Embedding for Unsupervised and Semi-Supervised Manifold Learning [9.171175292808144]
We propose a novel manifold learning algorithm named Rank Flow Embedding (RFE) for unsupervised and semi-supervised scenarios. RFE computes context-sensitive embeddings, which are refined following a rank-based processing flow. The generated embeddings can be exploited for more effective unsupervised retrieval or semi-supervised classification.
arXiv Detail & Related papers (2023-04-24T21:02:12Z)
Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN). In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries. By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound [97.93945601881407]
We propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA). We show the generalization error of semi-supervised learning can be effectively bounded by minimizing the training error on labeled data. Building upon our new framework and the theoretical bound, we develop a simple and effective deep semi-supervised learning method called Augmented Distribution Alignment Network (ADA-Net).
arXiv Detail & Related papers (2022-03-13T11:59:52Z)
Communication Efficient Federated Learning for Generalized Linear Bandits [39.1899551748345]
We study generalized linear bandit models under a federated learning setting. We propose a communication-efficient solution framework that employs online regression for local update and offline regression for global update. Our algorithm can attain sub-linear rate in both regret and communication cost.
arXiv Detail & Related papers (2022-02-02T15:31:45Z)
Asynchronous Upper Confidence Bound Algorithms for Federated Linear Bandits [35.47147821038291]
We propose a general framework with asynchronous model update and communication for a collection of homogeneous clients and heterogeneous clients. Rigorous theoretical analysis is provided about the regret and communication cost under this distributed learning framework.
arXiv Detail & Related papers (2021-10-04T14:01:32Z)
Convolutional Neural Bandit: Provable Algorithm for Visual-aware Advertising [41.30283330958433]
The contextual multi-armed bandit has shown success in advertising applications, where it addresses the exploration-exploitation dilemma that exists in the recommendation procedure. Inspired by visual-aware advertising, in this paper, we propose a contextual bandit algorithm.
arXiv Detail & Related papers (2021-07-02T03:02:29Z)
Learning by Fixing: Solving Math Word Problems with Weak Supervision [70.62896781438694]
Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions. We introduce a weakly-supervised paradigm for learning MWPs. Our method only requires the annotations of the final answers and can generate various solutions for a single problem.
arXiv Detail & Related papers (2020-12-19T03:10:21Z)
Reinforcement Learning as Iterative and Amortised Inference [62.997667081978825]
We use the control as inference framework to outline a novel classification scheme based on amortised and iterative inference. We show that taking this perspective allows us to identify parts of the algorithmic design space which have been relatively unexplored.
arXiv Detail & Related papers (2020-06-13T16:10:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.