Convolutional Neural Bandit: Provable Algorithm for Visual-aware
Advertising
- URL: http://arxiv.org/abs/2107.07438v1
- Date: Fri, 2 Jul 2021 03:02:29 GMT
- Title: Convolutional Neural Bandit: Provable Algorithm for Visual-aware
Advertising
- Authors: Yikun Ban, Jingrui He
- Abstract summary: Contextual multi-armed bandits have shown success in advertising applications, solving the exploration-exploitation dilemma that arises in the recommendation procedure.
Inspired by visual-aware advertising, we propose a contextual bandit algorithm in this paper.
- Score: 41.30283330958433
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Online advertising is ubiquitous in web business. Image display is
considered one of the most commonly used formats for interacting with
customers. Contextual multi-armed bandits have shown success in advertising
applications, solving the exploration-exploitation dilemma that arises in the
recommendation procedure. Inspired by visual-aware advertising, in this paper
we propose a contextual bandit algorithm in which a convolutional neural
network (CNN) is utilized to learn the reward function, together with an upper
confidence bound (UCB) for exploration. We also prove a near-optimal regret
bound $\tilde{\mathcal{O}}(\sqrt{T})$ when the network is over-parameterized
and establish strong connections with the convolutional neural tangent kernel
(CNTK). Finally, we evaluate the empirical performance of the proposed
algorithm and show that it outperforms other state-of-the-art UCB-based bandit
algorithms on real-world image data sets.
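To make the abstract's description concrete, below is a minimal sketch of a CNN-plus-UCB arm-selection rule in the spirit of the paper. This is not the paper's exact algorithm: the architecture, the diagonal approximation of the gradient covariance matrix Z, the PyTorch framework, and all names (CNNReward, ucb_scores, gamma, z_diag) are assumptions introduced for illustration.

```python
# Hedged sketch of a CNN-based UCB rule: reward estimate from a small CNN plus
# a gradient-based confidence width. Diagonal Z is an assumption for tractability.
import torch
import torch.nn as nn


class CNNReward(nn.Module):
    """Small CNN that maps an ad image to a scalar estimated reward."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Linear(16 * 4 * 4, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x)).squeeze(-1)


def ucb_scores(model: nn.Module, arms: torch.Tensor, z_diag: torch.Tensor,
               gamma: float = 1.0) -> torch.Tensor:
    """Estimated reward plus a gradient-based confidence width for each arm."""
    scores = []
    for x in arms:                      # one image per candidate ad
        model.zero_grad()
        f = model(x.unsqueeze(0))       # predicted reward, shape (1,)
        f.backward()                    # gradient of the network at this arm
        g = torch.cat([p.grad.flatten() for p in model.parameters()])
        width = torch.sqrt(torch.sum(g * g / z_diag))  # diagonal Z^{-1} form
        scores.append(f.detach() + gamma * width)
    return torch.cat(scores)


# Example: score 4 candidate 3x32x32 ad images and pick the highest UCB.
model = CNNReward()
z_diag = torch.ones(sum(p.numel() for p in model.parameters()))  # Z = lambda*I
arms = torch.randn(4, 3, 32, 32)
best_arm = int(torch.argmax(ucb_scores(model, arms, z_diag)))
```

In a full NeuralUCB-style loop, the squared gradient of the chosen arm would then be added to z_diag and the network retrained on the observed rewards before the next round; the paper's precise update and confidence width may differ.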
Related papers
- Neural Exploitation and Exploration of Contextual Bandits [51.25537742455235]
We study the use of neural networks for exploitation and exploration in contextual multi-armed bandits.
EE-Net is a novel neural-based exploitation and exploration strategy.
We show that EE-Net outperforms related linear and neural contextual bandit baselines on real-world datasets.
arXiv Detail & Related papers (2023-05-05T18:34:49Z) - Learning Contextual Bandits Through Perturbed Rewards [107.6210145983805]
We show that a $\tilde{O}(\tilde{d}\sqrt{T})$ regret upper bound is still achievable under standard regularity conditions.
We perturb the rewards when updating the neural network to eliminate the need of explicit exploration.
arXiv Detail & Related papers (2022-01-24T19:10:22Z) - Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of
Information Selection [0.0]
Contextual multi-armed bandits (CMAB) have been widely used for learning to filter and prioritize information according to a user's interest.
In this work, we analyze top-K ranking under the CMAB framework where the top-K arms are chosen iteratively to maximize a reward.
We introduce a novel algorithm called the Deep Upper Confidence Bound (UCB) algorithm.
arXiv Detail & Related papers (2021-10-08T13:32:14Z) - EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits [52.98326168071513]
"EE-Net" is a neural-based bandit approach with a novel exploration strategy.
We show that EE-Net achieves $\mathcal{O}(\sqrt{T\log T})$ regret, which is tighter than the bounds of existing state-of-the-art neural bandit algorithms.
arXiv Detail & Related papers (2021-10-07T04:12:36Z) - Neural Contextual Bandits with Deep Representation and Shallow
Exploration [105.8099566651448]
We propose a novel learning algorithm that transforms the raw feature vector using the last hidden layer of a deep ReLU neural network.
Compared with existing neural contextual bandit algorithms, our approach is computationally much more efficient since it only needs to explore in the last layer of the deep neural network.
arXiv Detail & Related papers (2020-12-03T09:17:55Z) - Neural Thompson Sampling [94.82847209157494]
We propose a new algorithm, called Neural Thompson Sampling, which adapts deep neural networks for both exploration and exploitation.
At the core of our algorithm is a novel posterior distribution of the reward, whose mean is the neural network approximator and whose variance is built upon the neural tangent features of the corresponding neural network (see the sketch after this list).
arXiv Detail & Related papers (2020-10-02T07:44:09Z)
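Below is a minimal sketch of the sampling rule summarized in the Neural Thompson Sampling entry above: the posterior mean is the network output and the variance is built from the network's gradient (neural tangent) features. PyTorch, the diagonal approximation of the design matrix, and the names thompson_scores, z_diag, and nu are assumptions for illustration, not the paper's exact construction.

```python
# Hedged sketch of Neural Thompson Sampling-style scoring: one Gaussian sample
# per arm with mean = network output and variance from gradient features.
import torch
import torch.nn as nn


def thompson_scores(model: nn.Module, arms: torch.Tensor,
                    z_diag: torch.Tensor, nu: float = 0.1) -> torch.Tensor:
    """One Thompson sample per arm: mean is the network output, variance is
    built from the gradient (neural tangent) features of that arm."""
    samples = []
    for x in arms:
        model.zero_grad()
        mean = model(x.unsqueeze(0)).squeeze()   # posterior mean
        mean.backward()                          # neural tangent features
        g = torch.cat([p.grad.flatten() for p in model.parameters()])
        var = (nu ** 2) * torch.sum(g * g / z_diag)   # diagonal Z^{-1} form
        samples.append(torch.normal(mean.detach(), var.sqrt()))
    return torch.stack(samples)


# Usage: play the arm with the largest sampled score, add the squared gradient
# of the played arm to z_diag, and retrain the network on all observed
# (image, reward) pairs before the next round.
```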