Efficient Online Learning of Optimal Rankings: Dimensionality Reduction
via Gradient Descent
- URL: http://arxiv.org/abs/2011.02817v1
- Date: Thu, 5 Nov 2020 13:52:34 GMT
- Title: Efficient Online Learning of Optimal Rankings: Dimensionality Reduction
via Gradient Descent
- Authors: Dimitris Fotakis, Thanasis Lianeas, Georgios Piliouras, Stratis
Skoulakis
- Abstract summary: The Generalized Min-Sum-Set-Cover (GMSSC) problem serves as a formal model for the setting above.
We show how to achieve low regret for GMSSC in polynomial time.
- Score: 47.66497212729108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a natural model of online preference aggregation, where sets of
preferred items $R_1, R_2, \ldots, R_t$ along with a demand for $k_t$ items in
each $R_t$, appear online. Without prior knowledge of $(R_t, k_t)$, the learner
maintains a ranking $\pi_t$ aiming that at least $k_t$ items from $R_t$ appear
high in $\pi_t$. This is a fundamental problem in preference aggregation with
applications to, e.g., ordering product or news items in web pages based on
user scrolling and click patterns. The widely studied Generalized
Min-Sum-Set-Cover (GMSSC) problem serves as a formal model for the setting
above. GMSSC is NP-hard and the standard application of no-regret online
learning algorithms is computationally inefficient, because they operate in the
space of rankings. In this work, we show how to achieve low regret for GMSSC in
polynomial-time. We employ dimensionality reduction from rankings to the space
of doubly stochastic matrices, where we apply Online Gradient Descent. A key
step is to show how subgradients can be computed efficiently, by solving the
dual of a configuration LP. Using oblivious deterministic and randomized
rounding schemes, we map doubly stochastic matrices back to rankings with a
small loss in the GMSSC objective.
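As a rough illustration of the pipeline sketched in the abstract, the toy code below runs Online Gradient Descent over doubly stochastic matrices and rounds each iterate back to a ranking. It is only a sketch under simplifying assumptions: the subgradient oracle and the rounding scheme are simple placeholders rather than the paper's configuration-LP dual and oblivious rounding, and the Euclidean projection onto doubly stochastic matrices is approximated here by Sinkhorn normalization.

```python
# Toy sketch of the OGD-over-doubly-stochastic-matrices pipeline.
# A[i, j] is read as the fractional amount of item i placed at position j.
import numpy as np

def sinkhorn_project(A, iters=50):
    """Approximately map a nonnegative matrix to a doubly stochastic one by
    alternating row/column normalization (not an exact Euclidean projection)."""
    A = np.clip(A, 1e-12, None)
    for _ in range(iters):
        A = A / A.sum(axis=1, keepdims=True)   # rows sum to 1
        A = A / A.sum(axis=0, keepdims=True)   # columns sum to 1
    return A

def surrogate_subgradient(A, R, k):
    """Placeholder for the paper's subgradient oracle: gradient of a linear
    surrogate that charges requested items for sitting at late positions."""
    n = A.shape[0]
    g = np.zeros_like(A)
    position_cost = np.arange(1, n + 1, dtype=float)   # later positions cost more
    for i in R:
        g[i, :] = position_cost / k
    return g

def round_to_ranking(A, rng):
    """Generic randomized rounding: fill positions top-down, sampling each item
    with probability proportional to its remaining mass in that column."""
    remaining = list(range(A.shape[0]))
    ranking = []
    for pos in range(A.shape[0]):
        w = A[remaining, pos]
        idx = rng.choice(len(remaining), p=w / w.sum())
        ranking.append(remaining.pop(idx))
    return ranking

def online_gmssc(requests, n, eta=0.1, seed=0):
    rng = np.random.default_rng(seed)
    A = np.full((n, n), 1.0 / n)               # uniform doubly stochastic start
    rankings = []
    for R, k in requests:                       # (R_t, k_t) revealed online
        rankings.append(round_to_ranking(A, rng))                        # commit to a ranking
        A = sinkhorn_project(A - eta * surrogate_subgradient(A, R, k))   # OGD step
    return rankings

print(online_gmssc([({0, 3}, 1), ({1, 2, 4}, 2)], n=5))
```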
Related papers
- Adaptively Learning to Select-Rank in Online Platforms [34.258659206323664]
This research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users.
We develop a user response model that considers diverse user preferences and the varying effects of item positions.
Experiments conducted on both simulated and real-world datasets demonstrate that our algorithm outperforms the baseline.
arXiv Detail & Related papers (2024-06-07T15:33:48Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems [61.85150061213987]
We study the generalized low-rank matrix bandit problem, proposed in \cite{lu2021low} under the Generalized Linear Model (GLM) framework.
To overcome the computational infeasibility and theoretical restrictions of existing algorithms, we first propose the G-ESTT framework.
We show that G-ESTT can achieve a regret bound of $\tilde{O}(\sqrt{(d_1+d_2)^{3/2} M r^{3/2} T})$, while G-ESTS can achieve the $\tilde{O}(\ldots)$
arXiv Detail & Related papers (2024-01-14T14:14:19Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Sketchy: Memory-efficient Adaptive Regularization with Frequent
Directions [22.09320263962004]
We find the spectra of the Kronecker-factored gradient covariance matrix in deep learning (DL) training tasks are concentrated on a small leading eigenspace.
We describe a generic method for reducing memory and compute requirements of maintaining a matrix preconditioner.
We show extensions of our work to Shampoo, resulting in a method competitive in quality with Shampoo and Adam, yet requiring only sub-linear memory for tracking second moments.
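For context, the Frequent Directions primitive referenced in the title can be stated in a few lines: maintain a small row sketch $B$ whose Gram matrix $B^\top B$ tracks the running gradient covariance. The snippet below is a textbook variant with a doubled buffer (an assumption made for illustration), not the exact routine used in Sketchy or Shampoo.

```python
# Minimal Frequent Directions sketch of a stream of gradient vectors.
# Assumes sketch_size is much smaller than dim (here 2*sketch_size <= dim).
import numpy as np

class FrequentDirections:
    def __init__(self, dim, sketch_size):
        self.l = sketch_size
        self.B = np.zeros((2 * sketch_size, dim))  # doubled buffer of sketch rows
        self.next_row = 0

    def update(self, g):
        if self.next_row == 2 * self.l:            # buffer full: shrink the sketch
            _, s, Vt = np.linalg.svd(self.B, full_matrices=False)
            shrunk = np.sqrt(np.maximum(s ** 2 - s[self.l] ** 2, 0.0))
            self.B = shrunk[:, None] * Vt          # keep top directions, damp the rest
            self.next_row = self.l                 # rows l..2l-1 are now free again
        self.B[self.next_row] = g
        self.next_row += 1

    def covariance(self):
        return self.B.T @ self.B                   # low-rank proxy for sum_t g_t g_t^T

fd = FrequentDirections(dim=8, sketch_size=3)
rng = np.random.default_rng(0)
for _ in range(20):
    fd.update(rng.standard_normal(8))
print(np.linalg.matrix_rank(fd.covariance()))
```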
arXiv Detail & Related papers (2023-02-07T21:50:06Z) - Large Scale Private Learning via Low-rank Reparametrization [77.38947817228656]
We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks.
We are the first to apply differential privacy to the BERT model and achieve an average accuracy of $83.9\%$ on four downstream tasks.
arXiv Detail & Related papers (2021-06-17T10:14:43Z) - Episodic Linear Quadratic Regulators with Low-rank Transitions [31.8243883890202]
We propose an algorithm that utilizes the intrinsic system low-rank structure for efficient learning.
Our algorithm achieves a $K$-episode regret bound of order $\widetilde{O}(m^{3/2} K^{1/2})$.
arXiv Detail & Related papers (2020-11-03T08:48:31Z) - Learning to Rank under Multinomial Logit Choice [6.929312022493406]
Learning the optimal ordering of content is an important challenge in website design.
We present theoretical analysis leading to an $\Omega(\sqrt{JT})$ lower bound for the problem, and an $\tilde{O}(\sqrt{JT})$ upper bound on the regret of the UCB algorithm.
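For readers unfamiliar with the choice model behind these bounds, the snippet below shows a generic multinomial logit (MNL) click model with a no-click outside option; the attractiveness vector and the absence of explicit position effects here are illustrative assumptions, not the paper's exact formulation.

```python
# Generic MNL click model: a user shown the list `shown` clicks item i with
# probability proportional to exp(theta[i]); the outside option has weight 1.
import numpy as np

def mnl_click_probs(theta, shown):
    w = np.exp(theta[shown])
    denom = 1.0 + w.sum()            # 1.0 is the weight of clicking nothing
    return w / denom, 1.0 / denom    # per-item click probabilities, no-click probability

theta = np.array([0.5, -0.2, 1.3, 0.0])   # hypothetical item attractiveness scores
probs, p_none = mnl_click_probs(theta, shown=[2, 0, 3])
print(probs, p_none)
```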
arXiv Detail & Related papers (2020-09-07T16:15:12Z) - Provably Efficient Reinforcement Learning for Discounted MDPs with
Feature Mapping [99.59319332864129]
In this paper, we study reinforcement learning for discounted Markov Decision Processes (MDPs).
We propose a novel algorithm that makes use of the feature mapping and obtains a $\tilde{O}(d\sqrt{T}/(1-\gamma)^2)$ regret.
Our upper and lower bound results together suggest that the proposed reinforcement learning algorithm is near-optimal up to a $(1-\gamma)^{-0.5}$ factor.
arXiv Detail & Related papers (2020-06-23T17:08:54Z) - Rank $2r$ iterative least squares: efficient recovery of ill-conditioned
low rank matrices from few entries [4.230158563771147]
We present a new, simple and computationally efficient iterative method for low rank matrix completion.
Our algorithm, denoted R2RILS for rank $2r$ iterative least squares, has low memory requirements.
arXiv Detail & Related papers (2020-02-05T16:20:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.