Online Learning with Primary and Secondary Losses
- URL: http://arxiv.org/abs/2010.14670v1
- Date: Tue, 27 Oct 2020 23:50:27 GMT
- Title: Online Learning with Primary and Secondary Losses
- Authors: Avrim Blum, Han Shao
- Abstract summary: We study the problem of online learning with primary and secondary losses.
We consider the question: Can we combine "expert advice" to achieve low regret with respect to the primary loss?
- Score: 29.540528603828722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of online learning with primary and secondary losses.
For example, a recruiter making decisions of which job applicants to hire might
weigh false positives and false negatives equally (the primary loss) but the
applicants might weigh false negatives much higher (the secondary loss). We
consider the following question: Can we combine "expert advice" to achieve low
regret with respect to the primary loss, while at the same time performing
"not much worse than the worst expert" with respect to the secondary loss?
Unfortunately, we show that this goal is unachievable without any bounded
variance assumption on the secondary loss. More generally, we consider the goal
of minimizing the regret with respect to the primary loss and bounding the
secondary loss by a linear threshold. On the positive side, we show that
running any switching-limited algorithm can achieve this goal if all experts
satisfy the assumption that the secondary loss does not exceed the linear
threshold by $o(T)$ for any time interval. If not all experts satisfy this
assumption, our algorithms can achieve this goal given access to some external
oracles which determine when to deactivate and reactivate experts.
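To make the setting concrete, here is a minimal Python sketch (our construction, not the authors' algorithm): multiplicative weights driven by the primary loss, with a crude built-in stand-in for the deactivation oracle that drops any expert whose cumulative secondary loss exceeds the linear threshold $\alpha t$. The paper's actual guarantees are obtained with switching-limited algorithms and, in general, external deactivation/reactivation oracles.
```python
import numpy as np

def primary_secondary_hedge(primary, secondary, alpha, eta=0.5):
    """Toy sketch: multiplicative weights on the primary loss, with a crude
    built-in 'oracle' that deactivates any expert whose cumulative secondary
    loss exceeds the linear threshold alpha * t.  The paper's algorithms are
    switching-limited and use external oracles; this only illustrates the
    setting.  primary, secondary: arrays of shape (T, n), losses in [0, 1]."""
    T, n = primary.shape
    log_w = np.zeros(n)              # log-weights over the n experts
    cum_sec = np.zeros(n)            # cumulative secondary loss per expert
    tot_primary = tot_secondary = 0.0
    for t in range(T):
        active = cum_sec <= alpha * (t + 1)   # hypothetical threshold oracle
        if not active.any():                  # degenerate case: keep everyone
            active = np.ones(n, dtype=bool)
        w = np.where(active, np.exp(log_w - log_w.max()), 0.0)
        p = w / w.sum()
        tot_primary += p @ primary[t]         # learner's expected losses
        tot_secondary += p @ secondary[t]
        log_w -= eta * primary[t]             # exponential update (primary)
        cum_sec += secondary[t]
    return tot_primary, tot_secondary

# Usage: random losses, 1000 rounds, 5 experts, threshold alpha = 0.6.
rng = np.random.default_rng(0)
print(primary_secondary_hedge(rng.random((1000, 5)), rng.random((1000, 5)), 0.6))
```
The threshold test is the hypothetical part: as the abstract notes, without a bounded variance assumption on the secondary loss no algorithm can meet both goals, which is why the general case delegates deactivation to an external oracle.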
Related papers
- When Lower-Order Terms Dominate: Adaptive Expert Algorithms for Heavy-Tailed Losses [12.39860047886679]
We develop adaptive algorithms that do not require prior knowledge about the range or the second moment of the losses.
Existing adaptive algorithms have what is typically considered a lower-order term in their regret guarantees.
arXiv Detail & Related papers (2025-06-02T14:29:05Z)
- Online Algorithm for Aggregating Experts' Predictions with Unbounded Quadratic Loss [72.32459441619388]
We propose an algorithm for aggregating expert predictions which does not require prior knowledge of an upper bound on the losses.
The algorithm is based on the exponential reweighing of expert losses.
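For reference, a plain exponential-weights aggregator under quadratic loss looks as follows; the fixed learning rate `eta` is our simplification, whereas the cited algorithm adapts the scheme so that no prior bound on the losses is needed.
```python
import numpy as np

def aggregate_predictions(expert_preds, outcomes, eta=0.1):
    """Baseline exponential reweighing of expert losses under quadratic
    loss.  A fixed learning rate (as here) implicitly assumes bounded
    losses; the cited paper modifies the scheme to remove that need."""
    T, n = expert_preds.shape
    log_w = np.zeros(n)
    preds = np.empty(T)
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()
        preds[t] = p @ expert_preds[t]         # aggregated prediction
        log_w -= eta * (expert_preds[t] - outcomes[t]) ** 2
    return preds
```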
arXiv Detail & Related papers (2025-01-11T10:52:59Z)
- LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization [56.67706781191521]
We present a robust online optimization framework where an adversary can introduce outliers by corrupting loss functions in an arbitrary number of rounds $k$, unknown to the learner.
arXiv Detail & Related papers (2024-08-12T17:08:31Z)
- Proper losses regret at least 1/2-order [1.7986307635828565]
We show that the strict properness of a loss is necessary and sufficient to establish a non-vacuous surrogate regret bound.
We also show that the order of convergence in p-norm cannot be faster than the $1/2$-order of surrogate regrets for a broad class of strictly proper losses.
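Schematically, in our notation rather than the paper's, a 1/2-order surrogate regret bound has the following shape; the result above says this square-root exponent cannot be improved for a broad class of strictly proper losses.
```latex
% Schematic 1/2-order surrogate regret bound (our notation, not the paper's):
% excess target risk is controlled by the square root of the excess
% surrogate risk, and in general the exponent 1/2 cannot be improved.
\[
  \mathrm{Reg}_{\mathrm{target}}(f)
    \;\le\; c \,\bigl(\mathrm{Reg}_{\mathrm{surrogate}}(f)\bigr)^{1/2}
\]
```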
arXiv Detail & Related papers (2024-07-15T03:46:15Z)
- Learning to Cover: Online Learning and Optimization with Irreversible Decisions [50.5775508521174]
We find that regret grows sub-linearly at a rate $\Theta\left(m^{\frac{1}{2}}\cdot\frac{1}{1-2^{-T}}\right)$, thus converging exponentially fast to $\Theta(\sqrt{m})$.
These findings underscore the benefits of limited online learning and optimization, in that even a few rounds can provide significant benefits as compared to a no-learning baseline.
arXiv Detail & Related papers (2024-06-20T23:00:25Z)
- Multi-Agent Imitation Learning: Value is Easy, Regret is Hard [52.31989962031179]
We study a multi-agent imitation learning (MAIL) problem where we take the perspective of a learner attempting to coordinate a group of agents.
Most prior work in MAIL essentially reduces the problem to matching the behavior of the expert within the support of the demonstrations.
While doing so is sufficient to drive the value gap between the learner and the expert to zero under the assumption that agents are non-strategic, it does not guarantee robustness to deviations by strategic agents.
arXiv Detail & Related papers (2024-06-06T16:18:20Z)
- Avoiding Catastrophe in Online Learning by Asking for Help [7.881265948305421]
We propose an online learning problem where the goal is to minimize the chance of catastrophe.
We first show that in general, any algorithm either queries the mentor at a linear rate or is nearly guaranteed to cause catastrophe.
We provide an algorithm whose regret and rate of querying the mentor both approach 0 as the time horizon grows.
arXiv Detail & Related papers (2024-02-12T21:12:11Z)
- Balancing Adaptability and Non-exploitability in Repeated Games [29.04618208068207]
We study the problem of guaranteeing low regret in repeated games against an opponent with unknown membership in one of several classes.
We add the constraint that our algorithm is non-exploitable, in that the opponent lacks an incentive to use an algorithm against which we cannot achieve rewards exceeding some "fair" value.
Our solution is an expert algorithm (LAFF) that searches within a set of sub-algorithms that are optimal for each opponent class and uses a punishment policy upon detecting evidence of exploitation by the opponent.
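A hypothetical skeleton of this pattern, with all names and the detection rule ours rather than LAFF's, might look like:
```python
def expert_with_punishment(sub_algorithms, punishment, detect_exploitation,
                           play_round, T):
    """Hypothetical skeleton in the spirit of LAFF (not the paper's exact
    procedure): play a sub-algorithm tailored to a guessed opponent class
    and switch permanently to a punishment policy once there is evidence
    that the opponent is exploiting us."""
    current, punishing = 0, False
    my_hist, opp_hist = [], []
    for t in range(T):
        policy = punishment if punishing else sub_algorithms[current]
        my_reward, opp_reward = play_round(policy, t)  # one round of the game
        my_hist.append(my_reward)
        opp_hist.append(opp_reward)
        if not punishing and detect_exploitation(my_hist, opp_hist):
            punishing = True            # cap what the opponent can gain
    return sum(my_hist)

# Toy usage: constant policies, a trivial game, and a naive detector.
play = lambda policy, t: (policy, 1.0 - policy)
detect = lambda mine, theirs: sum(theirs) - sum(mine) > 5.0
print(expert_with_punishment([0.5], 0.9, detect, play, T=100))
```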
arXiv Detail & Related papers (2021-12-20T03:09:30Z)
- Mixing between the Cross Entropy and the Expectation Loss Terms [89.30385901335323]
Cross entropy loss tends to focus on hard-to-classify samples during training.
We show that adding the expectation loss to the optimization goal helps the network achieve better accuracy.
Our experiments show that the new training protocol improves performance across a diverse set of classification domains.
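A minimal sketch of such a mixed objective, assuming the expectation loss is the expected 0-1 loss under the softmax distribution and `alpha` is a mixing weight of our choosing (not a value from the paper):
```python
import numpy as np

def mixed_loss(logits, labels, alpha=0.5):
    """Mix cross entropy with the expectation loss (expected 0-1 loss
    under the softmax distribution); alpha is an assumed mixing weight."""
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    p_true = probs[np.arange(len(labels)), labels]
    ce = -np.log(p_true + 1e-12)          # cross entropy, per sample
    expectation = 1.0 - p_true            # expected 0-1 loss, per sample
    return (alpha * ce + (1 - alpha) * expectation).mean()

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
print(mixed_loss(logits, np.array([0, 2]), alpha=0.5))
```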
arXiv Detail & Related papers (2021-09-12T23:14:06Z)
- Being Properly Improper [36.52509571098292]
We analyse class probability-based losses when they are stripped of mandatory properness.
We show that a natural extension of a half-century-old loss introduced by S. Arimoto is twist-proper.
We then turn to boosting, a theory that has provided some of the best off-the-shelf algorithms for proper losses.
arXiv Detail & Related papers (2021-06-18T05:00:15Z)
- Online non-convex optimization with imperfect feedback [33.80530308979131]
We consider the problem of online learning with non-convex losses.
In terms of feedback, we assume that the learner observes - or otherwise constructs - an inexact model for the loss function at each stage.
We propose a mixed-strategy learning policy based on dual averaging.
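A minimal sketch of dual averaging over the simplex with an entropic regularizer (so the mixed strategy is a softmax of the accumulated loss estimates); the paper's inexact loss models are stood in for here by an array of estimates:
```python
import numpy as np

def dual_averaging_mixed_strategy(loss_estimates, eta=0.1):
    """Entropic dual averaging sketch: the dual variable accumulates the
    (possibly inexact) loss estimates, and each stage's mixed strategy is
    the softmax of that running sum.  loss_estimates has shape (T, n)."""
    T, n = loss_estimates.shape
    y = np.zeros(n)                        # dual variable: summed estimates
    strategies = np.empty((T, n))
    for t in range(T):
        w = np.exp(-eta * (y - y.min()))
        strategies[t] = w / w.sum()        # play the softmax mixed strategy
        y += loss_estimates[t]             # aggregate the inexact feedback
    return strategies
```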
arXiv Detail & Related papers (2020-10-16T16:53:13Z)
- Optimally Deceiving a Learning Leader in Stackelberg Games [123.14187606686006]
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs under various scenarios of learning interaction between the leader and the follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)
- Calibrated Surrogate Losses for Adversarially Robust Classification [92.37268323142307]
We show that no convex surrogate loss is calibrated with respect to the adversarial 0-1 loss when restricted to linear models.
We also show that, if the underlying distribution satisfies the Massart noise condition, convex losses can be calibrated in the adversarial setting.
arXiv Detail & Related papers (2020-05-28T02:40:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.