Tilted Empirical Risk Minimization
- URL: http://arxiv.org/abs/2007.01162v2
- Date: Wed, 17 Mar 2021 16:34:26 GMT
- Title: Tilted Empirical Risk Minimization
- Authors: Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith
- Abstract summary: We show that it is possible to flexibly tune the impact of individual losses through a straightforward extension to empirical risk minimization.
We show that TERM can increase or decrease the influence of outliers, enabling fairness or robustness, respectively.
It can also enable entirely new applications, such as simultaneously addressing outliers and promoting fairness.
- Score: 26.87656095874882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Empirical risk minimization (ERM) is typically designed to perform well on
the average loss, which can result in estimators that are sensitive to
outliers, generalize poorly, or treat subgroups unfairly. While many methods
aim to address these problems individually, in this work, we explore them
through a unified framework -- tilted empirical risk minimization (TERM). In
particular, we show that it is possible to flexibly tune the impact of
individual losses through a straightforward extension to ERM using a
hyperparameter called the tilt. We provide several interpretations of the
resulting framework: We show that TERM can increase or decrease the influence
of outliers, enabling fairness or robustness, respectively; has
variance-reduction properties that can benefit generalization; and can be
viewed as a smooth approximation to a superquantile method. We develop batch
and stochastic first-order optimization methods for solving TERM, and show that
the problem can be efficiently solved relative to common alternatives. Finally,
we demonstrate that TERM can be used for a multitude of applications, such as
enforcing fairness between subgroups, mitigating the effect of outliers, and
handling class imbalance. TERM is not only competitive with existing solutions
tailored to these individual problems, but can also enable entirely new
applications, such as simultaneously addressing outliers and promoting
fairness.
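The abstract describes TERM as a one-hyperparameter extension of ERM. A minimal numerical sketch, assuming the standard exponential-tilting objective (1/t) * log(mean(exp(t * loss_i))); the function name and example losses below are illustrative, not taken from the paper:

```python
import numpy as np

def tilted_loss(losses, t):
    """Tilted empirical risk: (1/t) * log(mean(exp(t * losses))).

    t -> 0 recovers the plain average (ERM); t > 0 upweights large
    losses (toward the max, for fairness); t < 0 downweights them
    (toward the min, for outlier robustness). A log-sum-exp shift
    keeps the computation numerically stable.
    """
    losses = np.asarray(losses, dtype=float)
    if t == 0:
        return losses.mean()
    m = (t * losses).max()
    return (m + np.log(np.mean(np.exp(t * losses - m)))) / t

losses = [0.1, 0.2, 5.0]          # one outlier loss
print(tilted_loss(losses, 0.0))   # plain ERM average
print(tilted_loss(losses, 10.0))  # near the max loss (worst-case emphasis)
print(tilted_loss(losses, -10.0)) # near the min loss (outlier suppressed)
```

As t sweeps from negative to positive, the objective interpolates continuously between min-loss, average-loss, and max-loss behavior, which is what lets a single hyperparameter trade off robustness against fairness.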
Related papers
- Enlarging Feature Support Overlap for Domain Generalization [9.227839292188346]
Invariant risk minimization (IRM) addresses distribution shift by learning invariant features and minimizing the risk across different domains.
We propose a novel method to enlarge feature support overlap for domain generalization.
Specifically, we introduce Bayesian random data augmentation to increase sample diversity and overcome the deficiency of IRM.
arXiv Detail & Related papers (2024-07-08T09:16:42Z)
- Domain Generalization without Excess Empirical Risk [83.26052467843725]
A common approach is designing a data-driven surrogate penalty to capture generalization and minimize the empirical risk jointly with the penalty.
We argue that a significant failure mode of this recipe is an excess risk due to an erroneous penalty or hardness in joint optimization.
We present an approach that eliminates this problem. Instead of jointly minimizing empirical risk with the penalty, we minimize the penalty under the constraint of optimality of the empirical risk.
arXiv Detail & Related papers (2023-08-30T08:46:46Z)
- On the Variance, Admissibility, and Stability of Empirical Risk Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
arXiv Detail & Related papers (2023-05-29T15:25:48Z)
- Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z)
- Pareto Invariant Risk Minimization [32.01775861630696]
We propose a new optimization scheme for invariant risk minimization (IRM) called PAreto Invariant Risk Minimization (PAIR)
We show PAIR can empower the practical IRM variants to overcome the barriers with the original IRM when provided with proper guidance.
arXiv Detail & Related papers (2022-06-15T19:04:02Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- On Tilted Losses in Machine Learning: Theory and Applications [26.87656095874882]
Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization.
We study a simple extension to ERM, which uses exponential tilting to flexibly tune the impact of individual losses.
We find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches.
arXiv Detail & Related papers (2021-09-13T17:33:42Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization [53.592597682854944]
We recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions.
We show that ERM is provably minimax-optimal for both tasks.
arXiv Detail & Related papers (2021-02-25T19:06:48Z)
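Several entries above (TERM, tilted losses, group DRO) share one mechanism: reweighting per-sample gradients toward harder or easier examples. For the exponential-tilting objective, differentiating shows the gradient is a convex combination of per-sample gradients with softmax weights exp(t*l_i)/sum_j exp(t*l_j). A small illustrative helper; the name `tilt_weights` is ours, not from any of the papers:

```python
import numpy as np

def tilt_weights(losses, t):
    """Per-sample weights in the gradient of the tilted risk.

    Differentiating (1/t) * log(mean(exp(t * l_i))) gives a convex
    combination of per-sample gradients weighted by the softmax of
    t * losses. t = 0 yields uniform (ERM) weights.
    """
    z = t * np.asarray(losses, dtype=float)
    z -= z.max()                 # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

print(tilt_weights([0.1, 0.2, 5.0], 0.0))   # uniform: plain ERM
print(tilt_weights([0.1, 0.2, 5.0], 5.0))   # mass on the large loss
print(tilt_weights([0.1, 0.2, 5.0], -5.0))  # mass on the small losses
```

Positive t concentrates weight on the hardest examples, a smooth stand-in for the worst-case emphasis of group DRO, while negative t damps outliers.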
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.