Smoothly Giving up: Robustness for Simple Models
- URL: http://arxiv.org/abs/2302.09114v1
- Date: Fri, 17 Feb 2023 19:48:11 GMT
- Title: Smoothly Giving up: Robustness for Simple Models
- Authors: Tyler Sypherd, Nathan Stromberg, Richard Nock, Visar Berisha, and
Lalitha Sankar
- Abstract summary: Examples of algorithms to train simple, interpretable models include logistic regression and boosting.
We use the margin-based $\alpha$-loss, which continuously tunes between canonical convex and quasi-convex losses, to robustly train such models.
We also provide results on the Long-Servedio dataset for boosting and a COVID-19 survey dataset for logistic regression, highlighting the efficacy of our approach across multiple relevant domains.
- Score: 30.56684535186692
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is a growing need for models that are interpretable and have reduced
energy and computational cost (e.g., in health care analytics and federated
learning). Examples of algorithms to train such models include logistic
regression and boosting. However, one challenge facing these algorithms is that
they provably suffer from label noise; this has been attributed to the joint
interaction between oft-used convex loss functions and simpler hypothesis
classes, resulting in too much emphasis being placed on outliers. In this work,
we use the margin-based $\alpha$-loss, which continuously tunes between
canonical convex and quasi-convex losses, to robustly train simple models. We
show that the $\alpha$ hyperparameter smoothly introduces non-convexity and
offers the benefit of "giving up" on noisy training examples. We also provide
results on the Long-Servedio dataset for boosting and a COVID-19 survey dataset
for logistic regression, highlighting the efficacy of our approach across
multiple relevant domains.
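The abstract does not state the loss itself, so the following is only a minimal sketch under an assumption: it uses the form of the margin-based $\alpha$-loss from the authors' related work on $\alpha$-loss, $\ell^{\alpha}(z) = \frac{\alpha}{\alpha-1}\left(1 - \sigma(z)^{1-1/\alpha}\right)$, where $z = y f(x)$ is the margin and $\sigma$ is the logistic sigmoid. Under that assumption, $\alpha = 1/2$ gives the exponential loss, $\alpha = 1$ the logistic loss, and $\alpha = \infty$ the bounded sigmoid loss. The function names `alpha_loss` and `alpha_loss_grad` are illustrative and do not come from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def alpha_loss(z: float, alpha: float) -> float:
    """Assumed margin-based alpha-loss at margin z = y * f(x), alpha in (0, inf]."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    s = sigmoid(z)
    if alpha == 1.0:
        return -math.log(s)   # logistic loss, the limit as alpha -> 1
    if math.isinf(alpha):
        return 1.0 - s        # sigmoid loss, bounded in [0, 1]
    return (alpha / (alpha - 1.0)) * (1.0 - s ** (1.0 - 1.0 / alpha))


def alpha_loss_grad(z: float, alpha: float) -> float:
    """Derivative of the assumed alpha-loss with respect to the margin z.

    Differentiating the formula above gives
    -sigmoid(z)**(1 - 1/alpha) * (1 - sigmoid(z)) for every alpha in (0, inf].
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    s = sigmoid(z)
    return -(s ** (1.0 - 1.0 / alpha)) * (1.0 - s)


if __name__ == "__main__":
    # Compare how strongly each loss reacts to a badly misclassified point.
    for a in (0.5, 1.0, 2.0, math.inf):
        print(a, alpha_loss(-5.0, a), alpha_loss_grad(-5.0, a))
```

For $\alpha > 1$ the gradient magnitude shrinks toward zero as the margin becomes very negative, which is one way to read "giving up" on examples the model gets badly wrong (for instance, mislabeled points). For $\alpha \le 1$ the gradient does not vanish there, so such points keep pulling on the model.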
Related papers
- RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold [41.28168368547099]
Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts.
We show that training on per-step negatives can help to unlearn spurious correlations in the positive data.
arXiv Detail & Related papers (2024-06-20T17:45:54Z) - A Unified Approach to Learning Ising Models: Beyond Independence and
Bounded Width [7.605563562103568]
We revisit the problem of efficiently learning the underlying parameters of Ising models from data.
We show that a simple existing approach based on node-wise logistic regression provably succeeds at recovering the underlying model in several new settings.
arXiv Detail & Related papers (2023-11-15T18:41:19Z) - FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
$\mathbf{FABind}$ is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data [4.971690889257356]
We introduce an adaptation of the alternating minimization-descent scheme proposed by Collins et al. and by Nayer and Vaswani.
We show that vanilla alternating-minimization descent fails catastrophically even for iid, but mildly non-isotropic data.
Our analysis unifies and generalizes prior work, and provides a flexible framework for a wider range of applications.
arXiv Detail & Related papers (2023-08-08T17:56:20Z) - Phantom Embeddings: Using Embedding Space for Model Regularization in
Deep Neural Networks [12.293294756969477]
The strength of machine learning models stems from their ability to learn complex function approximations from data.
Complex models tend to memorize the training data, which results in poor generalization performance on test data.
We present a novel approach to regularize the models by leveraging the information-rich latent embeddings and their high intra-class correlation.
arXiv Detail & Related papers (2023-04-14T17:15:54Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Surprises in adversarially-trained linear regression [12.33259114006129]
Adversarial training is one of the most effective approaches to defend against adversarial examples.
We show that for linear regression problems, adversarial training can be formulated as a convex problem.
We show that for sufficiently many features or sufficiently small regularization parameters, the learned model perfectly interpolates the training data.
arXiv Detail & Related papers (2022-05-25T11:54:42Z) - ReLU Regression with Massart Noise [52.10842036932169]
We study the fundamental problem of ReLU regression, where the goal is to fit Rectified Linear Units (ReLUs) to data.
We focus on ReLU regression in the Massart noise model, a natural and well-studied semi-random noise model.
We develop an efficient algorithm that achieves exact parameter recovery in this model.
arXiv Detail & Related papers (2021-09-10T02:13:22Z) - Variational Bayesian Unlearning [54.26984662139516]
We study the problem of approximately unlearning a Bayesian model from a small subset of the training data to be erased.
We show that it is equivalent to minimizing an evidence upper bound which trades off between fully unlearning from erased data vs. not entirely forgetting the posterior belief.
In model training with VI, only an approximate (instead of exact) posterior belief given the full data can be obtained, which makes unlearning even more challenging.
arXiv Detail & Related papers (2020-10-24T11:53:00Z) - Least Squares Regression with Markovian Data: Fundamental Limits and
Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain.
We establish sharp information theoretic minimax lower bounds for this problem in terms of $\tau_{\mathsf{mix}}$.
We propose an algorithm based on experience replay--a popular reinforcement learning technique--that achieves a significantly better error rate.
arXiv Detail & Related papers (2020-06-16T04:26:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.