Nonlinear Monte Carlo Method for Imbalanced Data Learning
- URL: http://arxiv.org/abs/2010.14060v2
- Date: Thu, 27 May 2021 11:44:34 GMT
- Title: Nonlinear Monte Carlo Method for Imbalanced Data Learning
- Authors: Xuli Shen, Qing Xu, Xiangyang Xue
- Abstract summary: In machine learning problems, expected error is used to evaluate model performance.
Inspired by the framework of nonlinear expectation theory, we substitute the mean value of loss function with the maximum value of subgroup mean loss.
We achieve better performance than SOTA backbone models with fewer training steps, and greater robustness on basic regression and imbalanced classification tasks.
- Score: 43.17123077368725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For basic machine learning problems, expected error is used to evaluate model
performance. Since the distribution of the data is usually unknown, we make the
simple hypothesis that the data are sampled independently and identically
distributed (i.i.d.), and the mean value of the loss function is used as the
empirical risk by the Law of Large Numbers (LLN). This is known as the Monte Carlo
method. However, when LLN is not applicable, such as imbalanced data problems,
empirical risk will cause overfitting and might decrease robustness and
generalization ability. Inspired by the framework of nonlinear expectation
theory, we substitute the mean value of loss function with the maximum value of
subgroup mean loss. We call this the nonlinear Monte Carlo method. To apply
numerical optimization methods, we linearize and smooth the functional of the
maximum empirical risk and obtain the descent direction via quadratic programming.
With the proposed method, we achieve better performance than SOTA backbone
models with fewer training steps, and greater robustness on basic regression and
imbalanced classification tasks.
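The core substitution — replacing the overall mean loss with the maximum of subgroup mean losses — can be sketched in a few lines. The toy data, subgroup split, squared-error loss, and the plain subgradient step on the worst-off subgroup below are all illustrative assumptions; the paper instead smooths the max functional and obtains the descent direction via quadratic programming.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced regression data: two subgroups of very different sizes
# (roughly 90% / 10%), sharing the same underlying linear model.
X = rng.normal(size=(1000, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=1000)
groups = np.where(rng.random(1000) < 0.9, 0, 1)

def subgroup_mean_losses(w, X, y, groups):
    """Mean squared error computed separately within each subgroup."""
    losses = (X @ w - y) ** 2
    return np.array([losses[groups == g].mean() for g in np.unique(groups)])

def nonlinear_mc_risk(w, X, y, groups):
    """Nonlinear Monte Carlo risk: the maximum of the subgroup mean losses."""
    return subgroup_mean_losses(w, X, y, groups).max()

# Crude subgradient descent on the max subgroup loss: at each step, descend
# the gradient of the currently worst-off subgroup. This stands in for the
# paper's smoothed quadratic-programming descent direction.
w = np.zeros(3)
lr = 0.1
for _ in range(200):
    g_worst = np.argmax(subgroup_mean_losses(w, X, y, groups))
    mask = groups == g_worst
    grad = 2 * X[mask].T @ (X[mask] @ w - y[mask]) / mask.sum()
    w -= lr * grad

print(nonlinear_mc_risk(w, X, y, groups))
```

Because both subgroups share the same optimum here, the worst-subgroup subgradient step drives the maximum subgroup loss down to roughly the noise level; the point of the max-based risk is that the minority subgroup's error cannot be traded away for a better average.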
Related papers
- Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning [28.059563581973432]
Large Language Models (LLMs) are often exposed to sensitive, private, or copyrighted data during pre-training.
LLM unlearning aims to eliminate the influence of undesirable data on the pre-trained model.
We propose Negative Preference Optimization (NPO), a simple alignment-inspired method that efficiently unlearns a target dataset.
arXiv Detail & Related papers (2024-04-08T21:05:42Z) - On the Performance of Empirical Risk Minimization with Smoothed Data [59.3428024282545]
We show that Empirical Risk Minimization (ERM) is able to achieve sublinear error whenever a class is learnable with iid data.
arXiv Detail & Related papers (2024-02-22T21:55:41Z) - On Learning Mixture of Linear Regressions in the Non-Realizable Setting [44.307245411703704]
We show that mixtures of linear regressions (MLR) can be used for prediction where, instead of predicting a single label, the model predicts a list of values.
In this paper we show that a version of the popular alternating minimization (AM) algorithm finds the best-fit lines in a dataset even when a realizable model is not assumed.
arXiv Detail & Related papers (2022-05-26T05:34:57Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to infer from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Meta-learning with negative learning rates [3.42658286826597]
Deep learning models require a large amount of data to perform well.
When data is scarce for a target task, we can transfer the knowledge gained by training on similar tasks to quickly learn the target.
A successful approach is meta-learning, or learning to learn a distribution of tasks, where learning across tasks is performed by an outer loop and learning each task by an inner loop of gradient descent.
arXiv Detail & Related papers (2021-02-01T16:14:14Z) - Is Pessimism Provably Efficient for Offline RL? [104.00628430454479]
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on a dataset collected a priori.
We propose a pessimistic variant of the value iteration algorithm (PEVI), which incorporates an uncertainty quantifier as the penalty function.
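The pessimism idea — subtracting an uncertainty penalty from the value-iteration update so rarely visited state-action pairs are not over-valued — can be illustrated on a toy tabular MDP. The counts, rewards, transition estimates, and the 1/sqrt(N(s,a))-style bonus below are made-up assumptions for illustration, not the paper's construction.

```python
import numpy as np

# Toy 2-state, 2-action MDP estimated from an offline dataset.
n_states, n_actions, gamma = 2, 2, 0.9
counts = np.array([[50, 2], [10, 40]])          # visit counts N(s, a)
r_hat = np.array([[1.0, 0.2], [0.0, 0.8]])      # estimated rewards
P_hat = np.array([[[0.9, 0.1], [0.5, 0.5]],     # estimated P(s' | s, a)
                  [[0.2, 0.8], [0.1, 0.9]]])

beta = 1.0
penalty = beta / np.sqrt(counts)                # uncertainty quantifier

V = np.zeros(n_states)
for _ in range(500):
    # Pessimistic Bellman update: penalize uncertain (rarely seen) pairs.
    Q = r_hat + gamma * P_hat @ V - penalty
    Q = np.clip(Q, 0.0, None)                   # truncate values below zero
    V = Q.max(axis=1)

print(Q.argmax(axis=1))  # greedy policy w.r.t. the pessimistic Q
```

With these numbers, the penalty steers the policy away from the poorly covered action in each state even when its estimated reward looks competitive.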
arXiv Detail & Related papers (2020-12-30T09:06:57Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - Semi-Supervised Empirical Risk Minimization: Using unlabeled data to
improve prediction [4.860671253873579]
We present a general methodology for using unlabeled data to design semi-supervised learning (SSL) variants of the Empirical Risk Minimization (ERM) learning process.
We analyze the effectiveness of our SSL approach in improving prediction performance.
arXiv Detail & Related papers (2020-09-01T17:55:51Z) - A Loss-Function for Causal Machine-Learning [0.0]
Causal machine-learning is about predicting the net-effect (true-lift) of treatments.
There is no similarly well-defined loss function due to the lack of point-wise true values in the data.
We propose a novel method to define a loss function in this context, which reduces to the mean-square-error (MSE) in a standard regression problem.
arXiv Detail & Related papers (2020-01-02T21:22:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.