A Loss-Function for Causal Machine-Learning
- URL: http://arxiv.org/abs/2001.00629v1
- Date: Thu, 2 Jan 2020 21:22:18 GMT
- Title: A Loss-Function for Causal Machine-Learning
- Authors: I-Sheng Yang
- Abstract summary: Causal machine-learning is about predicting the net-effect (true-lift) of treatments.
Unlike a standard supervised-learning problem, there is no well-defined loss function, because the data contain no point-wise true values of the lift.
We propose a novel method to define a loss function in this context which is equal to the mean-square-error (MSE) of a standard regression problem.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal machine-learning is about predicting the net-effect (true-lift) of
treatments. Given the data of a treatment group and a control group, it is
similar to a standard supervised-learning problem. Unfortunately, there is no
similarly well-defined loss function due to the lack of point-wise true values
in the data. Many advances in modern machine-learning are not directly
applicable due to the absence of such a loss function.
We propose a novel method to define a loss function in this context, which is
equal to mean-square-error (MSE) in a standard regression problem. Our loss
function is universally applicable, thus providing a general standard to
evaluate the quality of any model/strategy that predicts the true-lift. We
demonstrate that despite its novel definition, one can still perform gradient
descent directly on this loss function to find the best fit. This leads to a
new way to train any parameter-based model, such as deep neural networks, to
solve causal machine-learning problems without going through the meta-learner
strategy.
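The paper's own construction is not reproduced in this abstract. The sketch below instead uses the closely related transformed-outcome loss, which has the key property the abstract describes: its expectation differs from the MSE against the (unobserved) true lift only by a model-independent constant, so one can run gradient descent on it directly to fit a lift model without a meta-learner. The synthetic data, the known propensity e, and the linear model are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic data: features X, a randomized binary treatment W
# (known propensity e = 0.5), and an outcome Y whose true lift is tau(x) = x[0].
n, d = 2000, 3
X = rng.normal(size=(n, d))
W = rng.integers(0, 2, size=n)                   # treatment indicator
tau_true = X[:, 0]                               # true individual treatment effect
Y = X @ np.array([0.5, -0.3, 0.2]) + W * tau_true + rng.normal(scale=0.1, size=n)

e = 0.5                                          # known propensity (randomized trial)
# Transformed outcome: E[Y_star | x] = tau(x), so the MSE between a lift model
# and Y_star equals the MSE against the unobserved true lift up to a constant
# that does not depend on the model.
Y_star = Y * (W - e) / (e * (1.0 - e))

# Linear lift model tau_hat(x) = X @ theta, fit by plain gradient descent on
# the transformed-outcome MSE -- no two-model or meta-learner construction.
theta = np.zeros(d)
lr = 0.05
for _ in range(500):
    grad = 2.0 * X.T @ (X @ theta - Y_star) / n  # gradient of the MSE
    theta -= lr * grad

print("learned theta:", np.round(theta, 3))      # should be close to (1, 0, 0)
print("MSE against true lift:", np.mean((X @ theta - tau_true) ** 2))
```

Swapping the linear model for any parameter-based model (for example, a small neural network) leaves the training loop unchanged; only the gradient computation differs, which is the point the abstract makes about direct gradient descent.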
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that existing machine unlearning techniques do not hold up in such challenging settings.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions accessible via our training procedure, including the optimizer and regularizers, limiting flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z) - Loss-Free Machine Unlearning [51.34904967046097]
We present a machine unlearning approach that is both retraining- and label-free.
Retraining-free approaches often utilise Fisher information, which is derived from the loss and requires labelled data which may not be available.
We present an extension to the Selective Synaptic Dampening algorithm, replacing the diagonal of the Fisher information matrix with the gradient of the l2 norm of the model output to approximate sensitivity.
arXiv Detail & Related papers (2024-02-29T16:15:34Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, storing and replaying old data is often infeasible in practice due to memory constraints or data-privacy issues.
As an alternative, data-free data replay methods synthesize samples by inverting the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Theoretical Characterization of the Generalization Performance of
Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z) - Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks [6.452225158891343]
This paper shows that training speed and final accuracy of neural networks can significantly depend on the loss function used to train neural networks.
Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed.
arXiv Detail & Related papers (2023-03-17T12:52:06Z) - Online Loss Function Learning [13.744076477599707]
Loss function learning aims to automate the task of designing a loss function for a machine learning model.
We propose a new loss function learning technique for adaptively updating the loss function online after each update to the base model parameters.
arXiv Detail & Related papers (2023-01-30T19:22:46Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose a first method for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Nonlinear Monte Carlo Method for Imbalanced Data Learning [43.17123077368725]
In machine learning problems, expected error is used to evaluate model performance.
Inspired by the framework of nonlinear expectation theory, we substitute the mean value of the loss function with the maximum of the subgroup mean losses (a minimal sketch of this objective appears after this list).
We achieve better performance than SOTA backbone models with fewer training steps, and more robustness on basic regression and imbalanced classification tasks.
arXiv Detail & Related papers (2020-10-27T05:25:09Z) - Online non-convex optimization with imperfect feedback [33.80530308979131]
We consider the problem of online learning with non-convex losses.
In terms of feedback, we assume that the learner observes - or otherwise constructs - an inexact model for the loss function at each stage.
We propose a mixed-strategy learning policy based on dual averaging.
arXiv Detail & Related papers (2020-10-16T16:53:13Z) - Classification vs regression in overparameterized regimes: Does the loss
function matter? [21.75115239010008]
We show that solutions obtained by minimum-norm least-squares, typically used for regression, are identical to those produced by the hard-margin support vector machine (SVM).
Our results demonstrate the very different roles and properties of loss functions used at the training phase (optimization) and the testing phase (generalization).
arXiv Detail & Related papers (2020-05-16T17:58:25Z)
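The max-of-subgroup-means objective mentioned in the Nonlinear Monte Carlo entry above can be stated in a few lines. This is a minimal sketch under illustrative assumptions (squared-error loss, a synthetic minority subgroup, a fixed global fit); it only contrasts the objective with the ordinary mean loss and does not reproduce that paper's training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def subgroup_max_loss(y_pred, y_true, groups):
    """Replace the overall mean loss with the maximum of the per-subgroup
    mean losses, so the objective is driven by the worst subgroup."""
    losses = (y_pred - y_true) ** 2                 # point-wise squared error
    return max(losses[groups == g].mean() for g in np.unique(groups))

# Toy imbalanced data: subgroup 1 is a small minority with a different slope.
n = 1000
groups = (rng.random(n) < 0.1).astype(int)
x = rng.normal(size=n)
y = np.where(groups == 1, 3.0 * x, x) + rng.normal(scale=0.1, size=n)

y_pred = 1.2 * x                                    # one global linear fit
print("ordinary mean loss :", np.mean((y_pred - y) ** 2))
print("subgroup-max loss  :", subgroup_max_loss(y_pred, y, groups))
```

Because the minority subgroup is poorly served by the global fit, the subgroup-max loss is much larger than the mean loss, which is exactly the imbalance signal the objective is designed to expose.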