Uplift Modeling Under Limited Supervision
- URL: http://arxiv.org/abs/2403.19289v4
- Date: Mon, 2 Sep 2024 20:21:29 GMT
- Title: Uplift Modeling Under Limited Supervision
- Authors: George Panagopoulos, Daniele Malitesta, Fragkiskos D. Malliaros, Jun Pang
- Abstract summary: Estimating causal effects in e-commerce tends to involve costly treatment assignments, which can be impractical in large-scale settings.
We propose a graph neural network to diminish the required training set size, relying on graphs that are common in e-commerce data.
- Score: 11.548203301440179
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating causal effects in e-commerce tends to involve costly treatment assignments, which can be impractical in large-scale settings. Leveraging machine learning to predict such treatment effects without actual intervention is a standard practice to diminish the risk. However, existing methods for treatment effect prediction tend to rely on training sets of substantial size, which are built from real experiments and are thus inherently risky to create. In this work, we propose a graph neural network to diminish the required training set size, relying on graphs that are common in e-commerce data. Specifically, we view the problem as node regression with a restricted number of labeled instances, develop a two-model neural architecture akin to previous causal effect estimators, and test varying message-passing layers for encoding. Furthermore, as an extra step, we combine the model with an acquisition function to guide the creation of the training set in settings with an extremely low experimental budget. The framework is flexible, since each step can be used separately with other models or treatment policies. The experiments on real large-scale networks indicate a clear advantage of our methodology over the state of the art, which in many cases performs close to random, underlining the need for models that can generalize with limited supervision to reduce experimental risks.
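The abstract outlines the core recipe: treat uplift estimation as node regression on a graph, use a GNN encoder with two outcome heads (one per treatment arm), train on a small set of labeled nodes, and use an acquisition function to choose which nodes to experiment on next. Below is a minimal, hedged sketch of that general recipe; it is not the authors' released code, and the class names, the MC-dropout acquisition criterion, and all hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only (not the paper's implementation): a two-head GNN
# uplift estimator trained as node regression on a few labeled nodes, plus a
# simple uncertainty-based acquisition step for choosing the next experiment.
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph convolution: relu(A_hat @ X @ W), with A_hat the normalized adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, a_hat):
        return torch.relu(a_hat @ self.lin(x))


class TwoModelUpliftGNN(nn.Module):
    """Shared encoder with separate heads for treated/control outcomes,
    in the spirit of a two-model (T-learner) uplift estimator."""
    def __init__(self, in_dim, hid_dim=64, p_drop=0.5):
        super().__init__()
        self.enc1 = GCNLayer(in_dim, hid_dim)
        self.enc2 = GCNLayer(hid_dim, hid_dim)
        self.drop = nn.Dropout(p_drop)
        self.head_t = nn.Linear(hid_dim, 1)   # outcome under treatment
        self.head_c = nn.Linear(hid_dim, 1)   # outcome under control

    def forward(self, x, a_hat):
        h = self.enc2(self.drop(self.enc1(x, a_hat)), a_hat)
        return self.head_t(h).squeeze(-1), self.head_c(h).squeeze(-1)


def normalize_adj(adj):
    """Symmetric normalization with self-loops (dense; for illustration only)."""
    a = adj + torch.eye(adj.size(0))
    d = a.sum(1).pow(-0.5)
    return d[:, None] * a * d[None, :]


def train_step(model, opt, x, a_hat, y, treated, labeled):
    """Node regression on labeled nodes only; each head is supervised by the
    outcomes observed under its own treatment arm."""
    model.train()
    opt.zero_grad()
    mu_t, mu_c = model(x, a_hat)
    loss = ((mu_t - y)[labeled & treated] ** 2).mean() \
         + ((mu_c - y)[labeled & ~treated] ** 2).mean()
    loss.backward()
    opt.step()
    return loss.item()


def acquire(model, x, a_hat, unlabeled, budget=10, n_samples=20):
    """Toy acquisition: pick unlabeled nodes whose predicted uplift varies most
    across MC-dropout forward passes (a stand-in for the paper's criterion)."""
    model.train()  # keep dropout active for Monte-Carlo sampling
    with torch.no_grad():
        draws = [mu_t - mu_c for mu_t, mu_c in
                 (model(x, a_hat) for _ in range(n_samples))]
    score = torch.stack(draws).var(dim=0)
    score[~unlabeled] = float("-inf")
    return score.topk(budget).indices


if __name__ == "__main__":
    # Tiny synthetic example: random graph, random features and outcomes.
    n, d = 200, 16
    adj = (torch.rand(n, n) < 0.05).float()
    adj = ((adj + adj.t()) > 0).float()
    a_hat = normalize_adj(adj)
    x, y = torch.randn(n, d), torch.randn(n)
    treated = torch.rand(n) < 0.5
    labeled = torch.zeros(n, dtype=torch.bool)
    labeled[:20] = True                       # small initial experimental budget

    model = TwoModelUpliftGNN(d)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(100):
        train_step(model, opt, x, a_hat, y, treated, labeled)

    nxt = acquire(model, x, a_hat, ~labeled)  # nodes to experiment on next
    labeled[nxt] = True                       # ...then retrain with the new labels
    print("next experimental batch:", nxt.tolist())
```

The two heads mirror the treated/control outcome models of earlier two-model estimators; as the abstract notes, the framework is modular, so the encoder could be swapped for other message-passing layers and the acquisition criterion for other treatment policies.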
Related papers
- Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z) - An Emulator for Fine-Tuning Large Language Models using Small Language Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z) - On the Embedding Collapse when Scaling up Recommendation Models [53.66285358088788]
We identify the embedding collapse phenomenon, wherein the embedding matrix tends to occupy a low-dimensional subspace, as a key inhibitor of scalability.
We propose a simple yet effective multi-embedding design incorporating embedding-set-specific interaction modules to learn embedding sets with large diversity.
arXiv Detail & Related papers (2023-10-06T17:50:38Z) - Model Complexity of Program Phases [0.5439020425818999]
In resource limited computing systems, sequence prediction models must operate under tight constraints.
Various models cater to prediction under these conditions, each focusing in some way on reducing the cost of implementation.
In practice, these resource-constrained sequence prediction models exhibit a fundamental tradeoff between the cost of implementation and the quality of their predictions.
arXiv Detail & Related papers (2023-10-05T19:50:15Z) - Regression modelling of spatiotemporal extreme U.S. wildfires via partially-interpretable neural networks [0.0]
We propose a new methodological framework for performing extreme quantile regression using artificial neural networks.
We unify linear and additive regression methodology with deep learning to create partially-interpretable neural networks.
arXiv Detail & Related papers (2022-08-16T07:42:53Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Distributed Learning and its Application for Time-Series Prediction [0.0]
Extreme events are occurrences whose magnitude and potential impact cause extensive damage to people, infrastructure, and the environment.
Motivated by the extreme nature of the current global health landscape, which is plagued by the coronavirus pandemic, we seek to better understand and model extreme events.
arXiv Detail & Related papers (2021-06-06T18:57:30Z) - A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z) - Experimental Design for Overparameterized Learning with Application to Single Shot Deep Active Learning [5.141687309207561]
Modern machine learning models are trained on large amounts of labeled data.
Access to large volumes of labeled data is often limited or expensive.
We propose a new design strategy for curating the training set.
arXiv Detail & Related papers (2020-09-27T11:27:49Z) - Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)