Information Theoretic Counterfactual Learning from Missing-Not-At-Random
Feedback
- URL: http://arxiv.org/abs/2009.02623v2
- Date: Sat, 17 Oct 2020 13:54:54 GMT
- Title: Information Theoretic Counterfactual Learning from Missing-Not-At-Random
Feedback
- Authors: Zifeng Wang and Xi Chen and Rui Wen and Shao-Lun Huang and Ercan E.
Kuruoglu and Yefeng Zheng
- Abstract summary: We build an information-theoretic counterfactual variational information bottleneck (CVIB) for dealing with missing-not-at-random data.
By separating the task-aware mutual information term in the original information bottleneck Lagrangian into factual and counterfactual parts, we derive a contrastive information loss.
Empirical evaluation on real-world datasets shows that our CVIB significantly enhances both shallow and deep models.
- Score: 34.62042315265005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual learning for dealing with missing-not-at-random data (MNAR) is
an intriguing topic in the recommendation literature since MNAR data are
ubiquitous in modern recommender systems. Missing-at-random (MAR) data, namely
randomized controlled trials (RCTs), are usually required by most previous
counterfactual learning methods for debiasing learning. However, the execution
of RCTs is extraordinarily expensive in practice. To circumvent the use of
RCTs, we build an information-theoretic counterfactual variational information
bottleneck (CVIB), as an alternative for debiasing learning without RCTs. By
separating the task-aware mutual information term in the original information
bottleneck Lagrangian into factual and counterfactual parts, we derive a
contrastive information loss and an additional output confidence penalty, which
facilitates balanced learning between the factual and counterfactual domains.
Empirical evaluation on real-world datasets shows that our CVIB significantly
enhances both shallow and deep models, which sheds light on counterfactual
learning in recommendation that goes beyond RCTs.
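The separation described in the abstract can be written schematically. The first line is the standard information bottleneck Lagrangian for an input $X$, a learned representation $Z$, and a label $Y$. The second line is only a sketch of the factual/counterfactual split: the superscripts $+$ (factual, observed feedback) and $-$ (counterfactual, unobserved feedback), the trade-off weight $\beta$, and the name $\mathcal{L}_{\mathrm{split}}$ are notation assumed here, not taken from the paper.

```latex
\begin{align}
\mathcal{L}_{\mathrm{IB}}    &= \beta\, I(Z; X) - I(Z; Y), \\
\mathcal{L}_{\mathrm{split}} &= \beta\, I(Z; X) - I(Z^{+}; Y^{+}) - I(Z^{-}; Y^{-}).
\end{align}
```

According to the abstract, the contrastive information loss and the output confidence penalty are derived from this separation; their exact form is given in the paper and is not reproduced here.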
Related papers
- AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences.
Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation.
We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT) which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z)
- An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models [20.314426291330278]
In traditional statistical learning, data points are usually assumed to be independently and identically distributed (i.i.d.)
This paper presents a contrasting viewpoint, perceiving data points as interconnected and employing a Markov reward process (MRP) for data modeling.
We reformulate the typical supervised learning as an on-policy policy evaluation problem within reinforcement learning (RL), introducing a generalized temporal difference (TD) learning algorithm as a resolution.
arXiv Detail & Related papers (2024-04-23T21:02:58Z)
- Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate [40.5601980891318]
Generalization remains a central challenge in machine learning.
We propose Learning from Teaching (LoT), a novel regularization technique for deep neural networks to enhance generalization.
LoT operationalizes this concept to improve the generalization of the main model with auxiliary student learners.
arXiv Detail & Related papers (2024-02-05T07:05:17Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal.
Based on the kernelized Stein discrepancy (KSD), we develop a novel algorithm, STEERING: STEin information dirEcted exploration for model-based Reinforcement LearnING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z)
- Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
Cross entropy loss can easily lead us to find models which demonstrate severe overfitting behavior.
In this paper, we prove that the existing cross entropy loss minimization for training DNN classifiers essentially learns the conditional entropy of the underlying data distribution.
We propose a mutual information learning framework where we train DNN classifiers via learning the mutual information between the label and input.
arXiv Detail & Related papers (2022-10-03T15:09:19Z)
- Doubly Robust Collaborative Targeted Learning for Recommendation on Data Missing Not at Random [6.563595953273317]
In recommender systems, the feedback data received is always missing not at random (MNAR).
We propose DR-TMLE, which effectively captures the merits of both error imputation-based (EIB) and doubly robust (DR) methods.
We also propose a novel RCT-free collaborative targeted learning algorithm for DR-TMLE, called DR-TMLE-TL.
arXiv Detail & Related papers (2022-03-19T06:48:50Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
Experiments on various real-world datasets show that our method consistently improves on standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.