Interpretable Personalization via Policy Learning with Linear Decision
Boundaries
- URL: http://arxiv.org/abs/2003.07545v4
- Date: Wed, 2 Nov 2022 22:02:12 GMT
- Title: Interpretable Personalization via Policy Learning with Linear Decision
Boundaries
- Authors: Zhaonan Qu, Isabella Qian, Zhengyuan Zhou
- Abstract summary: Effective personalization of goods and services has become a core business focus for companies seeking to improve revenues and maintain a competitive edge.
This paper studies the personalization problem through the lens of policy learning.
We study a class of policies with linear decision boundaries and propose learning algorithms using tools from causal inference.
- Score: 14.817218449140338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of the digital economy and an explosion of available
information about consumers, effective personalization of goods and services
has become a core business focus for companies to improve revenues and maintain
a competitive edge. This paper studies the personalization problem through the
lens of policy learning, where the goal is to learn a decision-making rule (a
policy) that maps from consumer and product characteristics (features) to
recommendations (actions) in order to optimize outcomes (rewards). We focus on
using available historical data for offline learning with unknown data
collection procedures, where a key challenge is the non-random assignment of
recommendations. Moreover, in many business and medical applications,
interpretability of a policy is essential. We study the class of policies with
linear decision boundaries to ensure interpretability, and propose learning
algorithms using tools from causal inference to address unbalanced treatments.
We study several optimization schemes to solve the associated non-convex,
non-smooth optimization problem, and find that a Bayesian optimization
algorithm is effective. We test our algorithm with extensive simulation studies
and apply it to an anonymized online marketplace customer purchase dataset,
where the learned policy outputs a personalized discount recommendation based
on customer and product features in order to maximize gross merchandise value
(GMV) for sellers. Our learned policy improves upon the platform's baseline by
88.2% in net sales revenue, while also providing informative insights on which
features are important for the decision-making process. Our findings suggest
that our proposed policy learning framework using tools from causal inference
and Bayesian optimization provides a promising practical approach to
interpretable personalization across a wide range of applications.
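The recipe described in the abstract (an inverse-propensity-weighted value estimate of a linear-boundary policy, maximized with Bayesian optimization) can be illustrated in a few lines. The snippet below is a minimal sketch on synthetic data, not the authors' implementation: it assumes binary actions and known logging propensities, and uses scikit-optimize's gp_minimize as a stand-in Bayesian optimizer, none of which are specified by the paper.

```python
import numpy as np
from skopt import gp_minimize  # assumed stand-in Bayesian optimizer

# Synthetic logged data: features X, binary action A (e.g., discount or not),
# reward R (e.g., GMV), and logging propensities e(x) = P(A=1 | x).
rng = np.random.default_rng(0)
n, d = 5000, 4
X = rng.normal(size=(n, d))
A = rng.binomial(1, 0.3, size=n)
R = X[:, 0] * A + rng.normal(size=n)   # treating helps exactly when x_0 > 0
prop = np.full(n, 0.3)                 # in practice, estimated from data

def ipw_value(theta):
    """Inverse-propensity-weighted value of the linear-boundary policy
    pi(x) = 1{x @ theta > 0}: reweight logged rewards by how likely the
    logging process was to agree with the policy's recommendation."""
    pi = (X @ theta > 0).astype(float)
    w = np.where(A == 1, pi / prop, (1.0 - pi) / (1.0 - prop))
    return float(np.mean(w * R))

# The objective is piecewise constant in theta (non-convex, non-smooth),
# so a derivative-free Bayesian search is a natural fit.
res = gp_minimize(lambda t: -ipw_value(np.asarray(t)),
                  dimensions=[(-1.0, 1.0)] * d, n_calls=60, random_state=0)
print("learned boundary:", np.round(res.x, 2), "estimated value:", -res.fun)
```

Because the indicator policy is piecewise constant in its parameters, gradient-based methods have nothing to follow, which is consistent with the abstract's finding that Bayesian optimization is effective here.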
Related papers
- Learning Joint Models of Prediction and Optimization [56.04498536842065]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by joint predictive models.
arXiv Detail & Related papers (2024-09-07T19:52:14Z)
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on the equivalence of common baseline-correction methods in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
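As a rough illustration of the baseline idea (not the paper's variance-optimal estimator), the sketch below shows how subtracting a constant baseline from rewards and adding it back leaves an inverse-propensity-scoring estimate unbiased while changing its variance; all data are synthetic.

```python
import numpy as np

# Synthetic bandit logs: rewards r, logging propensities p0 of the taken
# actions, and target-policy probabilities p1 of those same actions.
rng = np.random.default_rng(1)
n = 10_000
r = rng.uniform(size=n)
p0 = rng.uniform(0.2, 0.8, size=n)
p1 = rng.uniform(0.2, 0.8, size=n)
w = p1 / p0  # importance weights, E[w] = 1 under the logging policy

def ips_with_baseline(b):
    """Subtracting a baseline b and adding it back keeps the IPS estimate
    unbiased (since E[w] = 1) while changing its variance."""
    return float(np.mean(w * (r - b)) + b)

c = np.cov(w * r, w)
b_star = c[0, 1] / c[1, 1]  # textbook control-variate coefficient, not the paper's
print("vanilla IPS: ", ips_with_baseline(0.0))
print("baseline IPS:", ips_with_baseline(b_star))
```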
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Non-linear Welfare-Aware Strategic Learning [10.448052192725168]
This paper studies algorithmic decision-making in the presence of strategic individual behaviors.
We first generalize the agent best response model in previous works to the non-linear setting.
We show that the three welfare objectives can attain their optima simultaneously only under restrictive conditions.
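A minimal sketch of an agent best-response model in a non-linear setting, with a made-up score function and quadratic effort cost (the paper's actual response model may differ):

```python
import numpy as np
from scipy.optimize import minimize

# Invented non-linear decision rule: a logistic score over two features.
def score(x):
    return 1.0 / (1.0 + np.exp(-(x[0] ** 2 + 0.5 * x[1] - 1.0)))

def best_response(x0, cost=1.0):
    """Agent's best response: manipulate reported features to maximize the
    score received, minus a quadratic effort cost relative to true x0."""
    obj = lambda x: -(score(x) - cost * np.sum((x - x0) ** 2))
    return minimize(obj, x0, method="Nelder-Mead").x

x0 = np.array([0.2, -0.3])
print("true:", x0, "-> reported:", np.round(best_response(x0), 3))
```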
arXiv Detail & Related papers (2024-05-03T01:50:03Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
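For concreteness, an OWA objective applies a fixed weight vector to the sorted utilities; the sort is what makes the aggregation fairness-aware and also where the non-differentiability comes from. A minimal sketch:

```python
import numpy as np

def owa(utilities, weights):
    """Ordered Weighted Averaging: weights are applied to the *sorted*
    utilities (worst first), so decreasing weights emphasize the worst-off
    objective."""
    return float(np.dot(weights, np.sort(utilities)))

# Decreasing weights favor the minimum: 0.5*1 + 0.3*2 + 0.2*3 = 1.7
print(owa([3.0, 1.0, 2.0], [0.5, 0.3, 0.2]))
```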
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- An explainable machine learning-based approach for analyzing customers' online data to identify the importance of product attributes [0.6437284704257459]
We propose a game-theoretic machine learning (ML) method that extracts comprehensive design implications for product development.
We apply our method to a real-world dataset of laptops from Kaggle, and derive design implications based on the results.
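The game-theoretic method is presumably Shapley-value-based attribution; a hedged sketch using the shap library on a tree model, with a hypothetical file path and column names (the paper's exact pipeline is not reproduced here):

```python
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data: laptop attribute columns and a customer-rating target.
df = pd.read_csv("laptops.csv")  # placeholder path
X = df[["ram_gb", "storage_gb", "screen_in", "weight_kg", "price"]]
y = df["rating"]

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Shapley values attribute each prediction to the attributes; averaging
# their magnitudes ranks attribute importance for design decisions.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
importance = pd.Series(abs(shap_values).mean(axis=0), index=X.columns)
print(importance.sort_values(ascending=False))
```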
arXiv Detail & Related papers (2024-02-03T20:50:48Z)
- Predict-Then-Optimize by Proxy: Learning Joint Models of Prediction and Optimization [59.386153202037086]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving.
This approach can be inefficient and requires handcrafted, problem-specific rules for backpropagation through the optimization step.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by predictive models.
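Both this entry and the "Learning Joint Models of Prediction and Optimization" entry above describe the same proxy idea. In the toy sketch below (problem and architecture invented for illustration), a proxy network outputs a feasible allocation directly and is trained on decision quality alone, so no solver or handcrafted backpropagation rule sits in the training loop:

```python
import torch
import torch.nn as nn

# Toy problem: allocate a unit budget over k items to maximize w @ z, where
# item values w are unknown at decision time but correlated with features.
# Softmax output keeps the allocation on the simplex (feasible by construction).
torch.manual_seed(0)
n, d, k = 2048, 10, 5
X = torch.randn(n, d)
w = X @ torch.randn(d, k) + 0.1 * torch.randn(n, k)  # realized item values

proxy = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, k))
opt = torch.optim.Adam(proxy.parameters(), lr=1e-2)

for _ in range(200):
    z = torch.softmax(proxy(X), dim=1)      # feasible allocation
    loss = -(w * z).sum(dim=1).mean()       # negative decision quality
    opt.zero_grad(); loss.backward(); opt.step()

print("average decision value:", -loss.item())
```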
arXiv Detail & Related papers (2023-11-22T01:32:06Z)
- Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning [42.303733194571905]
We seek to find and automate an optimal credit card limit adjustment policy by employing reinforcement learning techniques.
Our research establishes a conceptual structure for applying reinforcement learning framework to credit limit adjustment.
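A tabular Q-learning sketch on a made-up credit-limit MDP (states: discretized utilization; actions: keep or raise the limit); the dynamics and rewards below are purely illustrative, not from the paper:

```python
import numpy as np

# Invented MDP: reward trades extra revenue from a raised limit against a
# default-risk penalty in the highest-utilization state.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(s, a):
    s2 = int(np.clip(s + a - rng.integers(0, 2), 0, n_states - 1))
    r = 1.0 * a - 2.0 * (s2 == n_states - 1)  # revenue vs. risk penalty
    return s2, r

s = 2
for _ in range(20_000):
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # Q-learning update
    s = s2

print("greedy limit action per state:", np.argmax(Q, axis=1))
```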
arXiv Detail & Related papers (2023-06-27T16:10:36Z)
- PASTA: Pessimistic Assortment Optimization [25.51792135903357]
We consider a class of assortment optimization problems in an offline data-driven setting.
We propose an algorithm referred to as Pessimistic ASsortment opTimizAtion (PASTA) based on the principle of pessimism.
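The principle of pessimism amounts to ranking candidate assortments by a lower confidence bound rather than the empirical mean, so options that look good only because data are scarce get passed over. A synthetic example (PASTA's actual penalty is derived for its model class):

```python
import numpy as np

# Hypothetical offline logs: observed revenues per candidate assortment.
logs = {
    ("A", "B"): [4.1, 3.8, 4.4, 4.0],
    ("A", "C"): [5.2, 1.0, 6.1],             # high mean, little data
    ("B", "C"): [3.9, 4.0, 4.1, 3.8, 4.2],
}

def pessimistic_value(rs, beta=1.0):
    """Lower confidence bound: empirical mean minus an uncertainty penalty
    that shrinks with sample size -- the principle of pessimism."""
    return float(np.mean(rs) - beta / np.sqrt(len(rs)))

for a, rs in logs.items():
    print(a, "mean=%.2f" % np.mean(rs), "pessimistic=%.2f" % pessimistic_value(rs))
print("chosen:", max(logs, key=lambda a: pessimistic_value(logs[a])))
```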
arXiv Detail & Related papers (2023-02-08T01:11:51Z)
- Data-Driven Offline Decision-Making via Invariant Representation Learning [97.49309949598505]
Offline data-driven decision-making involves synthesizing optimized decisions with no active interaction.
A key challenge is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good.
In this paper, we formulate offline data-driven decision-making as domain adaptation, where the goal is to make accurate predictions for the value of optimized decisions.
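The distributional-shift failure mode described here is easy to reproduce: fit a surrogate on a narrow input range, then optimize its predictions over a wider one. A self-contained toy example with an invented reward function:

```python
import numpy as np

# True reward peaks at x = 2, but logged data only cover x in [0, 2],
# where the reward happens to look perfectly linear.
def true_reward(x):
    return np.minimum(x, 4.0 - x)

x_train = np.linspace(0.0, 2.0, 50)              # offline data support
slope, intercept = np.polyfit(x_train, true_reward(x_train), deg=1)

x_grid = np.linspace(0.0, 10.0, 1001)            # optimizer's search space
pred = slope * x_grid + intercept
x_star = x_grid[np.argmax(pred)]                 # lands at x = 10, far OOD

print("optimized input:", x_star)
print("predicted value:", pred.max())            # ~10, erroneously good
print("true value:", true_reward(x_star))        # -6, actually terrible
```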
arXiv Detail & Related papers (2022-11-21T11:01:37Z)
- Offline Policy Optimization with Eligible Actions [34.4530766779594]
Offline policy optimization could have a large impact on many real-world decision-making problems.
Importance sampling and its variants are a commonly used type of estimator in offline policy evaluation.
We propose an algorithm to avoid this overfitting through a new per-state-neighborhood normalization constraint.
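For context, the basic per-trajectory importance sampling estimator and its self-normalized variant look as follows on synthetic logs; the paper's per-state-neighborhood normalization constraint is not reproduced here:

```python
import numpy as np

# Synthetic logs: per-step probability ratios pi_target / pi_behavior for
# each logged trajectory, and the trajectory returns.
rng = np.random.default_rng(0)
n_traj, horizon = 1000, 5
ratios = rng.uniform(0.5, 1.5, size=(n_traj, horizon))
returns = rng.normal(1.0, 1.0, size=n_traj)

w = ratios.prod(axis=1)                     # trajectory importance weights
is_est = np.mean(w * returns)               # ordinary IS: unbiased, high variance
wis_est = np.sum(w * returns) / np.sum(w)   # self-normalized IS: biased, lower variance
print("IS: %.3f   weighted IS: %.3f" % (is_est, wis_est))
```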
arXiv Detail & Related papers (2022-07-01T19:18:15Z)
- Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z)