A Predict-Then-Optimize Customer Allocation Framework for Online Fund Recommendation
- URL: http://arxiv.org/abs/2503.03165v1
- Date: Wed, 05 Mar 2025 04:16:36 GMT
- Title: A Predict-Then-Optimize Customer Allocation Framework for Online Fund Recommendation
- Authors: Xing Tang, Yunpeng Weng, Fuyuan Lyu, Dugang Liu, Xiuqiang He
- Abstract summary: The central issue is to match funds with potential customers under constraints. The traditional recommendation regime has inherent drawbacks when applied to the fund-matching problem with multiple constraints. We design PTOFA, a Predict-Then-Optimize Fund Allocation framework, which aims to predict expected revenue based on customer behavior.
- Score: 11.645776345951134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid growth of online investment platforms, funds can be distributed to individual customers online. The central issue is to match funds with potential customers under constraints. Most mainstream platforms adopt the recommendation formulation to tackle the problem. However, the traditional recommendation regime has inherent drawbacks when applied to the fund-matching problem with multiple constraints. In this paper, we model fund matching under the allocation formulation. We design PTOFA, a Predict-Then-Optimize Fund Allocation framework. This data-driven framework consists of two stages, i.e., prediction and optimization, which predict expected revenue based on customer behavior and optimize the impression allocation to maximize revenue under the necessary constraints, respectively. Extensive experiments on real-world datasets from an industrial online investment platform validate the effectiveness and efficiency of our solution. Additionally, online A/B tests demonstrate PTOFA's effectiveness in the real-world fund recommendation scenario.
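The paper itself provides no code, but the two-stage idea is easy to sketch: first predict an expected-revenue score for every customer-fund pair, then solve a constrained allocation over those scores. Below is a minimal, hedged sketch that uses a random matrix as a stand-in for the prediction stage and SciPy's `linprog` for the optimization stage; the variable names, the particular constraints (per-customer slots, per-fund capacity), and the LP relaxation are illustrative assumptions, not PTOFA's actual formulation.

```python
# Minimal predict-then-optimize sketch (illustrative; not the paper's code).
# Stage 1: predict expected revenue r[i, j] for customer i and fund j.
# Stage 2: allocate impressions by solving a linear program.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_customers, n_funds = 50, 5
# Stand-in for the prediction stage: any regressor's scores work here.
revenue = rng.random((n_customers, n_funds))

slots_per_customer = 2   # each customer sees at most 2 fund impressions
capacity_per_fund = 25   # each fund is shown to at most 25 customers

# Decision variable x[i, j] in [0, 1], flattened to n_customers * n_funds.
c = -revenue.ravel()     # linprog minimizes, so negate expected revenue

# Constraint 1: sum_j x[i, j] <= slots_per_customer for every customer i.
A_cust = np.zeros((n_customers, n_customers * n_funds))
for i in range(n_customers):
    A_cust[i, i * n_funds:(i + 1) * n_funds] = 1.0

# Constraint 2: sum_i x[i, j] <= capacity_per_fund for every fund j.
A_fund = np.zeros((n_funds, n_customers * n_funds))
for j in range(n_funds):
    A_fund[j, j::n_funds] = 1.0

A_ub = np.vstack([A_cust, A_fund])
b_ub = np.concatenate([np.full(n_customers, slots_per_customer),
                       np.full(n_funds, capacity_per_fund)])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0.0, 1.0), method="highs")
allocation = res.x.reshape(n_customers, n_funds)
print("expected revenue:", -res.fun)
```

Rounding the fractional solution (or solving an integer program instead) recovers a hard assignment; in a production system like the one the paper describes, the constraints would come from the platform's business rules rather than the toy bounds used here.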
Related papers
- A Survey of Direct Preference Optimization [103.59317151002693]
Large Language Models (LLMs) have demonstrated unprecedented generative capabilities.
Their alignment with human values remains critical for ensuring helpful and harmless deployments.
Direct Preference Optimization (DPO) has recently gained prominence as a streamlined alternative to reinforcement learning from human feedback (RLHF).
arXiv Detail & Related papers (2025-03-12T08:45:15Z)
- Federated Fine-Tuning of Large Language Models: Kahneman-Tversky vs. Direct Preference Optimization [49.88778604259453]
We evaluate Kahneman-Tversky Optimization (KTO) as a fine-tuning method for large language models (LLMs) in federated learning (FL) settings.
In both its original (KTOO) and redistributed (KTOR) configurations, KTO consistently outperforms DPO across all benchmarks.
These findings establish KTO as a robust and scalable fine-tuning method for FL, motivating its adoption for privacy-preserving, decentralized, and heterogeneous environments.
arXiv Detail & Related papers (2025-02-20T01:44:21Z)
- Demystifying Domain-adaptive Post-training for Financial LLMs [79.581577578952]
FINDAP is a systematic and fine-grained investigation into domain-adaptive post-training of large language models (LLMs).
Our approach consists of four key components: FinCap, FinRec, FinTrain and FinEval.
The resulting model, Llama-Fin, achieves state-of-the-art performance across a wide range of financial tasks.
arXiv Detail & Related papers (2025-01-09T04:26:15Z)
- Joint Resource Optimization, Computation Offloading and Resource Slicing for Multi-Edge Traffic-Cognitive Networks [0.0]
This paper investigates a multi-agent system where both the platform and the edge servers (ESs) are self-interested entities.
We propose a novel Stackelberg game-based framework to model interactions between stakeholders and solve the optimization problem.
We further design a decentralized solution leveraging neural network optimization and a privacy-preserving information exchange protocol.
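For readers unfamiliar with the game-theoretic setup: in a Stackelberg game, a leader commits to a strategy and followers best-respond. The toy sketch below uses a grid search over a scalar price purely for illustration; the paper's actual framework replaces this with neural-network optimization and a privacy-preserving information exchange protocol.

```python
# Toy Stackelberg (leader-follower) interaction, not the paper's method.
import numpy as np

def follower_demand(price, willingness):
    # Follower best response: buy up to willingness-to-pay, never below zero.
    return np.maximum(willingness - price, 0.0)

def leader_best_price(willingness, prices=np.linspace(0.1, 2.0, 100)):
    # Leader anticipates the followers' best responses and picks the price
    # that maximizes its own revenue.
    revenues = [p * follower_demand(p, willingness).sum() for p in prices]
    return prices[int(np.argmax(revenues))]
```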
arXiv Detail & Related papers (2024-11-26T11:51:10Z)
- $α$-DPO: Adaptive Reward Margin is What Direct Preference Optimization Needs [45.46582930202524]
$α$-DPO is an adaptive preference optimization algorithm for large language models.
It balances the policy model and the reference model to achieve personalized reward margins.
It consistently outperforms DPO and SimPO across various model settings.
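The one-line summaries gloss over the loss itself. For intuition, here is a hedged PyTorch sketch of a DPO-style pairwise loss with a per-example margin term, which is where an adaptive reward margin like $α$-DPO's would plug in; the exact $α$-DPO objective differs, so consult the paper.

```python
# Illustrative DPO-style loss with a margin slot (not the exact α-DPO loss).
import torch
import torch.nn.functional as F

def dpo_loss_with_margin(logp_w, logp_l, ref_logp_w, ref_logp_l,
                         beta=0.1, margin=0.0):
    """logp_*: policy log-probs of the chosen (w) / rejected (l) responses;
    ref_logp_*: the same under the frozen reference model.
    `margin` is where an adaptive, per-example reward margin would enter."""
    ratio_w = logp_w - ref_logp_w
    ratio_l = logp_l - ref_logp_l
    logits = beta * (ratio_w - ratio_l) - margin
    return -F.logsigmoid(logits).mean()
```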
arXiv Detail & Related papers (2024-10-14T04:29:57Z)
- End-to-End Cost-Effective Incentive Recommendation under Budget Constraint with Uplift Modeling [12.160403526724476]
We propose a novel End-to-End Cost-Effective Incentive Recommendation (E3IR) model under budget constraints.
Specifically, our method consists of two modules, i.e., an uplift prediction module and a differentiable allocation module.
Our E3IR improves allocation performance compared to existing two-stage approaches.
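As a rough illustration of what a "differentiable allocation module" can look like, the sketch below relaxes the discrete incentive choice with a softmax and penalizes expected budget overrun via a Lagrangian term. This is a generic pattern under assumed shapes and names, not E3IR's actual module.

```python
# Generic differentiable budget-constrained allocation loss (illustrative).
import torch

def allocation_loss(uplift, cost, budget, temperature=0.5, lam=1.0):
    """uplift: [n_users, n_levels] predicted incremental value per incentive
    level; cost: [n_levels] cost of each level. The softmax keeps the
    discrete choice differentiable; a Lagrangian term penalizes overspend."""
    probs = torch.softmax(uplift / temperature, dim=-1)
    expected_value = (probs * uplift).sum()
    expected_cost = (probs * cost).sum()
    # Minimize negative value plus a penalty on expected budget violation.
    return -(expected_value - lam * torch.relu(expected_cost - budget))
```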
arXiv Detail & Related papers (2024-08-21T13:48:00Z)
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on the equivalence of common baseline-correction methods in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
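The key object here is an unbiased off-policy estimator whose variance depends on a baseline choice. A minimal sketch of a baseline-corrected inverse-propensity-scoring (IPS) estimator follows; the constant baseline is an illustrative stand-in for the paper's closed-form variance-optimal choice.

```python
# Baseline-corrected IPS estimator sketch (illustrative baseline choice).
import numpy as np

def baseline_corrected_ips(rewards, target_probs, logging_probs, baseline):
    weights = target_probs / logging_probs  # importance weights
    # The weights have expectation 1 under the logging policy, so adding
    # `baseline` back keeps the estimator unbiased while a well-chosen
    # baseline reduces its variance.
    return np.mean(weights * (rewards - baseline)) + baseline
```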
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Benchmarking PtO and PnO Methods in the Predictive Combinatorial Optimization Regime [59.27851754647913]
Predictive combinatorial optimization precisely models many real-world applications, including energy cost-aware scheduling and budget allocation in advertising.
We develop a modular framework to benchmark 11 existing PtO/PnO methods on 8 problems, including a new industrial dataset for advertising.
Our study shows that PnO approaches outperform PtO on 7 out of 8 benchmarks, but no single PnO design choice works best across all of them.
arXiv Detail & Related papers (2023-11-13T13:19:34Z)
- ZooPFL: Exploring Black-box Foundation Models for Personalized Federated Learning [95.64041188351393]
This paper addresses the twin challenges of limited resources and personalization.
We propose a method named ZOOPFL that uses Zeroth-Order Optimization for Personalized Federated Learning.
To reduce the computation costs and enhance personalization, we propose input surgery to incorporate an auto-encoder with low-dimensional and client-specific embeddings.
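Zeroth-order optimization is the backbone trick here: gradients of a black-box model are estimated from function evaluations alone. A standard two-point estimator is sketched below for illustration; ZOOPFL's input surgery and federated machinery are omitted.

```python
# Two-point zeroth-order gradient estimate (generic, not ZOOPFL's full method).
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=10, rng=None):
    """Estimate the gradient of black-box f at x from function values only."""
    rng = rng or np.random.default_rng()
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)  # random probe direction
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / n_samples
```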
arXiv Detail & Related papers (2023-10-08T12:26:13Z)
- Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning [42.303733194571905]
We seek to find and automate an optimal credit card limit adjustment policy by employing reinforcement learning techniques.
Our research establishes a conceptual structure for applying the reinforcement learning framework to credit limit adjustment.
arXiv Detail & Related papers (2023-06-27T16:10:36Z)
- Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees that hold in the absence of the two aforementioned assumptions.
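Primal-dual templates of this kind alternate a Lagrangian-greedy primal step with a dual update on the constraint multiplier. The toy loop below illustrates the pattern for a single budget constraint; the paper's contribution, swapping in weakly adaptive regret minimizers for the fixed-step updates used here, is not reproduced.

```python
# Bare-bones primal-dual loop for online decisions under a budget constraint.
def primal_dual_bidding(values, costs, budget, eta=0.05):
    lam, spent, chosen = 0.0, 0.0, []
    per_round = budget / len(values)
    for v, c in zip(values, costs):
        # Primal step: accept if the Lagrangian-adjusted value is positive.
        take = (v - lam * c > 0) and (spent + c <= budget)
        if take:
            spent += c
            chosen.append((v, c))
        # Dual step: gradient ascent on the budget multiplier.
        lam = max(0.0, lam + eta * ((c if take else 0.0) - per_round))
    return chosen, spent
```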
arXiv Detail & Related papers (2023-02-02T16:30:33Z)
- LBCF: A Large-Scale Budget-Constrained Causal Forest Algorithm [11.82503645248441]
Selecting the right amount of incentive (i.e., treatment) for each user under budget constraints is an important research problem.
We propose a novel tree-based treatment selection technique under budget constraints, called Large-Scale Budget-Constrained Causal Forest (LBCF) algorithm.
We deploy our approach in a real-world scenario on a large-scale video platform, where the platform gives away bonuses in order to increase users' campaign engagement duration.
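Once per-user uplift estimates are available (from a causal forest, in LBCF's case), the budget-constrained selection itself can be as simple as a greedy knapsack on estimated uplift per unit cost, as in this illustrative sketch; LBCF's tree construction and exact selection rule are not shown.

```python
# Greedy budget-constrained treatment selection by uplift-per-cost (sketch).
import numpy as np

def select_treatments(uplift, cost, budget):
    order = np.argsort(-uplift / cost)  # best estimated ROI first
    selected, spent = [], 0.0
    for i in order:
        if uplift[i] > 0 and spent + cost[i] <= budget:
            selected.append(i)
            spent += cost[i]
    return selected
```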
arXiv Detail & Related papers (2022-01-29T13:21:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.