Deep Reinforcement Learning for Sequential Combinatorial Auctions
- URL: http://arxiv.org/abs/2407.08022v1
- Date: Wed, 10 Jul 2024 20:00:22 GMT
- Title: Deep Reinforcement Learning for Sequential Combinatorial Auctions
- Authors: Sai Srivatsa Ravindranath, Zhe Feng, Di Wang, Manzil Zaheer, Aranyak Mehta, David C. Parkes
- Abstract summary: Revenue-optimal auction design is a challenging problem with significant theoretical and practical implications.
We propose a new reinforcement learning framework tailored for sequential combinatorial auctions that leverages first-order gradients.
Our approach achieves significant improvement in revenue over both analytical baselines and standard reinforcement learning algorithms.
- Score: 40.89021064082742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Revenue-optimal auction design is a challenging problem with significant theoretical and practical implications. Sequential auction mechanisms, known for their simplicity and strong strategyproofness guarantees, are often limited by theoretical results that are largely existential, except for certain restrictive settings. Although traditional reinforcement learning methods such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) are applicable in this domain, they struggle with computational demands and convergence issues when dealing with large and continuous action spaces. In light of this, and recognizing that transitions can be modeled differentiably in our settings, we propose a new reinforcement learning framework tailored for sequential combinatorial auctions that leverages first-order gradients. Our extensive evaluations show that our approach achieves significant improvement in revenue over both analytical baselines and standard reinforcement learning algorithms. Furthermore, we scale our approach to scenarios involving up to 50 agents and 50 items, demonstrating its applicability in complex, real-world auction settings. As such, this work advances the computational tools available for auction design and contributes to bridging the gap between theoretical results and practical implementations in sequential auction design.
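The practical leverage here is that, with differentiable transitions, expected revenue can be maximized by backpropagating through the auction rollout itself, avoiding the high-variance score-function estimators that PPO and SAC rely on. Below is a minimal sketch of that first-order idea on a toy sequential pricing problem; the `soft_accept` smoothing, the single-unit supply model, and the network shape are illustrative assumptions, not the paper's architecture:

```python
import torch

# Toy sequential pricing rollout with differentiable transitions (illustrative
# setup, not the paper's model): n agents arrive in order, each is quoted a
# price, and a temperature-smoothed acceptance rule keeps revenue
# differentiable in the policy parameters end-to-end.
torch.manual_seed(0)
n_agents, batch = 5, 64

policy = torch.nn.Sequential(            # maps (remaining supply, time) -> price
    torch.nn.Linear(2, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 1), torch.nn.Softplus(),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def soft_accept(value, price, temp=0.1):
    # Smooth surrogate for the indicator 1[value >= price], differentiable in price.
    return torch.sigmoid((value - price) / temp)

for _ in range(500):
    values = torch.rand(batch, n_agents)          # buyer valuations ~ U[0, 1]
    supply = torch.ones(batch, 1)                 # one unit of divisible supply
    revenue = torch.zeros(batch, 1)
    for t in range(n_agents):
        state = torch.cat([supply, torch.full_like(supply, t / n_agents)], dim=1)
        price = policy(state)
        p_buy = soft_accept(values[:, t : t + 1], price) * supply
        revenue = revenue + p_buy * price         # expected payment from agent t
        supply = supply - p_buy                   # differentiable state transition
    loss = -revenue.mean()                        # maximize revenue via first-order gradients
    opt.zero_grad(); loss.backward(); opt.step()
```

The point of the sketch is that every quantity on the path from parameters to revenue stays differentiable, so a single `backward()` call yields exact first-order gradients of the smoothed objective.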
Related papers
- Component-based Sketching for Deep ReLU Nets [55.404661149594375]
We develop a sketching scheme based on deep net components for various tasks.
We transform deep net training into a linear empirical risk minimization problem.
We show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions.
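A hedged illustration of the reduction being described: if the deep components are fixed (here, frozen random ReLU features, an assumption for brevity rather than the paper's constructed sketch), training collapses to a linear empirical risk minimization that can be solved in closed form:

```python
import numpy as np

# Toy version of the "deep training -> linear ERM" reduction. Assumption: the
# deep components are frozen random ReLU features, not the paper's scheme;
# only the linear output layer is fit.
rng = np.random.default_rng(0)
n, d, m = 512, 4, 128                       # samples, input dim, component count

X = rng.normal(size=(n, d))
y = np.sin(X.sum(axis=1))                   # target function to approximate

W = rng.normal(size=(d, m)) / np.sqrt(d)    # frozen component weights
b = rng.normal(size=m)
H = np.maximum(X @ W + b, 0.0)              # fixed ReLU component outputs

# Linear empirical risk minimization: ridge regression over the components.
lam = 1e-3
beta = np.linalg.solve(H.T @ H + lam * np.eye(m), H.T @ y)
print("train MSE:", float(np.mean((H @ beta - y) ** 2)))
```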
arXiv Detail & Related papers (2024-09-21T15:30:43Z)
- Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning [1.0762853848552156]
We implement over 60 different off-policy agents, each integrating established regularization techniques from recent state-of-the-art algorithms.
We tested these agents across 14 diverse tasks from 2 simulation benchmarks, measuring training metrics related to overestimation, overfitting, and plasticity loss.
A simple Soft Actor-Critic agent, appropriately regularized, reliably finds a better-performing policy within the training regime.
arXiv Detail & Related papers (2024-03-01T13:25:10Z)
- Understanding Iterative Combinatorial Auction Designs via Multi-Agent Reinforcement Learning [10.41350502488723]
We investigate whether multi-agent reinforcement learning algorithms can be used to understand iterative auctions.
We find that MARL can indeed benefit auction analysis, but that deploying it effectively is nontrivial.
We illustrate the promise of our resulting approach by using it to evaluate a specific rule change to a clock auction.
arXiv Detail & Related papers (2024-02-29T18:16:13Z)
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
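As a rough sketch of the sparsity mechanism (the paper poses a bi-level problem; this toy version merges the inner fit and the sparsity penalty into a single objective for brevity): a shared encoder feeds per-task linear heads whose weights are driven toward maximal sparsity with an L1 term:

```python
import torch

# Toy stand-in for the bi-level formulation: a shared encoder feeds per-task
# linear heads, and the L1 term pushes the heads toward sparse use of the
# learned representation. All sizes and penalties are illustrative.
torch.manual_seed(0)
enc = torch.nn.Linear(10, 6)                      # shared representation z = enc(x)
heads = torch.nn.Parameter(torch.zeros(3, 6))     # one linear head per task
opt = torch.optim.Adam(list(enc.parameters()) + [heads], lr=1e-2)

x = torch.randn(256, 10)
y = torch.randn(256, 3)                           # three synthetic regression tasks

for _ in range(300):
    pred = enc(x) @ heads.T                       # (256, 3) per-task predictions
    loss = ((pred - y) ** 2).mean() + 1e-2 * heads.abs().sum()  # L1 -> sparse heads
    opt.zero_grad(); loss.backward(); opt.step()
```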
arXiv Detail & Related papers (2022-11-26T21:02:09Z)
- Bayesian Optimization-based Combinatorial Assignment [10.73407470973258]
We study the assignment domain, which includes auctions and course allocation.
The main challenge in this domain is that the bundle space grows exponentially in the number of items.
arXiv Detail & Related papers (2022-08-31T08:47:02Z)
- A Context-Integrated Transformer-Based Neural Network for Auction Design [25.763612577196124]
One of the central problems in auction design is developing an incentive-compatible mechanism that maximizes the auctioneer's expected revenue.
We propose $\mathtt{CITransNet}$, a context-integrated transformer-based neural network for optimal auction design.
We show by extensive experiments that $\mathtt{CITransNet}$ can recover the known optimal solutions in single-item settings, outperform strong baselines in multi-item auctions, and generalize well to cases other than those in training.
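A hypothetical skeleton of the architecture class being described, with layer sizes and the context-injection scheme as guesses rather than CITransNet's published design: each (bidder, item) entry fuses a bid value with public context, self-attention mixes the entries, and allocation/payment heads read out the mechanism:

```python
import torch

# Hypothetical skeleton only; sizes and the way context is injected are
# assumptions, not CITransNet's published architecture.
n_bidders, n_items, d_ctx, d = 3, 2, 8, 32
fuse = torch.nn.Linear(1 + d_ctx, d)              # bid value + context per entry
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2,
)
alloc_head, pay_head = torch.nn.Linear(d, 1), torch.nn.Linear(d, 1)

bids = torch.rand(5, n_bidders * n_items, 1)      # batch of flattened bid matrices
ctx = torch.randn(5, n_bidders * n_items, d_ctx)  # public bidder/item features

h = encoder(fuse(torch.cat([bids, ctx], dim=-1)))
alloc = torch.softmax(alloc_head(h).view(5, n_bidders, n_items), dim=1)  # who wins each item
pay_frac = torch.sigmoid(pay_head(h)).view(5, n_bidders, n_items)        # payment fractions
```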
arXiv Detail & Related papers (2022-01-29T03:47:00Z)
- Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations [1.0609815608017066]
We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations.
Existing inverse reinforcement learning methods come with strong theoretical guarantees, but are computationally expensive.
We introduce a novel bilinear saddle-point framework using Lagrangian duality to bridge the gap between theory and practice.
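The structural template being exploited is gradient descent-ascent on a bilinear saddle point. A generic sketch follows, with arbitrary stand-ins for the paper's occupancy-measure and multiplier variables; iterate averaging is included because plain descent-ascent can cycle on bilinear objectives:

```python
import numpy as np

# Generic gradient descent-ascent on L(x, y) = x^T A y + c^T x - b^T y.
# A, b, c are arbitrary stand-ins for the paper's coupled variables.
rng = np.random.default_rng(1)
A = rng.normal(size=(5, 4))
b, c = rng.normal(size=4), rng.normal(size=5)

x, y = np.zeros(5), np.zeros(4)
x_bar, y_bar = np.zeros(5), np.zeros(4)
eta, T = 0.05, 2000
for _ in range(T):
    gx = A @ y + c                      # gradient of L with respect to x
    gy = A.T @ x - b                    # gradient of L with respect to y
    x, y = x - eta * gx, y + eta * gy   # descend in x, ascend in y
    x_bar += x / T; y_bar += y / T      # averaged iterates converge for bilinear games
```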
arXiv Detail & Related papers (2021-12-28T05:47:24Z)
- Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The vulnerability of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
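One plausible reading of the training loop, with the perturbation budget, step size, noise scale, and inner-step count as placeholder choices: perturbations are sampled by noisy gradient ascent on the loss (Langevin-style) rather than deterministic PGD, and the model is trained on the sampled adversarial examples:

```python
import torch

# Placeholder hyperparameters throughout; this shows the Langevin flavor,
# not the paper's exact sampler.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
eps, step, noise = 0.1, 0.02, 0.01

def langevin_perturb(x, y, k=10):
    delta = torch.zeros_like(x)
    for _ in range(k):
        delta = delta.detach().requires_grad_(True)
        g, = torch.autograd.grad(loss_fn(model(x + delta), y), delta)
        # Signed-gradient ascent (a PGD-style choice) plus Gaussian noise for
        # the Langevin flavor, projected back into the epsilon ball.
        delta = (delta + step * g.sign() + noise * torch.randn_like(delta)).clamp(-eps, eps)
    return delta.detach()

x, y = torch.randn(128, 20), torch.randint(0, 2, (128,))
for _ in range(100):
    adv = x + langevin_perturb(x, y)              # train on sampled adversarial examples
    opt.zero_grad(); loss_fn(model(adv), y).backward(); opt.step()
```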
arXiv Detail & Related papers (2021-10-29T13:30:42Z)
- Off-Policy Imitation Learning from Observations [78.30794935265425]
Learning from Observations (LfO) is a practical reinforcement learning scenario from which many applications can benefit.
We propose a sample-efficient LfO approach that enables off-policy optimization in a principled manner.
Our approach is comparable with state-of-the-art methods on locomotion tasks in terms of both sample-efficiency and performance.
arXiv Detail & Related papers (2021-02-25T21:33:47Z)
- Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it on correlated actions, and combine these critic estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
arXiv Detail & Related papers (2020-02-10T04:23:09Z)
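A sketch of the variance-control idea only, not the paper's exact estimator: the critic outputs Q(s, a) for every discrete action at once, and the policy-weighted mean of those values serves as a baseline for the on-policy gradient:

```python
import torch

# Illustrative variance reduction with an action-value critic; sizes and the
# single-dimensional action space are assumptions for brevity.
torch.manual_seed(0)
n_actions, d_state = 4, 8
policy_net = torch.nn.Linear(d_state, n_actions)
q_net = torch.nn.Linear(d_state, n_actions)       # critic head: Q(s, .) for all actions

def pg_loss(states, actions):
    log_probs = torch.log_softmax(policy_net(states), dim=-1)
    q = q_net(states).detach()                    # critic-estimated action values
    baseline = (log_probs.exp().detach() * q).sum(-1, keepdim=True)  # E_{a~pi} Q(s, a)
    adv = q.gather(1, actions) - baseline         # lower-variance learning signal
    return -(log_probs.gather(1, actions) * adv).mean()

states = torch.randn(32, d_state)
actions = torch.randint(0, n_actions, (32, 1))
pg_loss(states, actions).backward()               # gradients for an on-policy update
```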
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.