Related papers: C2:Cross learning module enhanced decision transformer with Constraint-aware loss for auto-bidding

URL: http://arxiv.org/abs/2601.20257v2
Date: Thu, 29 Jan 2026 10:08:33 GMT
Title: C2:Cross learning module enhanced decision transformer with Constraint-aware loss for auto-bidding
Authors: Jinren Ding, Xuejian Xu, Shen Jiang, Zhitong Hao, Jinhui Yang, Peng Jiang,
Abstract summary: Decision Transformer (DT) shows promise for generative auto-bidding by capturing temporal dependencies. DT suffers from insufficient cross-correlation modeling among state, action, and return-to-go sequences. We propose C2, a novel framework enhancing DT with two core innovations.
Score: 9.446373834962895
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Decision Transformer (DT) shows promise for generative auto-bidding by capturing temporal dependencies, but suffers from two critical limitations: insufficient cross-correlation modeling among state, action, and return-to-go (RTG) sequences, and indiscriminate learning of optimal/suboptimal behaviors. To address these, we propose C2, a novel framework enhancing DT with two core innovations: (1) a Cross Learning Block (CLB) via cross-attention to strengthen inter-sequence correlation modeling; (2) a Constraint-aware Loss (CL) incorporating budget and Cost-Per-Acquisition (CPA) constraints for selective learning of optimal trajectories. Extensive offline evaluations on the AuctionNet dataset demonstrate consistent performance gains (up to 3.2% over the state-of-the-art method) across diverse budget settings; ablation studies verify the complementary synergy of CLB and CL, confirming C2's superiority in auto-bidding. The code for reproducing our results is available at: https://github.com/Dingjinren/C2.
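
The abstract names cross-attention as the mechanism behind the Cross Learning Block. As a rough illustration of that general technique only (not the authors' CLB, whose details are not given here), the sketch below applies plain scaled dot-product cross-attention so that each state embedding attends over the RTG embeddings. The function names, the toy embedding values, and the choice of state as query and RTG as key/value are all assumptions made for this example. Learned projections and the causal mask a Decision Transformer would need are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    Each query vector (from one sequence) attends over the keys/values
    of a different sequence. Returns (output, attention_weights).
    """
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights.append(softmax(scores))
    dim_v = len(values[0])
    output = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim_v)]
        for row in weights
    ]
    return output, weights


# Hypothetical toy embeddings: 3 timesteps, 2 dimensions each.
state_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rtg_emb = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

# State tokens query the RTG sequence (an assumed pairing for illustration).
state_given_rtg, attn = cross_attention(state_emb, rtg_emb, rtg_emb)
```

In this sketch, each row of `attn` is a probability distribution over RTG timesteps, and `state_given_rtg` has one mixed RTG vector per state timestep. The same function could be reused for other sequence pairings (for example, action attending over state).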

Related papers

Constraint-Aware Generative Auto-bidding via Pareto-Prioritized Regret Optimization [8.514099612407062]
PRO-Bid is a constraint-aware generative auto-bidding framework based on two synergistic mechanisms. It achieves superior constraint satisfaction and value acquisition compared to state-of-the-art baselines.
arXiv Detail & Related papers (2026-02-09T04:41:30Z)
QASA: Quality-Guided K-Adaptive Slot Attention for Unsupervised Object-Centric Learning [80.82392186401354]
Slot Attention is an approach that binds different objects in a scene to a set of "slots". Previous K-adaptive methods do not explicitly constrain slot-binding quality. We propose Quality-Guided K-Adaptive Slot Attention (QASA).
arXiv Detail & Related papers (2026-01-19T10:42:07Z)
Randomized Neural Network with Adaptive Forward Regularization for Online Task-free Class Incremental Learning [16.323995111105884]
We propose a neural network (NN) with forward regularization (-F) to resist forgetting and enhance learning performance. We derive the algorithm of the ensemble deep random vector functional link network (edRVFL) with adjustable forward regularization (-kF). edRVFL-kF generates one-pass closed-form incremental updates and variable learning rates, effectively avoiding past replay and catastrophic forgetting. We improve it to the plug-and-play edRVFL-kF-Bayes, enabling all hard ks in multiple sub-learners to be self-adaptively determined.
arXiv Detail & Related papers (2025-10-24T11:50:13Z)
Hierarchical Self-Supervised Representation Learning for Depression Detection from Speech [51.14752758616364]
Speech-based depression detection (SDD) is a promising, non-invasive alternative to traditional clinical assessments. We propose HAREN-CTC, a novel architecture that integrates multi-layer SSL features using cross-attention within a multitask learning framework. The model achieves state-of-the-art macro F1-scores of 0.81 on DAIC-WOZ and 0.82 on MODMA, outperforming prior methods across both evaluation scenarios.
arXiv Detail & Related papers (2025-10-05T09:32:12Z)
Boundary-to-Region Supervision for Offline Safe Reinforcement Learning [56.150983204962735]
Boundary-to-Region (B2R) is a framework that enables asymmetric conditioning through cost signal realignment. B2R redefines CTG as a boundary constraint under a fixed safety budget, unifying the cost distribution of all feasible trajectories. Experimental results show that B2R satisfies safety constraints in 35 out of 38 safety-critical tasks.
arXiv Detail & Related papers (2025-09-30T03:38:20Z)
PT$^2$-LLM: Post-Training Ternarization for Large Language Models [52.4629647715623]
Large Language Models (LLMs) have shown impressive capabilities across diverse tasks, but their large memory and compute demands hinder deployment. We propose PT$^2$-LLM, a post-training ternarization framework tailored for LLMs. At its core is an Asymmetric Ternary Quantizer equipped with a two-stage refinement pipeline.
arXiv Detail & Related papers (2025-09-27T03:01:48Z)
Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories [58.988535279557546]
We introduce Sycophancy Mitigation through Adaptive Reasoning Trajectories (SMART). We show that SMART significantly reduces sycophantic behavior while preserving strong performance on out-of-distribution inputs.
arXiv Detail & Related papers (2025-09-20T17:09:14Z)
Generative Bid Shading in Real-Time Bidding Advertising [7.7746704524695485]
This paper introduces Generative Bid Shading (GBS) as an end-to-end generative model. It incorporates an autoregressive approach to generate ratios by capturing stepwise residual reward models. It has been deployed on the Meit platform serving billions of bid requests daily.
arXiv Detail & Related papers (2025-08-06T03:34:49Z)
Causal Disentanglement and Cross-Modal Alignment for Enhanced Few-Shot Learning [11.752632557524969]
Causal CLIP Adapter (CCA) is a novel framework that explicitly disentangles visual features extracted from CLIP. Our method consistently outperforms state-of-the-art approaches in terms of few-shot performance and robustness to distributional shifts.
arXiv Detail & Related papers (2025-08-05T05:30:42Z)
Fast Second-Order Online Kernel Learning through Incremental Matrix Sketching and Decomposition [44.61147231796296]
Online Kernel Learning (OKL) has attracted considerable research interest due to its promising predictive performance in streaming environments. Existing second-order OKL approaches suffer from at least quadratic time complexity with respect to the pre-set budget. We propose FORKS, a fast incremental matrix sketching and decomposition approach tailored for second-order OKL.
arXiv Detail & Related papers (2024-10-15T02:07:48Z)
DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for CTR Prediction [61.68415731896613]
Click-Through Rate (CTR) prediction is a pivotal task in product and content recommendation. We propose a model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction.
arXiv Detail & Related papers (2023-05-03T12:34:45Z)
Visual Alignment Constraint for Continuous Sign Language Recognition [74.26707067455837]
Vision-based Continuous Sign Language Recognition aims to recognize unsegmented gestures from image sequences. In this work, we revisit the overfitting problem in recent CTC-based CSLR works and attribute it to the insufficient training of the feature extractor. We propose a Visual Alignment Constraint (VAC) to enhance the feature extractor with more alignment supervision.
arXiv Detail & Related papers (2021-04-06T07:24:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.