MATT-CTR: Unleashing a Model-Agnostic Test-Time Paradigm for CTR Prediction with Confidence-Guided Inference Paths
- URL: http://arxiv.org/abs/2510.08932v1
- Date: Fri, 10 Oct 2025 02:22:55 GMT
- Title: MATT-CTR: Unleashing a Model-Agnostic Test-Time Paradigm for CTR Prediction with Confidence-Guided Inference Paths
- Authors: Moyu Zhang, Yun Chen, Yujun Jin, Jinxin Hu, Yu Zhang, Xiaoyi Zeng
- Abstract summary: We propose a Model-Agnostic Test-Time paradigm (MATT) to unlock the predictive potential of trained CTR models. To quantify the confidence of feature combinations, we introduce a hierarchical probabilistic hashing method. We generate instance-specific inference paths through iterative sampling and aggregate the prediction scores from multiple paths to conduct robust predictions.
- Score: 9.542597285477683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, a growing body of research has focused on either optimizing CTR model architectures to better model feature interactions or refining training objectives to aid parameter learning, thereby achieving better predictive performance. However, previous efforts have primarily focused on the training phase, largely neglecting opportunities for optimization during the inference phase. Infrequently occurring feature combinations, in particular, can degrade prediction performance, leading to unreliable or low-confidence outputs. To unlock the predictive potential of trained CTR models, we propose a Model-Agnostic Test-Time paradigm (MATT), which leverages the confidence scores of feature combinations to guide the generation of multiple inference paths, thereby mitigating the influence of low-confidence features on the final prediction. Specifically, to quantify the confidence of feature combinations, we introduce a hierarchical probabilistic hashing method to estimate the occurrence frequencies of feature combinations at various orders, which serve as their corresponding confidence scores. Then, using the confidence scores as sampling probabilities, we generate multiple instance-specific inference paths through iterative sampling and subsequently aggregate the prediction scores from multiple paths to conduct robust predictions. Finally, extensive offline experiments and online A/B tests strongly validate the compatibility and effectiveness of MATT across existing CTR models.
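The pipeline the abstract describes (frequency-based confidence scores, confidence-weighted path sampling, score aggregation) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: a Count-Min sketch stands in for the "hierarchical probabilistic hashing" frequency estimator, and `model_predict` is a hypothetical placeholder for any trained CTR model's scoring function.

```python
# Sketch of MATT-style confidence-guided inference (illustrative assumptions only).
import hashlib
import random

class CountMinSketch:
    """Approximate frequency counter for feature combinations."""
    def __init__(self, depth=4, width=2048):
        self.depth, self.width = depth, width
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item, row):
        digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item):
        for r in range(self.depth):
            self.table[r][self._hash(item, r)] += 1

    def estimate(self, item):
        # Count-Min overestimates, so take the minimum across rows.
        return min(self.table[r][self._hash(item, r)] for r in range(self.depth))

def matt_predict(features, sketch, model_predict, n_paths=8, keep=0.8, seed=0):
    """Sample several inference paths, keeping features with probability
    proportional to their estimated frequency (confidence), then average."""
    rng = random.Random(seed)
    total = sum(sketch.estimate(f) for f in features) or 1
    weights = [sketch.estimate(f) / total for f in features]
    scores = []
    for _ in range(n_paths):
        # Each path is a feature subset biased toward high-confidence features.
        k = max(1, int(keep * len(features)))
        path = rng.choices(features, weights=weights, k=k)
        scores.append(model_predict(sorted(set(path))))
    return sum(scores) / len(scores)
```

In this sketch, rarely seen feature combinations receive low estimated frequencies and are therefore sampled into fewer paths, which matches the stated goal of dampening low-confidence features at inference time.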
Related papers
- Symmetric Aggregation of Conformity Scores for Efficient Uncertainty Sets [6.673032375204486]
We propose SACP (Symmetric Aggregated Conformal Prediction), a novel method that aggregates nonconformity scores from multiple predictors. SACP transforms these scores into e-values and combines them using any symmetric aggregation function. We show that SACP consistently improves efficiency and often outperforms state-of-the-art model aggregation baselines.
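The score-to-e-value aggregation idea in this summary can be sketched as below. The specifics are assumptions for illustration, not SACP's exact construction: conformal p-values are computed from calibration nonconformity scores, the standard calibrator e(p) = 1/(2*sqrt(p)) converts them to e-values, and a plain mean serves as the symmetric aggregation function.

```python
# Illustrative sketch of aggregating conformal e-values across predictors.
import math

def conformal_p_value(cal_scores, test_score):
    """Fraction of calibration scores at least as extreme as the test score."""
    n = len(cal_scores)
    return (1 + sum(s >= test_score for s in cal_scores)) / (n + 1)

def p_to_e(p):
    """A standard p-to-e calibrator (E[e] <= 1 under the null)."""
    return 1.0 / (2.0 * math.sqrt(p))

def aggregated_prediction_set(candidates, scorers, cal_scores_per_model, alpha=0.1):
    """Keep candidate y while the mean e-value across models stays below 1/alpha."""
    kept = []
    for y in candidates:
        e_vals = [p_to_e(conformal_p_value(cal, scorer(y)))
                  for scorer, cal in zip(scorers, cal_scores_per_model)]
        if sum(e_vals) / len(e_vals) < 1.0 / alpha:
            kept.append(y)
    return kept
```

Because e-values remain valid under symmetric combination, the averaged statistic can be thresholded at 1/alpha, which is what makes this kind of multi-predictor aggregation possible without splitting the miscoverage budget.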
arXiv Detail & Related papers (2025-12-07T17:54:07Z) - Infer As You Train: A Symmetric Paradigm of Masked Generative for Click-Through Rate Prediction [9.542597285477683]
Generative models are increasingly being explored in the click-through rate (CTR) prediction field. Existing generative models typically confine the generative paradigm to the training phase. We propose the Symmetric Masked Generative Paradigm for CTR prediction (SGCTR). SGCTR applies its generative capabilities during online inference to iteratively refine the features of input samples.
arXiv Detail & Related papers (2025-11-18T12:07:56Z) - DGenCTR: Towards a Universal Generative Paradigm for Click-Through Rate Prediction via Discrete Diffusion [6.189010741030871]
We propose a two-stage Discrete Diffusion-Based Generative CTR training framework (DGenCTR). This framework comprises a diffusion-based generative pre-training stage and a CTR-targeted supervised fine-tuning stage.
arXiv Detail & Related papers (2025-08-20T07:42:21Z) - Feature Fitted Online Conformal Prediction for Deep Time Series Forecasting Model [0.8287206589886881]
Time series forecasting is critical for many applications, where deep learning-based point prediction models have demonstrated strong performance. Existing confidence interval modeling approaches suffer from key limitations. We propose a lightweight conformal prediction method that provides valid coverage and shorter interval lengths without retraining.
arXiv Detail & Related papers (2025-05-13T01:33:53Z) - FAST: Boosting Uncertainty-based Test Prioritization Methods for Neural Networks via Feature Selection [29.20073572683383]
We propose FAST, a method that boosts existing prioritization methods through guided FeAture SelecTion.
FAST is based on the insight that certain features may introduce noise that affects the model's output confidence.
It quantifies the importance of each feature for the model's correct predictions, and then dynamically prunes the information from the noisy features.
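The mechanism described for FAST (score each feature's contribution to correct predictions, then prune noisy features before using confidence for test prioritization) can be sketched as follows. The importance measure here (mean confidence drop when a feature is zeroed on correctly classified samples) is an assumption for illustration, not the paper's exact formulation; `model` is any callable returning class probabilities.

```python
# Illustrative sketch of feature-importance-guided test prioritization.

def feature_importance(model, X, y):
    """Mean confidence drop when each feature is zeroed out,
    measured on correctly classified validation samples."""
    n_feat = len(X[0])
    imp = [0.0] * n_feat
    correct = [x for x, t in zip(X, y)
               if max(enumerate(model(x)), key=lambda kv: kv[1])[0] == t]
    for x in correct:
        base = max(model(x))
        for j in range(n_feat):
            masked = list(x)
            masked[j] = 0.0  # Remove feature j's information.
            imp[j] += base - max(model(masked))
    return [v / max(1, len(correct)) for v in imp]

def prioritize_tests(model, tests, importance, keep_ratio=0.5):
    """Rank test inputs by uncertainty after pruning the noisiest features."""
    order = sorted(range(len(importance)), key=lambda j: -importance[j])
    kept = set(order[: max(1, int(keep_ratio * len(order)))])
    def uncertainty(x):
        pruned = [v if j in kept else 0.0 for j, v in enumerate(x)]
        return 1.0 - max(model(pruned))
    return sorted(tests, key=uncertainty, reverse=True)
```

The intuition matches the snippet above: features that barely affect correct predictions mostly inject noise into the confidence signal, so masking them yields a cleaner uncertainty ranking for prioritizing test inputs.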
arXiv Detail & Related papers (2024-09-13T18:13:09Z) - From Conformal Predictions to Confidence Regions [1.4272411349249627]
We introduce CCR, which employs a combination of conformal prediction intervals for the model outputs to establish confidence regions for model parameters.
We present coverage guarantees under minimal assumptions on noise and that is valid in finite sample regime.
Our approach is applicable to both split conformal predictions and black-box methodologies including full or cross-conformal approaches.
arXiv Detail & Related papers (2024-05-28T21:33:12Z) - When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting.
Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions.
We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distributions of the entire hierarchy.
arXiv Detail & Related papers (2023-10-17T20:30:16Z) - Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that the structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Collaborative Uncertainty Benefits Multi-Agent Multi-Modal Trajectory Forecasting [61.02295959343446]
This work first proposes a novel concept, collaborative uncertainty (CU), which models the uncertainty resulting from interaction modules. We build a general CU-aware regression framework with an original permutation-equivariant uncertainty estimator to do both tasks of regression and uncertainty estimation. We apply the proposed framework to current SOTA multi-agent trajectory forecasting systems as a plugin module.
arXiv Detail & Related papers (2022-07-11T21:17:41Z) - BERT Loses Patience: Fast and Robust Inference with Early Exit [91.26199404912019]
We propose Patience-based Early Exit as a plug-and-play technique to improve the efficiency and robustness of a pretrained language model.
Our approach improves inference efficiency as it allows the model to make a prediction with fewer layers.
arXiv Detail & Related papers (2020-06-07T13:38:32Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.