Related papers: Self-Augmented Mixture-of-Experts for QoS Prediction

URL: http://arxiv.org/abs/2601.11036v2
Date: Fri, 23 Jan 2026 03:28:20 GMT
Title: Self-Augmented Mixture-of-Experts for QoS Prediction
Authors: Kecheng Cai, Chao Peng, Chenyang Xu, Xia Chen, Yi Wang, Shuo Shi, Qiyuan Liang
Abstract summary: Quality of Service (QoS) prediction is one of the most fundamental problems in service computing. A key challenge in QoS prediction is the inherent sparsity of user-service interactions. We propose a self-augmented strategy that leverages a model's own predictions for iterative refinement.
Score: 9.607159299982559
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Quality of Service (QoS) prediction is one of the most fundamental problems in service computing and personalized recommendation. In the problem, there is a set of users and services, each associated with a set of descriptive features. Interactions between users and services produce feedback values, typically represented as numerical QoS metrics such as response time or availability. Given the observed feedback for a subset of user-service pairs, the goal is to predict the QoS values for the remaining pairs. A key challenge in QoS prediction is the inherent sparsity of user-service interactions, as only a small subset of feedback values is typically observed. To address this, we propose a self-augmented strategy that leverages a model's own predictions for iterative refinement. In particular, we partially mask the predicted values and feed them back into the model to predict again. Building on this idea, we design a self-augmented mixture-of-experts model, where multiple expert networks iteratively and collaboratively estimate QoS values. We find that the iterative augmentation process naturally aligns with the MoE architecture by enabling inter-expert communication: in the second round, each expert receives the first-round predictions and refines its output accordingly. Experiments on benchmark datasets show that our method outperforms existing baselines and achieves competitive results.
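
The mask-and-repredict loop described in the abstract can be illustrated with a small sketch. This is not the authors' model: the two "experts" are toy row-mean and column-mean predictors standing in for expert networks, the gate is a plain equal-weight average, and the names `predict_all`, `self_augment`, and `keep_prob` are hypothetical. It only shows the data flow: predict once, keep a random part of the predictions for unobserved pairs, and predict again from the augmented matrix.

```python
import random

def user_mean_expert(matrix, u, s, fallback):
    """Toy expert (assumption): mean of the known values in user u's row."""
    known = [v for v in matrix[u] if v is not None]
    return sum(known) / len(known) if known else fallback

def service_mean_expert(matrix, u, s, fallback):
    """Toy expert (assumption): mean of the known values in service s's column."""
    known = [row[s] for row in matrix if row[s] is not None]
    return sum(known) / len(known) if known else fallback

EXPERTS = [user_mean_expert, service_mean_expert]

def predict_all(matrix):
    """Blend the experts with equal weights for every user-service pair."""
    known = [v for row in matrix for v in row if v is not None]
    fallback = sum(known) / len(known)  # global mean for empty rows/columns
    return [[sum(e(matrix, u, s, fallback) for e in EXPERTS) / len(EXPERTS)
             for s in range(len(matrix[0]))]
            for u in range(len(matrix))]

def self_augment(observed, keep_prob=0.5, seed=0):
    """Two-round prediction: round 2 sees a partially masked copy of round 1."""
    rng = random.Random(seed)
    first = predict_all(observed)
    # Observed values are always kept; each predicted value for an unobserved
    # pair is kept with probability keep_prob and masked (None) otherwise.
    augmented = [[obs if obs is not None
                  else (pred if rng.random() < keep_prob else None)
                  for obs, pred in zip(obs_row, pred_row)]
                 for obs_row, pred_row in zip(observed, first)]
    second = predict_all(augmented)
    return first, second

# Hypothetical 3-user x 3-service response-time matrix; None marks unobserved pairs.
observed = [[0.3, None, 1.2],
            [None, 0.8, None],
            [0.5, None, None]]
first_round, second_round = self_augment(observed)
```

Because the augmented matrix contains the blended first-round output, each toy expert in the second round sees information that came from the other expert, which is a crude analogue of the inter-expert communication the abstract mentions. Setting `keep_prob=0.0` masks every prediction, so the second round reduces to the first.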

Related papers

P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling [66.55381105691818]
We propose P-GenRM, the first Personalized Generative Reward Model with test-time user-based scaling. P-GenRM transforms preference signals into structured evaluation chains that derive adaptive personas and scoring rubrics. It further clusters users into User Prototypes and introduces a dual-granularity scaling mechanism.
arXiv Detail & Related papers (2026-02-12T16:07:22Z)
Conv4Rec: A 1-by-1 Convolutional AutoEncoder for User Profiling through Joint Analysis of Implicit and Explicit Feedbacks [35.7275102787435]
We introduce a new convolutional AutoEncoder architecture for user modelling and recommendation tasks. Our model is able to learn jointly from both the explicit ratings and the implicit information in the sampling pattern. In experiments on several real-life datasets, we achieve state-of-the-art performance on both the implicit and explicit feedback prediction tasks.
arXiv Detail & Related papers (2025-09-09T08:25:11Z)
Uncovering the Limitations of Query Performance Prediction: Failures, Insights, and Implications for Selective Query Processing [3.463527836552468]
This paper provides a comprehensive evaluation of state-of-the-art QPPs (e.g. NQC, UQC). We use diverse sparse rankers (BM25, DFree without and with query expansion), hybrid or dense rankers (SPLADE and ColBert), and diverse test collections: ROBUST, GOV2, WT10G, and MS MARCO. Results show significant variability in predictors' accuracy, with collections as the main factor and rankers next.
arXiv Detail & Related papers (2025-04-01T18:18:21Z)
Consistency Checks for Language Model Forecasters [54.62507816753479]
We measure the performance of forecasters in terms of the consistency of their predictions on different logically-related questions. We build an automated evaluation system that generates a set of base questions, instantiates consistency checks from these questions, elicits predictions of the forecaster, and measures the consistency of the predictions.
arXiv Detail & Related papers (2024-12-24T16:51:35Z)
SureMap: Simultaneous Mean Estimation for Single-Task and Multi-Task Disaggregated Evaluation [75.56845750400116]
Disaggregated evaluation -- estimation of performance of a machine learning model on different subpopulations -- is a core task when assessing performance and group-fairness of AI systems. We develop SureMap, which has high estimation accuracy for both multi-task and single-task disaggregated evaluations of blackbox models. Our method combines maximum a posteriori (MAP) estimation using a well-chosen prior together with cross-validation-free tuning via Stein's unbiased risk estimate (SURE).
arXiv Detail & Related papers (2024-11-14T17:53:35Z)
Satellite Streaming Video QoE Prediction: A Real-World Subjective Database and Network-Level Prediction Models [59.061552498630874]
We introduce the LIVE-Viasat Real-World Satellite QoE Database. This database consists of 179 videos recorded from real-world streaming services affected by various authentic distortion patterns. We demonstrate the usefulness of this unique new resource by evaluating the efficacy of QoE-prediction models on it. We also created a new model that maps the network parameters to predicted human perception scores, which can be used by ISPs to optimize the video streaming quality of their networks.
arXiv Detail & Related papers (2024-10-17T18:22:50Z)
GACL: Graph Attention Collaborative Learning for Temporal QoS Prediction [5.040979636805073]
We propose a novel Graph Attention Collaborative Learning (GACL) framework for temporal QoS prediction. It builds on a dynamic user-service graph to comprehensively model historical interactions. Experiments on the WS-DREAM dataset demonstrate that GACL significantly outperforms state-of-the-art methods for temporal QoS prediction.
arXiv Detail & Related papers (2024-08-20T05:38:47Z)
Query Performance Prediction using Relevance Judgments Generated by Large Language Models [53.97064615557883]
We propose a new query performance prediction (QPP) framework using automatically generated relevance judgments (QPP-GenRE). QPP-GenRE decomposes QPP into independent subtasks of predicting relevance of each item in a ranked list to a given query. We predict an item's relevance by using open-source large language models (LLMs) to ensure scientific relevance.
arXiv Detail & Related papers (2024-04-01T09:33:05Z)
ARRQP: Anomaly Resilient Real-time QoS Prediction Framework with Graph Convolution [0.16317061277456998]
We introduce a real-time QoS prediction framework (called ARRQP) with a specific emphasis on improving resilience to anomalies in the data. ARRQP integrates both contextual information and collaborative insights, enabling a comprehensive understanding of user-service interactions. Results on the benchmark WS-DREAM dataset demonstrate the framework's effectiveness in achieving accurate and timely predictions.
arXiv Detail & Related papers (2023-09-22T04:37:51Z)
Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters. We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
FES: A Fast Efficient Scalable QoS Prediction Framework [0.9176056742068814]
One of the primary objectives of designing a QoS prediction algorithm is to achieve satisfactory prediction accuracy. The algorithm also has to be fast in terms of prediction time so that it can be integrated into a real-time recommendation system. Existing QoS prediction algorithms often compromise on one goal while ensuring the others.
arXiv Detail & Related papers (2021-03-12T19:28:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.