Optimal Information Combining for Multi-Agent Systems Using Adaptive Bias Learning
- URL: http://arxiv.org/abs/2510.25793v1
- Date: Tue, 28 Oct 2025 21:52:33 GMT
- Title: Optimal Information Combining for Multi-Agent Systems Using Adaptive Bias Learning
- Authors: Siavash M. Alamouti, Fay Arjomandi
- Abstract summary: Current approaches either ignore these biases, leading to suboptimal decisions, or require expensive calibration procedures that are often infeasible in practice. This paper addresses the fundamental question: when can we learn and correct for these unknown biases to recover near-optimal performance? We develop a theoretical framework that decomposes biases into learnable systematic components and irreducible components. We show that systems with high learnability ratios can recover significant performance, while those with low learnability show minimal benefit.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern multi-agent systems, ranging from sensor networks monitoring critical infrastructure to crowdsourcing platforms aggregating human intelligence, can suffer significant performance degradation due to systematic biases that vary with environmental conditions. Current approaches either ignore these biases, leading to suboptimal decisions, or require expensive calibration procedures that are often infeasible in practice. This performance gap has real consequences: inaccurate environmental monitoring, unreliable financial predictions, and flawed aggregation of human judgments. This paper addresses the fundamental question: when can we learn and correct for these unknown biases to recover near-optimal performance, and when is such learning futile? We develop a theoretical framework that decomposes biases into learnable systematic components and irreducible stochastic components, introducing the concept of the learnability ratio as the fraction of bias variance predictable from observable covariates. This ratio determines whether bias learning is worthwhile for a given system. We prove that the achievable performance improvement is fundamentally bounded by this learnability ratio, providing system designers with quantitative guidance on when to invest in bias learning versus simpler approaches. We present the Adaptive Bias Learning and Optimal Combining (ABLOC) algorithm, which iteratively learns bias-correcting transformations while optimizing combination weights through closed-form solutions, guaranteeing convergence to these theoretical bounds. Experimental validation demonstrates that systems with high learnability ratios can recover significant performance (we achieved 40%-70% of theoretical maximum improvement in our examples), while those with low learnability show minimal benefit, validating our diagnostic criteria for practical deployment decisions.
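The abstract's two key ingredients, the learnability ratio (fraction of bias variance predictable from observable covariates) and closed-form combining weights, can be illustrated with a toy sketch. This is not the paper's ABLOC implementation: the linear regression model, the inverse-variance combining rule, and the use of ground truth for calibration are all simplifying assumptions made here for illustration.

```python
# Toy sketch: learnability ratio + bias correction + closed-form combining.
# Assumptions (not from the paper): linear bias model, inverse-variance
# weights, and access to ground truth for calibration.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: agents observe a true signal with agent-specific biases
# that partly depend on an observable covariate (e.g. temperature).
n_obs = 2000
truth = rng.normal(0.0, 1.0, n_obs)
covariate = rng.normal(0.0, 1.0, n_obs)

def agent_reports(slope, noise_sd):
    systematic = slope * covariate            # learnable bias component
    irreducible = rng.normal(0, 0.3, n_obs)   # unpredictable component
    return truth + systematic + irreducible + rng.normal(0, noise_sd, n_obs)

reports = [agent_reports(s, sd) for s, sd in [(0.8, 0.2), (-0.5, 0.4), (0.2, 0.3)]]

def learnability_ratio(report):
    """Fraction of bias variance explained by the covariate (an R^2)."""
    bias = report - truth
    slope, intercept = np.polyfit(covariate, bias, 1)
    residual = bias - (slope * covariate + intercept)
    return 1.0 - residual.var() / bias.var()

ratios = [learnability_ratio(r) for r in reports]

# Correct each agent by subtracting its fitted systematic bias, then combine
# the corrected reports with closed-form inverse-variance weights (a standard
# optimal-combining rule; the paper's exact weights may differ).
corrected, inv_vars = [], []
for r in reports:
    slope, intercept = np.polyfit(covariate, r - truth, 1)
    c = r - (slope * covariate + intercept)
    corrected.append(c)
    inv_vars.append(1.0 / (c - truth).var())

weights = np.array(inv_vars) / sum(inv_vars)
combined = sum(w * c for w, c in zip(weights, corrected))

naive = np.mean(reports, axis=0)
print("learnability ratios:", [round(x, 2) for x in ratios])
print("naive MSE:   ", round(float(np.mean((naive - truth) ** 2)), 4))
print("combined MSE:", round(float(np.mean((combined - truth) ** 2)), 4))
```

With high learnability ratios (most bias variance explained by the covariate), the bias-corrected weighted combination should achieve a lower MSE than naive averaging, matching the paper's qualitative claim; in a real deployment the calibration would use held-out labeled data rather than the unavailable ground truth.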
Related papers
- QuAIL: Quality-Aware Inertial Learning for Robust Training under Data Corruption [7.630511612007769]
We present QuAIL, a quality-informed training mechanism that incorporates feature reliability priors directly into the learning process. We show that QuAIL consistently improves average performance over neural baselines under both random and value-dependent corruption.
arXiv Detail & Related papers (2026-02-03T16:06:30Z) - Your Group-Relative Advantage Is Biased [74.57406620907797]
Group-based learning methods rely on group-relative advantage estimation to avoid learned critics. In this work, we uncover a fundamental issue of group-based RL: the group-relative advantage estimator is inherently biased relative to the true (expected) advantage. We propose History-Aware Adaptive Difficulty Weighting (HA-DW), an adaptive reweighting scheme that adjusts advantage estimates based on an evolving difficulty anchor and training dynamics.
arXiv Detail & Related papers (2026-01-13T13:03:15Z) - Online Matching via Reinforcement Learning: An Expert Policy Orchestration Strategy [5.913458789333235]
We propose a reinforcement learning (RL) approach that learns to orchestrate a set of such expert policies. We establish both expectation and high-probability regret guarantees and derive a novel finite-time bias bound for temporal-difference learning. Our results highlight how structured, adaptive learning can improve the modeling and management of complex resource allocation and decision-making processes.
arXiv Detail & Related papers (2025-10-07T23:26:16Z) - Optimizers Qualitatively Alter Solutions And We Should Leverage This [62.662640460717476]
Deep Neural Networks (DNNs) trained with methods that use only local information, such as SGD, cannot be guaranteed to converge to a unique global minimum of the loss. We argue that the community should aim at understanding the biases of already existing methods, as well as aim to build new DNNs with the explicit intent of inducing certain properties of the solution.
arXiv Detail & Related papers (2025-07-16T13:33:31Z) - Mitigating Bias in Facial Recognition Systems: Centroid Fairness Loss Optimization [9.537960917804993]
Societal demand for fair AI systems has put pressure on the research community to develop predictive models that meet new fairness criteria. In particular, the variability of the errors made by certain Facial Recognition (FR) systems across specific segments of the population compromises their deployment. We propose a novel post-processing approach to improve the fairness of pre-trained FR models by optimizing a regression loss which acts on centroid-based scores.
arXiv Detail & Related papers (2025-04-27T22:17:44Z) - Can Uncertainty Quantification Improve Learned Index Benefit Estimation? [11.25347279227943]
Index tuning is crucial for optimizing database performance by selecting optimal indexes based on the workload. Traditional methods relying on what-if tools often suffer from inefficiency and inaccuracy. We propose Beauty, the first uncertainty-aware framework that enhances learning-based models with uncertainty quantification.
arXiv Detail & Related papers (2024-10-23T10:23:53Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets [74.04426767769785]
We show that specific types of biases may be preferable, depending on the scenario.
We design a novel reinforcement learning algorithm, Balanced Q-learning, in which the target is modified to be a convex combination of a pessimistic and an optimistic term.
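The convex-combination target described above can be sketched in a few lines. This is only an illustrative toy in the spirit of Balanced Q-learning: the fixed `beta`, and using the max and min over next-state action values as the optimistic and pessimistic terms, are assumptions made here, whereas the paper adapts the combination during training.

```python
# Toy sketch of a convex-combination bootstrap target. The fixed beta and the
# max/min choice of optimistic/pessimistic terms are illustrative assumptions.
import numpy as np

def balanced_target(reward, gamma, next_q_values, beta=0.5):
    """Blend an optimistic (max) and a pessimistic (min) bootstrap term."""
    optimistic = np.max(next_q_values)
    pessimistic = np.min(next_q_values)
    return reward + gamma * (beta * optimistic + (1 - beta) * pessimistic)

t = balanced_target(reward=1.0, gamma=0.9, next_q_values=np.array([0.2, 1.0, -0.5]))
print(round(t, 3))  # 1.0 + 0.9 * (0.5 * 1.0 + 0.5 * (-0.5)) = 1.225
```

Setting `beta=1.0` recovers the standard (optimistic) Q-learning target, while `beta=0.0` gives a fully pessimistic one, which is the bias trade-off the paper tunes.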
arXiv Detail & Related papers (2021-11-03T07:30:19Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - Recursive Experts: An Efficient Optimal Mixture of Learning Systems in Dynamic Environments [0.0]
Sequential learning systems are used in a wide variety of problems, from decision making to optimization. The goal is to reach an objective by exploiting the temporal relation inherent to nature's feedback (state).
We propose an efficient optimal mixture framework for general sequential learning systems.
arXiv Detail & Related papers (2020-09-19T15:02:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.