Differential Voting: Loss Functions For Axiomatically Diverse Aggregation of Heterogeneous Preferences
- URL: http://arxiv.org/abs/2601.18824v1
- Date: Sun, 25 Jan 2026 03:59:51 GMT
- Title: Differential Voting: Loss Functions For Axiomatically Diverse Aggregation of Heterogeneous Preferences
- Authors: Zhiyu An, Duaa Nakshbandi, Wan Du
- Abstract summary: Reinforcement learning from human feedback can be viewed as a form of voting, where the aggregation mechanism is defined by the loss function. We introduce Differential Voting, a framework that constructs instance-wise, differentiable loss functions whose population-level optima provably correspond to classical voting rules. Our analysis shows how design choices in loss geometry, such as margin sensitivity and boundary concentration, directly translate into normative aggregation behavior.
- Score: 6.3240435869587515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning from human feedback (RLHF) implicitly aggregates heterogeneous human preferences into a single utility function, even though the underlying utilities of the participants are in practice diverse. Hence, RLHF can be viewed as a form of voting, where the aggregation mechanism is defined by the loss function. Although Arrow's Impossibility Theorem suggests that different mechanisms satisfy different sets of desirable axioms, most existing methods rely on a single aggregation principle, typically the Bradley-Terry-Luce (BTL) model, which corresponds to Borda count voting. This restricts the axiomatic properties of the learned reward and obscures the normative assumptions embedded in optimization. In this work, we introduce Differential Voting, a unifying framework that constructs instance-wise, differentiable loss functions whose population-level optima provably correspond to distinct classical voting rules. We develop differentiable surrogates for majority-based aggregation (BTL), Copeland, and Kemeny rules, and formally analyze their calibration properties, gradient fields, and limiting behavior as smoothing parameters vanish. For each loss, we establish consistency with the corresponding social choice rule and characterize the axioms it satisfies or violates. Our analysis shows how design choices in loss geometry, such as margin sensitivity and boundary concentration, directly translate into normative aggregation behavior. Differential Voting makes preference aggregation an explicit and controllable design choice in RLHF, enabling principled trade-offs between axiomatic guarantees and optimization stability. Code to reproduce our experiments is open-sourced.
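To make the voting analogy in the abstract concrete, here is a minimal Python sketch. It does not reproduce the paper's losses or code. It only shows the standard BTL pairwise negative log-likelihood and, on a hypothetical three-candidate profile invented for illustration, that the classical Borda and Copeland rules can select different winners from the same votes. All function names and the profile are assumptions.

```python
import math


def btl_loss(r_winner, r_loser):
    """Standard BTL negative log-likelihood for one comparison:
    -log sigmoid(r_winner - r_loser), in a numerically stable form."""
    margin = r_winner - r_loser
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def pairwise_counts(profile, candidates):
    """profile: list of (ranking, num_voters).
    Returns wins[a][b] = number of voters ranking a above b."""
    wins = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    for ranking, count in profile:
        for i, a in enumerate(ranking):
            for b in ranking[i + 1:]:
                wins[a][b] += count
    return wins


def borda_scores(wins):
    """Borda score = total pairwise wins summed over all opponents."""
    return {a: sum(row.values()) for a, row in wins.items()}


def copeland_scores(wins):
    """Copeland score = (# head-to-head majorities won) - (# lost)."""
    return {
        a: sum((row[b] > wins[b][a]) - (row[b] < wins[b][a]) for b in row)
        for a, row in wins.items()
    }


# Hypothetical profile (an assumption, not data from the paper).
candidates = ("A", "B", "C")
profile = [
    (("A", "B", "C"), 55),
    (("B", "C", "A"), 40),
    (("C", "B", "A"), 5),
]

wins = pairwise_counts(profile, candidates)
borda = borda_scores(wins)
copeland = copeland_scores(wins)
borda_winner = max(borda, key=borda.get)
copeland_winner = max(copeland, key=copeland.get)
print("Borda:", borda, "winner:", borda_winner)
print("Copeland:", copeland, "winner:", copeland_winner)
```

In this profile A wins every head-to-head contest by a narrow majority, so a majority-count rule such as Copeland favors A, while B's large margin over C gives it the highest Borda total. This is one way to see the margin sensitivity that the abstract links to the choice of loss.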
Related papers
- How Sampling Shapes LLM Alignment: From One-Shot Optima to Iterative Dynamics [65.67654005892469]
We show that proper instance-dependent sampling can yield stronger ranking guarantees, while skewed on-policy sampling can induce excessive concentration under structured preferences. We then analyze iterative alignment dynamics in which the learned policy feeds back into future sampling and reference policies. Our theoretical insights extend to Direct Preference Optimization, indicating the phenomena we captured are common to a broader class of preference-alignment methods.
arXiv Detail & Related papers (2026-02-12T17:11:08Z)
- Optimistic Feasible Search for Closed-Loop Fair Threshold Decision-Making [0.0]
We study online learning of a one-dimensional threshold policy from bandit feedback. We propose Optimistic Feasible Search (OFS), a simple grid-based method that maintains confidence bounds for reward and constraint residuals.
arXiv Detail & Related papers (2025-12-26T10:44:40Z)
- Reliable Optimization Under Noise in Quantum Variational Algorithms [0.05219568203653522]
We show that Variational Quantum Eigensolver is severely challenged by finite-shot sampling noise. We identify adaptive metaheuristics as the most effective and resilient strategies.
arXiv Detail & Related papers (2025-11-11T14:21:43Z)
- On the Theory of Conditional Feature Alignment for Unsupervised Domain-Adaptive Counting [27.44207520673983]
Object counting models suffer when deployed across domains with differing density variety. We propose a theoretical framework of conditional feature alignment and provide a straightforward implementation.
arXiv Detail & Related papers (2025-06-20T16:37:48Z)
- Fair Resource Allocation in Weakly Coupled Markov Decision Processes [3.824858358548714]
We consider fair resource allocation in sequential decision-making environments modeled as weakly coupled Markov decision processes. We adopt a fairness definition using the generalized Gini function instead of the traditional utilitarian (total-sum) objective.
arXiv Detail & Related papers (2024-11-14T20:40:55Z)
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences. To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model. Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Obtaining Explainable Classification Models using Distributionally Robust Optimization [12.511155426574563]
We study generalized linear models constructed using sets of feature value rules.
An inherent trade-off exists between rule set sparsity and its prediction accuracy.
We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors.
arXiv Detail & Related papers (2023-11-03T15:45:34Z)
- Post-hoc Bias Scoring Is Optimal For Fair Classification [12.897626117694317]
We introduce a novel instance-level measure of bias, which we call the bias score; the modification rule is a simple linear rule on top of a finite number of bias scores.
In the case of DP and EOp constraints, the modification rule is thresholding a single bias score, while in the case of EO constraints we are required to fit a linear modification rule with 2 parameters.
arXiv Detail & Related papers (2023-10-09T13:54:08Z)
- Fairness via Adversarial Attribute Neighbourhood Robust Learning [49.93775302674591]
We propose a principled Robust Adversarial Attribute Neighbourhood (RAAN) loss to debias the classification head.
arXiv Detail & Related papers (2022-10-12T23:39:28Z)
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- GroupifyVAE: from Group-based Definition to VAE-based Unsupervised Representation Disentanglement [91.9003001845855]
VAE-based unsupervised disentanglement cannot be achieved without introducing additional inductive biases.
We address VAE-based unsupervised disentanglement by leveraging the constraints derived from the Group Theory based definition as the non-probabilistic inductive bias.
We train 1800 models covering the most prominent VAE-based models on five datasets to verify the effectiveness of our method.
arXiv Detail & Related papers (2021-02-20T09:49:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.