Aggregation of Pareto optimal models
- URL: http://arxiv.org/abs/2112.04161v1
- Date: Wed, 8 Dec 2021 08:21:15 GMT
- Title: Aggregation of Pareto optimal models
- Authors: Hamed Hamze Bajgiran and Houman Owhadi
- Abstract summary: In statistical decision theory, a model is said to be Pareto optimal if no other model carries less risk for at least one state of nature while presenting no more risk for others.
This paper presents an answer in four logical steps.
We show that all rational/consistent aggregation rules must follow a generalization of hierarchical Bayesian modeling.
- Score: 0.8122270502556374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In statistical decision theory, a model is said to be Pareto optimal (or
admissible) if no other model carries less risk for at least one state of
nature while presenting no more risk for others. How can you rationally
aggregate/combine a finite set of Pareto optimal models while preserving Pareto
efficiency? This question is nontrivial because weighted model averaging does
not, in general, preserve Pareto efficiency. This paper presents an answer in
four logical steps: (1) A rational aggregation rule should preserve Pareto
efficiency. (2) Due to the complete class theorem, Pareto optimal models must be
Bayesian, i.e., they minimize a risk where the true state of nature is averaged
with respect to some prior. Therefore each Pareto optimal model can be
associated with a prior, and Pareto efficiency can be maintained by aggregating
Pareto optimal models through their priors. (3) A prior can be interpreted as a
preference ranking over models: prior $\pi$ prefers model A over model B if the
average risk of A is lower than the average risk of B. (4) A
rational/consistent aggregation rule should preserve this preference ranking:
If both priors $\pi$ and $\pi'$ prefer model A over model B, then the prior
obtained by aggregating $\pi$ and $\pi'$ must also prefer A over B. Under these
four steps, we show that all rational/consistent aggregation rules are as
follows: give each individual Pareto optimal model a weight; introduce a weak
order/ranking over the set of Pareto optimal models; aggregate a finite set of
models S as the model associated with the prior obtained as the weighted
average of the priors of the highest-ranked models in S. This result shows that
all rational/consistent aggregation rules must follow a generalization of
hierarchical Bayesian modeling. Following our main result, we present
applications to kernel smoothing, time-depreciating models, and voting
mechanisms.
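To make the aggregation rule concrete, here is a minimal numerical sketch in an assumed finite setting (two states of nature, three Pareto optimal models; the risk values, priors, weights, and weak order are hypothetical illustrations, not the paper's construction):

```python
import numpy as np

# risk[i, j] = risk of model i when the true state of nature is j (hypothetical)
risk = np.array([[0.2, 0.8],
                 [0.5, 0.5],
                 [0.9, 0.1]])

# By the complete class theorem, each Pareto optimal model is Bayes for some
# prior over the states; associate one (hypothetical) prior with each model.
priors = np.array([[0.9, 0.1],
                   [0.5, 0.5],
                   [0.1, 0.9]])

def bayes_risk(prior, model_risk):
    """Average risk of a model when the state follows `prior`; prior pi
    prefers A over B iff bayes_risk(pi, A) < bayes_risk(pi, B)."""
    return float(prior @ model_risk)

def aggregate(subset, weights, tier):
    """Aggregate a finite set S of models per the stated rule: keep only the
    highest-ranked models in S (weak order `tier`), then return the weighted
    average of their priors."""
    top = max(tier[i] for i in subset)
    best = [i for i in subset if tier[i] == top]
    w = np.array([weights[i] for i in best])
    return (w[:, None] * priors[best]).sum(axis=0) / w.sum()

weights = {0: 1.0, 1: 2.0, 2: 1.0}   # hypothetical per-model weights
tier = {0: 1, 1: 1, 2: 0}            # weak order: models 0 and 1 share the top tier

pi = aggregate([0, 1, 2], weights, tier)
# The aggregated model is the one that is Bayes-optimal under the aggregated prior.
best_model = min(range(len(risk)), key=lambda i: bayes_risk(pi, risk[i]))
print("aggregated prior:", pi, "-> model:", best_model)
```

Because the output is again a Bayes model for some prior, the abstract's step (2) ties it back to admissibility, which is how Pareto efficiency survives the aggregation.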
Related papers
- General Preference Modeling with Preference Representations for Aligning Language Models [51.14207112118503]
We introduce preference representation learning, an approach that embeds responses into a latent space to capture intricate preference structures efficiently.
We also propose preference score-based General Preference Optimization (GPO), which generalizes reward-based reinforcement learning from human feedback.
Our method may enhance the alignment of foundation models with nuanced human values.
arXiv Detail & Related papers (2024-10-03T04:22:55Z) - Soft Preference Optimization: Aligning Language Models to Expert Distributions [40.84391304598521]
SPO is a method for aligning generative models, such as Large Language Models (LLMs), with human preferences.
SPO integrates preference loss with a regularization term across the model's entire output distribution.
We showcase SPO's methodology, its theoretical foundation, and its comparative advantages in simplicity, computational efficiency, and alignment precision.
arXiv Detail & Related papers (2024-04-30T19:48:55Z) - Accelerating Ensemble Error Bar Prediction with Single Models Fits [0.5249805590164902]
An ensemble of N models is approximately N times as computationally demanding as a single model at inference time.
In this work, we explore fitting a single model to predicted ensemble error bar data, which allows us to estimate uncertainties without the need for a full ensemble.
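A minimal sketch of that idea, assuming a scikit-learn-style regression setting (the dataset, ensemble, and surrogate model choices here are illustrative, not the paper's):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=500)

# Step 1: fit an ensemble and take the spread of member predictions
# as the error bar (uncertainty) at each training point.
ensemble = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
member_preds = np.stack([tree.predict(X) for tree in ensemble.estimators_])
error_bars = member_preds.std(axis=0)

# Step 2: fit a single surrogate model to the ensemble's error bars, so
# uncertainty at new points costs one forward pass instead of N.
uq_model = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, error_bars)

X_new = np.array([[0.5], [2.7]])
print("estimated error bars:", uq_model.predict(X_new))
```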
arXiv Detail & Related papers (2024-04-15T16:10:27Z) - A Nested Weighted Tchebycheff Multi-Objective Bayesian Optimization Approach for Flexibility of Unknown Utopia Estimation in Expensive Black-box Design Problems [0.0]
In existing work, a weighted Tchebycheff MOBO approach has been demonstrated that attempts to estimate the unknown utopia point when formulating the acquisition function.
We propose a nested weighted Tchebycheff MOBO framework where we build a regression model selection procedure from an ensemble of models.
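For context, a minimal sketch of the standard weighted Tchebycheff scalarization that such approaches build on (the paper's nested model-selection layer is not reproduced; the numbers are illustrative):

```python
import numpy as np

def weighted_tchebycheff(objectives, weights, utopia):
    """Standard weighted Tchebycheff scalarization: max_i w_i * |f_i - z_i|,
    where z is the utopia point (assumed here to be an estimate, since it is
    unknown in expensive black-box problems)."""
    f = np.asarray(objectives, dtype=float)
    z = np.asarray(utopia, dtype=float)
    return float(np.max(np.asarray(weights) * np.abs(f - z)))

# Two candidate designs scored against an estimated utopia point (0, 0):
print(weighted_tchebycheff([1.2, 0.4], weights=[0.5, 0.5], utopia=[0.0, 0.0]))  # 0.6
print(weighted_tchebycheff([0.8, 0.7], weights=[0.5, 0.5], utopia=[0.0, 0.0]))  # 0.4
```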
arXiv Detail & Related papers (2021-10-16T00:44:06Z) - Rationales for Sequential Predictions [117.93025782838123]
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain.
We consider model explanations through rationales, subsets of context that can explain individual model predictions.
We propose an efficient greedy algorithm to approximate this objective.
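A toy sketch of a greedy selection of this kind (the scoring function and threshold are illustrative stand-ins, not the paper's combinatorial objective):

```python
def greedy_rationale(tokens, score, threshold=0.9):
    """Greedily add the context token that most increases `score` (read as:
    the probability the model keeps its original prediction) until the
    threshold is reached."""
    chosen = set()
    while score(chosen) < threshold and len(chosen) < len(tokens):
        candidates = set(range(len(tokens))) - chosen
        best = max(candidates, key=lambda i: score(chosen | {i}))
        chosen.add(best)
    return sorted(chosen)

# Toy score: token 2 carries most of the evidence, tokens 0 and 4 a little.
weights = {0: 0.1, 2: 0.7, 4: 0.15}
score = lambda subset: sum(weights.get(i, 0.0) for i in subset)
print(greedy_rationale(["the", "cat", "sat", "on", "mat"], score))  # [0, 2, 4]
```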
arXiv Detail & Related papers (2021-09-14T01:25:15Z) - PSD Representations for Effective Probability Models [117.35298398434628]
We show that a recently proposed class of positive semi-definite (PSD) models for non-negative functions is particularly suited to building effective probability models.
We characterize both approximation and generalization capabilities of PSD models, showing that they enjoy strong theoretical guarantees.
Our results open the way to applications of PSD models to density estimation, decision theory and inference.
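A small sketch of the mechanism that makes PSD models non-negative by construction, f(x) = phi(x)^T M phi(x) with M positive semi-definite (the feature map and dimensions here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def phi(x, centers, bandwidth=1.0):
    """Illustrative Gaussian feature map."""
    return np.exp(-((x - centers) ** 2) / (2 * bandwidth ** 2))

centers = np.linspace(-2.0, 2.0, 5)
B = rng.normal(size=(5, 5))
M = B @ B.T  # M = B B^T is PSD by construction

def f(x):
    """PSD model: phi(x)^T M phi(x) >= 0 for every x, so it can safely
    parameterize non-negative objects such as probability densities."""
    v = phi(x, centers)
    return float(v @ M @ v)

print(all(f(x) >= 0 for x in np.linspace(-5, 5, 101)))  # True
```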
arXiv Detail & Related papers (2021-06-30T15:13:39Z) - A bandit-learning approach to multifidelity approximation [7.960229223744695]
Multifidelity approximation is an important technique in scientific computation and simulation.
We introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates.
arXiv Detail & Related papers (2021-03-29T05:29:35Z) - On Statistical Efficiency in Learning [37.08000833961712]
We address the challenge of model selection to strike a balance between model fitting and model complexity.
We propose an online algorithm that sequentially expands the model complexity to enhance selection stability and reduce cost.
Experimental studies show that the proposed method has desirable predictive power and significantly less computational cost than some popular methods.
arXiv Detail & Related papers (2020-12-24T16:08:29Z) - On Exploiting Hitting Sets for Model Reconciliation [53.81101846598925]
In human-aware planning, a planning agent may need to explain to a human user why its plan is optimal.
A popular approach to do this is called model reconciliation, where the agent tries to reconcile the differences in its model and the human's model.
We present a logic-based framework for model reconciliation that extends beyond the realm of planning.
arXiv Detail & Related papers (2020-12-16T21:25:53Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - On the Discrepancy between Density Estimation and Sequence Generation [92.70116082182076]
Log-likelihood is highly correlated with BLEU when we consider models within the same family, but we observe no correlation between rankings of models across different families.
arXiv Detail & Related papers (2020-02-17T20:13:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.