Bayesian Meta-Prior Learning Using Empirical Bayes
- URL: http://arxiv.org/abs/2002.01129v3
- Date: Mon, 12 Jul 2021 21:18:32 GMT
- Title: Bayesian Meta-Prior Learning Using Empirical Bayes
- Authors: Sareh Nabi, Houssam Nassif, Joseph Hong, Hamed Mamani, Guido Imbens
- Abstract summary: We propose a hierarchical Empirical Bayes approach that addresses the absence of informative priors, and the inability to control parameter learning rates.
Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features.
Our findings are promising, as optimizing over sparse data is often a challenge.
- Score: 3.666114237131823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adding domain knowledge to a learning system is known to improve results. In
multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior.
On the other hand, various model parameters can have different learning rates
in real-world problems, especially with skewed data. Two often-faced challenges
in Operations Management and Management Science applications are the absence of
informative priors, and the inability to control parameter learning rates. In
this study, we propose a hierarchical Empirical Bayes approach that addresses
both challenges, and that can generalize to any Bayesian framework. Our method
learns empirical meta-priors from the data itself and uses them to decouple the
learning rates of first-order and second-order features (or any other given
feature grouping) in a Generalized Linear Model. As the first-order features
are likely to have a more pronounced effect on the outcome, focusing on
learning first-order weights first is likely to improve performance and
convergence time. Our Empirical Bayes method clamps features in each group
together and uses the deployed model's observed data to empirically compute a
hierarchical prior in hindsight. We report theoretical results for the
unbiasedness, strong consistency, and optimal frequentist cumulative regret
properties of our meta-prior variance estimator. We apply our method to a
standard supervised learning optimization problem, as well as an online
combinatorial optimization problem in a contextual bandit setting implemented
in an Amazon production system. Both during simulations and live experiments,
our method shows marked improvements, especially in cases of small traffic. Our
findings are promising, as optimizing over sparse data is often a challenge.
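To make the idea of a per-group empirical prior concrete, here is a minimal sketch of a method-of-moments Empirical Bayes estimate. It is not the estimator from the paper, whose exact form the abstract does not give. It assumes that each fitted weight in a group is a noisy observation of a true weight drawn from a shared Gaussian prior, so the spread of the fitted weights minus their average sampling variance estimates the prior variance. The function names, the `groups` labels, and the truncation at zero are all assumptions made for illustration.

```python
import numpy as np


def group_prior_variance(weights, sampling_vars):
    """Method-of-moments estimate of one group's shared prior variance.

    Assumed model (illustrative, not taken from the paper):
        true_w_i ~ N(mu, tau^2),   fitted_w_i ~ N(true_w_i, s_i^2)
    so Var(fitted_w_i) = tau^2 + s_i^2 and
        tau^2 ~= sample_variance(fitted_w) - mean(s_i^2).
    """
    weights = np.asarray(weights, dtype=float)
    sampling_vars = np.asarray(sampling_vars, dtype=float)
    if weights.size < 2:
        raise ValueError("need at least two weights per group")
    total_var = weights.var(ddof=1)  # unbiased sample variance of fitted weights
    # Truncate at zero: a variance cannot be negative.
    return max(0.0, total_var - sampling_vars.mean())


def meta_priors(weights, sampling_vars, groups):
    """Return {group_label: (prior_mean, prior_variance)} for each feature group."""
    weights = np.asarray(weights, dtype=float)
    sampling_vars = np.asarray(sampling_vars, dtype=float)
    groups = np.asarray(groups)
    priors = {}
    for g in np.unique(groups):
        mask = groups == g
        priors[str(g)] = (
            float(weights[mask].mean()),
            group_prior_variance(weights[mask], sampling_vars[mask]),
        )
    return priors
```

Under this sketch, a group whose fitted weights vary more than their sampling noise explains receives a wide prior and is free to move with the data, while a group whose spread is mostly noise receives a prior variance near zero and is shrunk strongly toward the group mean. That is one simple way to give first-order and second-order feature groups different effective learning rates.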
Related papers
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- Last Layer Marginal Likelihood for Invariance Learning [12.00078928875924]
We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions.
We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer.
arXiv Detail & Related papers (2021-06-14T15:40:51Z)
- Deep Optimized Priors for 3D Shape Modeling and Reconstruction [38.79018852887249]
We introduce a new learning framework for 3D modeling and reconstruction.
We show that the proposed strategy effectively breaks the barriers constrained by the pre-trained priors.
arXiv Detail & Related papers (2020-12-14T03:56:31Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- A Primal-Dual Subgradient Approach for Fair Meta Learning [23.65344558042896]
Few-shot meta-learning is well known for its fast adaptation and accurate generalization to unseen tasks.
We propose a Primal-Dual Fair Meta-learning framework, namely PDFM, which learns to train fair machine learning models using only a few examples.
arXiv Detail & Related papers (2020-09-26T19:47:38Z)
- A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems [6.810856082577402]
The exponential growth of volume, variety and velocity of data is raising the need for investigations of automated or semi-automated ways to extract useful patterns from the data.
It requires deep expert knowledge and extensive computational resources to find the most appropriate mapping of learning methods for a given problem.
There is a need for an intelligent recommendation engine that can advise what is the best learning algorithm for a dataset.
arXiv Detail & Related papers (2020-07-17T14:14:37Z)
- Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach [22.958342743597044]
We investigate the possibilities of utilizing deep learning for cardinality estimation of similarity selection.
We propose a novel and generic method that can be applied to any data type and distance function.
arXiv Detail & Related papers (2020-02-15T20:22:51Z)