A Bayesian approach to multi-task learning with network lasso
- URL: http://arxiv.org/abs/2110.09040v1
- Date: Mon, 18 Oct 2021 06:25:38 GMT
- Title: A Bayesian approach to multi-task learning with network lasso
- Authors: Kaito Shimamura, Shuichi Kawano
- Abstract summary: We propose a Bayesian approach to solve multi-task learning problems by network lasso.
The effectiveness of the proposed method is shown in a simulation study and a real data analysis.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network lasso is a method for solving a multi-task learning problem through
the regularized maximum likelihood method. A characteristic of network lasso is
that it fits a separate model for each sample. The relationships among the models
are represented by relational coefficients. A crucial issue in network lasso is
to provide appropriate values for these relational coefficients. In this paper,
we propose a Bayesian approach to solve multi-task learning problems by network
lasso. This approach allows us to objectively determine the relational
coefficients by Bayesian estimation. The effectiveness of the proposed method
is shown in a simulation study and a real data analysis.
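For reference, network lasso (Hallac, Leskovec, and Boyd, 2015) solves a problem of the following form, where the edge weights w_{jk} play the role of the relational coefficients (a sketch in generic notation, not necessarily the paper's exact formulation):

\[
\min_{\theta_1, \dots, \theta_n} \; \sum_{i=1}^{n} \ell_i(\theta_i) \;+\; \lambda \sum_{(j,k) \in \mathcal{E}} w_{jk} \, \lVert \theta_j - \theta_k \rVert_2
\]

The group-lasso penalty on parameter differences encourages neighboring samples to share a model; the proposed approach estimates the w_{jk} within a Bayesian framework rather than fixing them by hand.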
Related papers
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation [80.47072100963017]
We introduce a novel, low-compute algorithm, Model Merging with Amortized Pareto Front (MAP).
MAP efficiently identifies a set of scaling coefficients for merging multiple models, reflecting the trade-offs involved.
We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
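A minimal sketch of the weight-space merging step that the scaling coefficients control; the models, shapes, and coefficient values here are hypothetical, and MAP's actual contribution (amortizing the Pareto front via a quadratic approximation) is not shown:

```python
import numpy as np

def merge_models(param_sets, coeffs):
    """Linearly combine flattened parameter vectors: theta = sum_i c_i * theta_i."""
    merged = np.zeros_like(param_sets[0])
    for theta, c in zip(param_sets, coeffs):
        merged += c * theta
    return merged

# Two hypothetical task-specific models and one trade-off point.
theta_a = np.random.randn(10)  # parameters fine-tuned on task A
theta_b = np.random.randn(10)  # parameters fine-tuned on task B
merged = merge_models([theta_a, theta_b], coeffs=[0.6, 0.4])
```

Sweeping the coefficients traces out the task trade-offs that the amortized Pareto front summarizes.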
arXiv Detail & Related papers (2024-06-11T17:55:25Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
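One compact way to write that latent-variable view, with the chain of thought z marginalized out (notation ours):

\[
p(y \mid x) \;=\; \sum_{z} p(z \mid x) \, p(y \mid x, z)
\]

Amortized inference then amounts to training a sampler for the intractable posterior p(z \mid x, y).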
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Online Learning of Network Bottlenecks via Minimax Paths [6.316693022958221]
We study bottleneck identification in networks via extracting minimax paths.
We then devise an alternative problem formulation which approximates the original objective.
We experimentally evaluate the performance of Thompson Sampling with the approximate formulation on real-world directed and undirected networks.
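For background, a minimax (bottleneck) path minimizes the largest edge weight encountered along the path. A minimal offline sketch with a dict-of-adjacency-lists graph (the paper's setting is online, with edge weights learned via Thompson Sampling):

```python
import heapq

def minimax_path_cost(graph, s, t):
    """Return the smallest possible maximum edge weight over all s-t paths."""
    best = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        bottleneck, u = heapq.heappop(heap)
        if u == t:
            return bottleneck
        if bottleneck > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            cand = max(bottleneck, w)  # path cost = largest edge seen so far
            if cand < best.get(v, float("inf")):
                best[v] = cand
                heapq.heappush(heap, (cand, v))
    return float("inf")

g = {"s": [("a", 3.0), ("b", 7.0)], "a": [("t", 5.0)], "b": [("t", 1.0)]}
print(minimax_path_cost(g, "s", "t"))  # 5.0, via s -> a -> t
```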
arXiv Detail & Related papers (2021-09-17T11:11:50Z)
- Probabilistic task modelling for meta-learning [44.072592379328036]
We propose a generative probabilistic model for collections of tasks used in meta-learning.
The proposed model combines variational auto-encoding and latent Dirichlet allocation to model each task as a mixture of Gaussian distributions in an embedding space.
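In this view, each task induces a distribution over embedded observations of roughly the following form, with task-specific mixture proportions in the spirit of latent Dirichlet allocation (notation ours):

\[
p(h \mid \text{task}) \;=\; \sum_{k=1}^{K} \pi_k \, \mathcal{N}(h \mid \mu_k, \Sigma_k)
\]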
arXiv Detail & Related papers (2021-06-09T04:34:12Z)
- Network Estimation by Mixing: Adaptivity and More [2.3478438171452014]
We propose a mixing strategy that combines arbitrary candidate models to improve on their individual performances.
The proposed method is computationally efficient and almost tuning-free.
We show that the proposed method performs as well as the oracle estimate when the true model is among the individual candidates.
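Generically, such a mixture forms a convex combination of the candidate estimates; the notation below is ours and gives only the generic shape, not necessarily the paper's estimator:

\[
\hat{P} \;=\; \sum_{m=1}^{M} w_m \hat{P}_m, \qquad w_m \ge 0, \quad \sum_{m=1}^{M} w_m = 1
\]

where each \hat{P}_m is a candidate estimate (e.g., of the network's edge-probability matrix) and the weights are chosen from the data.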
arXiv Detail & Related papers (2021-06-05T05:17:04Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- Simultaneous Perturbation Stochastic Approximation for Few-Shot Learning [0.5801044612920815]
We propose a prototypical-like few-shot learning approach based on the prototypical networks method.
The results of experiments on the benchmark dataset demonstrate that the proposed method is superior to the original prototypical networks.
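For context, the standard prototypical-networks classification step that such approaches build on looks as follows; the embeddings and labels are hypothetical, and the paper's contribution (SPSA-style training, per the title) is not shown:

```python
import numpy as np

def prototypes(support_emb, support_labels, n_classes):
    """Each class prototype is the mean of its embedded support examples."""
    return np.stack([support_emb[support_labels == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query_emb, protos):
    """Assign each query to the class with the nearest prototype."""
    d = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

support = np.random.randn(6, 4)        # 6 embedded support points
labels = np.array([0, 0, 0, 1, 1, 1])  # 2-way, 3-shot episode
preds = classify(np.random.randn(2, 4), prototypes(support, labels, 2))
```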
arXiv Detail & Related papers (2020-06-09T09:47:58Z)
- Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings.
We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity.
We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
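For context, the IBP prior admits a standard stick-breaking construction, which is what lets the number of active weight factors grow with the data (the generic construction, not the paper's full model):

\[
\nu_k \sim \mathrm{Beta}(\alpha, 1), \qquad \pi_k = \prod_{i=1}^{k} \nu_i, \qquad z_{nk} \sim \mathrm{Bernoulli}(\pi_k)
\]

The feature probabilities \pi_k decay with k, and the concentration \alpha controls how many factors the data can activate.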
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
- Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
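The model-based side typically corresponds to a MAP objective of roughly this form for a low-resolution image y with blur kernel k, downscaling factor s, and noise level \sigma (a sketch of the standard degradation model, not the paper's exact notation):

\[
\min_{x} \; \frac{1}{2\sigma^2} \big\lVert y - (x \otimes k)\!\downarrow_s \big\rVert^2 \;+\; \lambda \, \Phi(x)
\]

Unfolding then alternates an analytic data-consistency step with a learned prior step, which is what allows a single trained model to cover different kernels, noise levels, and scale factors.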
arXiv Detail & Related papers (2020-03-23T17:55:42Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
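In log-domain (max-sum) form, the message update that the BP-Layer truncates and embeds looks like this (standard notation, not the paper's exact parameterization):

\[
m_{i \to j}(x_j) \;=\; \max_{x_i} \Big( \theta_i(x_i) + \theta_{ij}(x_i, x_j) + \sum_{k \in \mathcal{N}(i) \setminus \{j\}} m_{k \to i}(x_i) \Big)
\]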
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- A Tutorial on Learning With Bayesian Networks [8.98526174345299]
A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest.
A Bayesian network can be used to learn causal relationships.
It can also be used to gain understanding about a problem domain and to predict the consequences of intervention.
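Concretely, a Bayesian network over variables x_1, \dots, x_n encodes the factorization

\[
p(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} p\big(x_i \mid \mathrm{pa}(x_i)\big)
\]

where \mathrm{pa}(x_i) denotes the parents of x_i in the directed acyclic graph.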
arXiv Detail & Related papers (2020-02-01T20:03:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.