Train 'n Trade: Foundations of Parameter Markets
- URL: http://arxiv.org/abs/2312.04740v1
- Date: Thu, 7 Dec 2023 22:50:24 GMT
- Title: Train 'n Trade: Foundations of Parameter Markets
- Authors: Tzu-Heng Huang, Harit Vishwakarma, Frederic Sala
- Abstract summary: We propose a framework containing the infrastructure necessary for market operations to take place.
We show that it is possible to mutually gain by using the market, even in competitive settings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Organizations typically train large models individually. This is costly and
time-consuming, particularly for large-scale foundation models. Such vertical
production is known to be suboptimal. Inspired by this economic insight, we ask
whether it is possible to leverage others' expertise by trading the constituent
parts in models, i.e., sets of weights, as if they were market commodities.
While recent advances in aligning and interpolating models suggest that doing
so may be possible, a number of fundamental questions must be answered to
create viable parameter markets. In this work, we address these basic
questions, propose a framework containing the infrastructure necessary for
market operations to take place, study strategies for exchanging parameters,
and offer means for agents to monetize parameters. Excitingly, compared to
agents who train siloed models from scratch, we show that it is possible to
mutually gain by using the market, even in competitive settings. This suggests
that the notion of parameter markets may be a useful paradigm for improving
large-scale model training in the future.
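The core claim — that agents can mutually gain by exchanging parameters rather than training in silos — can be illustrated with a toy sketch (this is an illustrative stand-in, not the paper's actual framework): two agents fit the same linear model on separate small, noisy datasets, then "trade" by averaging their weight vectors. By convexity of the squared loss, the averaged model's held-out error can never exceed that of the worse agent, and with independent noise it typically beats both.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth linear model both agents are trying to learn.
w_true = rng.normal(size=5)

def fit(X, y):
    # Ordinary least squares: each agent trains its own model in isolation.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Two agents, each with its own small, noisy dataset ("siloed" training).
X_a, X_b = rng.normal(size=(8, 5)), rng.normal(size=(8, 5))
y_a = X_a @ w_true + 0.5 * rng.normal(size=8)
y_b = X_b @ w_true + 0.5 * rng.normal(size=8)

w_a, w_b = fit(X_a, y_a), fit(X_b, y_b)

# One round of "trading": the agents exchange weights and interpolate,
# a crude stand-in for the alignment/interpolation step in the abstract.
w_traded = 0.5 * (w_a + w_b)

# Held-out data to evaluate whether the trade helped.
X_test = rng.normal(size=(1000, 5))
y_test = X_test @ w_true

print("agent A siloed:", mse(w_a, X_test, y_test))
print("agent B siloed:", mse(w_b, X_test, y_test))
print("after trade:  ", mse(w_traded, X_test, y_test))
```

In this toy setting the gain comes purely from variance reduction; the paper's framework additionally has to handle alignment of differently trained models, trading strategies, and pricing, none of which this sketch models.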
Related papers
- TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters [102.1116808722299]
We introduce TokenFormer, a natively scalable architecture for scaling Transformers: by treating model parameters as tokens, it replaces all the linear projections in Transformers with attention between input tokens and parameter tokens.
Our model scales from 124M to 1.4B parameters by incrementally adding new key-value parameter pairs.
arXiv Detail & Related papers (2024-10-30T16:19:00Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- A Network Simulation of OTC Markets with Multiple Agents [3.8944986367855963]
We present a novel approach to simulating an over-the-counter (OTC) financial market in which trades are intermediated solely by market makers.
We show that our network-based model can lend insights into the effect of market structure on price action.
arXiv Detail & Related papers (2024-05-03T20:45:00Z)
- Optimal Automated Market Makers: Differentiable Economics and Strong Duality [22.943723387429678]
Optimal market making in the presence of multiple goods is not well understood.
We show that finding an optimal market maker is dual to an optimal transport problem.
We present conjectures of optimal mechanisms in settings which show further complex behavior.
arXiv Detail & Related papers (2024-02-14T12:27:54Z)
- Electricity Price Forecasting in the Irish Balancing Market [0.0]
This work applies to the Irish balancing market a variety of price prediction techniques proven successful in the widely studied day-ahead market.
We compare statistical, machine learning, and deep learning models using a framework that investigates the impact of different training sizes.
An extensive numerical study shows that well-performing models in the day-ahead market do not perform well in the balancing one.
arXiv Detail & Related papers (2024-02-09T15:18:00Z)
- An Auction-based Marketplace for Model Trading in Federated Learning [54.79736037670377]
Federated learning (FL) is increasingly recognized for its efficacy in training models using locally distributed data.
We frame FL as a marketplace of models, where clients act as both buyers and sellers.
We propose an auction-based solution to ensure proper pricing based on performance gain.
arXiv Detail & Related papers (2024-02-02T07:25:53Z)
- Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective [106.92016199403042]
We empirically investigate knowledge transfer from larger to smaller models through a parametric perspective.
We employ sensitivity-based techniques to extract and align knowledge-specific parameters between different large language models.
Our findings highlight the critical factors contributing to the process of parametric knowledge transfer.
arXiv Detail & Related papers (2023-10-17T17:58:34Z)
- HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE [113.47287249524008]
It is still an open question to build a factor model that can conduct stock prediction in an online and adaptive setting.
We propose the first deep learning based online and adaptive factor model, HireVAE, at the core of which is a hierarchical latent space that embeds the relationship between the market situation and stock-wise latent factors.
Across four commonly used real stock market benchmarks, the proposed HireVAE demonstrates superior performance in terms of active returns over previous methods.
arXiv Detail & Related papers (2023-06-05T12:58:13Z)
- A Scalable Inference Method For Large Dynamic Economic Systems [19.757929782329892]
We present a novel Variational Bayesian Inference approach to incorporate a time-varying parameter auto-regressive model.
We apply our model to a large blockchain dataset containing prices and transactions of individual actors, analyzing transactional flows and price movements.
We further improve the simple state-space modelling by introducing non-linearities in the forward model with the help of machine learning architectures.
arXiv Detail & Related papers (2021-10-27T10:52:17Z)
- Exploring Sparse Expert Models and Beyond [51.90860155810848]
Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computation cost.
We propose a simple method called expert prototyping that splits experts into different prototypes and applies $k$ top-$1$ routing.
This strategy improves model quality while maintaining constant computational cost, and our further exploration of extremely large-scale models shows that it is more effective for training larger models.
arXiv Detail & Related papers (2021-05-31T16:12:44Z)
- Deep Probabilistic Modelling of Price Movements for High-Frequency Trading [0.0]
We propose a deep recurrent architecture for the probabilistic modelling of high-frequency market prices.
The resulting deep mixture models simultaneously address several practical challenges important in the development of automated high-frequency trading strategies.
We show that our model outperforms the benchmark models in both a metric-based test and in a simulated trading scenario.
arXiv Detail & Related papers (2020-03-31T19:25:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.