Fine-Tuning Games: Bargaining and Adaptation for General-Purpose Models
- URL: http://arxiv.org/abs/2308.04399v2
- Date: Fri, 11 Aug 2023 20:39:23 GMT
- Title: Fine-Tuning Games: Bargaining and Adaptation for General-Purpose Models
- Authors: Benjamin Laufer and Jon Kleinberg and Hoda Heidari
- Abstract summary: Major advances in Machine Learning (ML) and Artificial Intelligence (AI) increasingly take the form of developing and releasing general-purpose models.
This paper offers a model of the fine-tuning process where a Generalist brings the technological product to a certain level of performance, and one or more Domain-specialist(s) adapts it for use in a particular domain.
Both entities are profit-seeking and incur costs when they invest in the technology, and they must reach a bargaining agreement on how to share the revenue for the technology to reach the market.
- Score: 10.36010442870647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Major advances in Machine Learning (ML) and Artificial Intelligence (AI)
increasingly take the form of developing and releasing general-purpose models.
These models are designed to be adapted by other businesses and agencies to
perform a particular, domain-specific function. This process has become known
as adaptation or fine-tuning. This paper offers a model of the fine-tuning
process where a Generalist brings the technological product (here an ML model)
to a certain level of performance, and one or more Domain-specialist(s) adapts
it for use in a particular domain. Both entities are profit-seeking and incur
costs when they invest in the technology, and they must reach a bargaining
agreement on how to share the revenue for the technology to reach the market.
For a relatively general class of cost and revenue functions, we characterize
the conditions under which the fine-tuning game yields a profit-sharing
solution. We observe that any potential domain-specialization will either
contribute, free-ride, or abstain in their uptake of the technology, and we
provide conditions yielding these different strategies. We show how methods
based on bargaining solutions and sub-game perfect equilibria provide insights
into the strategic behavior of firms in these types of interactions, and we
find that profit-sharing can still arise even when one firm has significantly
higher costs than another. We also provide methods for identifying
Pareto-optimal bargaining arrangements for a general set of utility functions.
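As a rough illustration of the setup the abstract describes, the following minimal sketch solves a toy version of the game by backward induction and then picks a revenue share with a Nash bargaining rule. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear revenue function, the quadratic cost functions, the zero disagreement payoffs, the grid search, and the names `play`, `nash_bargain`, `delta`, `c0`, and `c1`.

```python
"""Toy fine-tuning game. All functional forms are illustrative assumptions."""


def play(delta, c0, c1):
    """Solve the investment subgame for a fixed revenue share delta.

    Assumed (hypothetical) forms:
      revenue            r(a1) = a1
      Generalist cost    c0 * a0**2
      Specialist cost    c1 * (a1 - a0)**2
      Generalist receives delta * r, Specialist receives (1 - delta) * r
    """
    # Generalist's optimal investment, anticipating the Specialist's response:
    # maximize delta * (a0 + const) - c0 * a0**2  =>  a0 = delta / (2 * c0)
    a0 = delta / (2 * c0)
    # Specialist's best response: maximize (1 - delta) * a1 - c1 * (a1 - a0)**2
    a1 = a0 + (1 - delta) / (2 * c1)
    u_generalist = delta * a1 - c0 * a0 ** 2
    u_specialist = (1 - delta) * a1 - c1 * (a1 - a0) ** 2
    return a0, a1, u_generalist, u_specialist


def nash_bargain(c0, c1, steps=1000):
    """Grid-search the share that maximizes the Nash product.

    The disagreement point is assumed to be (0, 0): no deal, no product.
    """
    def nash_product(delta):
        _, _, u_g, u_d = play(delta, c0, c1)
        return u_g * u_d

    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=nash_product)


if __name__ == "__main__":
    share = nash_bargain(c0=1.0, c1=1.0)
    print("bargained share:", share)
    print("outcome (a0, a1, u_G, u_D):", play(share, 1.0, 1.0))
```

In this toy setup, equal cost coefficients give an even split, and making the Specialist's cost coefficient ten times larger still yields a share strictly between 0 and 1, which loosely mirrors the abstract's remark that profit-sharing can arise when one firm has much higher costs. The paper's own results cover a more general class of cost and revenue functions than this sketch.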
Related papers
- Explainable Artificial Intelligence for identifying profitability predictors in Financial Statements [0.7067443325368975]
We apply Machine Learning techniques to raw financial statements data taken from AIDA, a Database comprising Italian listed companies' data from 2013 to 2022.
We present a comparative study of different models and following the European AI regulations, we complement our analysis by applying explainability techniques to the proposed models.
arXiv Detail & Related papers (2025-01-29T14:33:23Z)
- Adaptive$^2$: Adaptive Domain Mining for Fine-grained Domain Adaptation Modeling [50.85199749890184]
We propose Adaptive$^2$, a novel framework that learns domains adaptively using a domain mining module by self-supervision.
Results show that traditional domain adaptation methods with hand-crafted domains perform no better than single-domain models under fair FLOPS conditions.
Adaptive$^2$ is the first approach to automatically learn both domain identification and adaptation in online advertising.
arXiv Detail & Related papers (2024-12-11T08:41:41Z)
- Pricing and Competition for Generative AI [3.8677478583601776]
We explore the problem of how developers of new generative AI software can release and price their technology.
We first develop a comparison of two different models for a specific task with respect to user cost-effectiveness.
We then model the pricing problem of generative AI software as a game between two different companies.
arXiv Detail & Related papers (2024-11-04T22:52:45Z)
- Collaborative AI in Sentiment Analysis: System Architecture, Data Prediction and Deployment Strategies [3.3374611485861116]
Large language model (LLM) based artificial intelligence technologies have been a game-changer, particularly in sentiment analysis.
However, integrating diverse AI models for processing complex multimodal data and the associated high costs of feature extraction presents significant challenges.
This study introduces a collaborative AI framework designed to efficiently distribute and resolve tasks across various AI systems.
arXiv Detail & Related papers (2024-10-17T06:14:34Z)
- Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study [0.3932300766934226]
This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG).
Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accuracy than generic models.
We propose a structured technical design space capturing major technical components of Q&A AI, and provide recommendations for making high-impact technical choices for such components.
arXiv Detail & Related papers (2024-04-17T23:00:03Z)
- LMaaS: Exploring Pricing Strategy of Large Model as a Service for Communication [11.337245234301857]
We argue that a pay-as-you-go service mode will be suitable in this context, referred to as Large Model as a Service (LMaaS).
We propose an Iterative Model Pricing (IMP) algorithm that optimizes the prices of large models iteratively by reasoning about customers' future rental decisions.
In the second step, we optimize customers' selection decisions by designing a robust selecting and renting algorithm.
arXiv Detail & Related papers (2024-01-05T07:19:19Z)
- Refined Mechanism Design for Approximately Structured Priors via Active Regression [50.71772232237571]
We consider the problem of a revenue-maximizing seller with a large number of items for sale to $n$ strategic bidders.
It is well-known that optimal and even approximately-optimal mechanisms for this setting are notoriously difficult to characterize or compute.
arXiv Detail & Related papers (2023-10-11T20:34:17Z)
- General Purpose Artificial Intelligence Systems (GPAIS): Properties, Definition, Taxonomy, Societal Implications and Responsible Governance [16.030931070783637]
General-Purpose Artificial Intelligence Systems (GPAIS) has been defined to refer to these AI systems.
To date, the possibility of an Artificial General Intelligence, powerful enough to perform any intellectual task as if it were human, or even improve it, has remained an aspiration, fiction, and considered a risk for our society.
This work discusses existing definitions for GPAIS and proposes a new definition that allows for a gradual differentiation among types of GPAIS according to their properties and limitations.
arXiv Detail & Related papers (2023-07-26T16:35:48Z)
- Incentive Mechanism Design for Unbiased Federated Learning with Randomized Client Participation [31.2017942327673]
This paper proposes a game theoretic incentive mechanism for federated learning (FL) with randomized client participation.
We show that our mechanism achieves higher model performance for the server as well as higher profits for the clients.
arXiv Detail & Related papers (2023-04-17T04:05:57Z)
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
- AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges [60.56413461109281]
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big data generated by IT Operations processes.
We discuss in depth the key types of data emitted by IT Operations activities, the scale and challenges in analyzing them, and where they can be helpful.
We categorize the key AIOps tasks as incident detection, failure prediction, root cause analysis, and automated actions.
arXiv Detail & Related papers (2023-04-10T15:38:12Z)
- Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations [16.48389671789281]
We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market.
By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors.
We show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption.
arXiv Detail & Related papers (2022-10-13T17:06:08Z)
- Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
- Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning [72.23843557783533]
We show that deep reinforcement learning can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types.
Our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing.
We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government who taxes and redistributes.
arXiv Detail & Related papers (2022-01-03T17:00:17Z)
- Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values [68.8204255655161]
This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values.
Results could have implications for non-discriminatory decision making, ethical and responsible AI-derived decisions or policy making under fairness constraints.
arXiv Detail & Related papers (2021-10-04T10:28:57Z)
- Decision Rule Elicitation for Domain Adaptation [93.02675868486932]
Human-in-the-loop machine learning is widely used in artificial intelligence (AI) to elicit labels from experts.
In this work, we allow experts to additionally produce decision rules describing their decision-making.
We show that decision rule elicitation improves domain adaptation of the algorithm and helps to propagate expert's knowledge to the AI model.
arXiv Detail & Related papers (2021-02-23T08:07:22Z)
- Portfolio Optimization with 2D Relative-Attentional Gated Transformer [9.541129630971689]
We propose a novel Deterministic Policy Gradient with 2D Relative-attentional Gated Transformer (DPGRGT) model.
Applying learnable relative positional embeddings for the time and assets axes, the model better understands the peculiar structure of the financial data.
In our experiment using U.S. stock market data of 20 years, our model outperformed baseline models and demonstrated its effectiveness.
arXiv Detail & Related papers (2020-12-27T14:08:26Z)
- Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z)
- VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)