MGit: A Model Versioning and Management System
- URL: http://arxiv.org/abs/2307.07507v1
- Date: Fri, 14 Jul 2023 17:56:48 GMT
- Title: MGit: A Model Versioning and Management System
- Authors: Wei Hao and Daniel Mendoza and Rafael da Silva and Deepak Narayanan and Amar Phanishayee
- Abstract summary: MGit is a model versioning and management system that makes it easier to store, test, update, and collaborate on model derivatives.
MGit is able to reduce the lineage graph's storage footprint by up to 7x and automatically update downstream models in response to updates to upstream models.
- Score: 7.2678752235785735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Models derived from other models are extremely common in machine learning
(ML) today. For example, transfer learning is used to create task-specific
models from "pre-trained" models through finetuning. This has led to an
ecosystem where models are related to each other, sharing structure and often
even parameter values. However, it is hard to manage these model derivatives:
the storage overhead of storing all derived models quickly becomes onerous,
prompting users to get rid of intermediate models that might be useful for
further analysis. Additionally, undesired behaviors in models are hard to track
down (e.g., is a bug inherited from an upstream model?). In this paper, we
propose a model versioning and management system called MGit that makes it
easier to store, test, update, and collaborate on model derivatives. MGit
introduces a lineage graph that records provenance and versioning information
between models, optimizations to efficiently store model parameters, as well as
abstractions over this lineage graph that facilitate relevant testing, updating
and collaboration functionality. MGit is able to reduce the lineage graph's
storage footprint by up to 7x and automatically update downstream models in
response to updates to upstream models.
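To make the lineage-graph idea concrete, below is a minimal sketch, not MGit's actual API: it records parent/child provenance between models and stores a derived model's parameters as a delta against its parent. The class and method names (LineageGraph, register, materialize) and the delta-based storage scheme are illustrative assumptions standing in for the abstract's unspecified "optimizations to efficiently store model parameters".

```python
# Minimal sketch of a model lineage graph with delta-based parameter storage.
# NOT MGit's actual API; names and the storage scheme are assumptions.
import numpy as np


class LineageGraph:
    def __init__(self):
        self.parent = {}   # model name -> parent name (None for roots)
        self.store = {}    # model name -> {param name: full array or delta}

    def register(self, name, params, parent=None):
        """Record a model; store only its difference from the parent, if any."""
        self.parent[name] = parent
        if parent is None:
            self.store[name] = {k: v.copy() for k, v in params.items()}
        else:
            base = self.materialize(parent)
            # Deltas of fine-tuned models are often small in magnitude,
            # so they may compress far better than full checkpoints.
            self.store[name] = {k: params[k] - base[k] for k in params}

    def materialize(self, name):
        """Reconstruct full parameters by walking up the lineage chain."""
        parent = self.parent[name]
        if parent is None:
            return {k: v.copy() for k, v in self.store[name].items()}
        base = self.materialize(parent)
        return {k: base[k] + self.store[name][k] for k in self.store[name]}


# Usage: a pretrained root and a fine-tuned derivative.
graph = LineageGraph()
pretrained = {"w": np.ones((4, 4)), "b": np.zeros(4)}
finetuned = {"w": pretrained["w"] + 0.01, "b": pretrained["b"]}
graph.register("base", pretrained)
graph.register("base-finetuned", finetuned, parent="base")
assert np.allclose(graph.materialize("base-finetuned")["w"], finetuned["w"])
```

A provenance record like this is also what would let a system trace whether a bug was inherited from an upstream model and re-derive downstream models when an upstream model is updated.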
Related papers
- Exploring Model Kinship for Merging Large Language Models [52.01652098827454]
We introduce model kinship, the degree of similarity or relatedness between Large Language Models.
We find that there is a certain relationship between model kinship and the performance gains after model merging.
We propose a new model merging strategy: Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets.
arXiv Detail & Related papers (2024-10-16T14:29:29Z) - What Matters for Model Merging at Scale? [94.26607564817786]
Model merging aims to combine multiple expert models into a more capable single model.
Previous studies have primarily focused on merging a few small models.
This study systematically evaluates the utility of model merging at scale.
arXiv Detail & Related papers (2024-10-04T17:17:19Z) - MUSCLE: A Model Update Strategy for Compatible LLM Evolution [29.032461144831053]
Large Language Models (LLMs) are regularly updated to enhance performance.
Instance-level degradation (instance regression) of performance from one model version to the next can interfere with a user's mental model of the capabilities of a particular language model.
We propose a training strategy to minimize the extent of instance regression in model updates.
arXiv Detail & Related papers (2024-07-12T17:12:48Z) - EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z) - Foundational GPT Model for MEG [3.524869467682149]
We propose two classes of deep learning foundational models that can be trained using forecasting of unlabelled brain signals.
First, we consider a modified Wavenet; and second, we consider a modified Transformer-based (GPT2) model.
We compare the performance of these deep learning models with standard linear autoregressive (AR) modelling on MEG data.
arXiv Detail & Related papers (2024-04-14T13:48:24Z) - Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data [49.73114504515852]
We show that replacing the original real data by each generation's synthetic data does indeed tend towards model collapse.
We demonstrate that accumulating the successive generations of synthetic data alongside the original real data avoids model collapse.
arXiv Detail & Related papers (2024-04-01T18:31:24Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space; a generic parameter-space merging sketch appears after this list.
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - Self-Updating Models with Error Remediation [0.5156484100374059]
We propose a framework, Self-Updating Models with Error Remediation (SUMER), in which a deployed model updates itself as new data becomes available.
A key component of SUMER is the notion of error remediation, since self-labeled data is susceptible to the propagation of errors.
We find that self-updating models (SUMs) generally perform better than models that do not attempt to self-update when presented with additional previously-unseen data.
arXiv Detail & Related papers (2020-05-19T23:09:38Z) - When Ensembling Smaller Models is More Efficient than Single Large Models [52.38997176317532]
We show that ensembles can outperform single models, achieving higher accuracy while requiring fewer total FLOPs to compute.
This presents an interesting observation that output diversity in ensembling can often be more efficient than training larger models.
arXiv Detail & Related papers (2020-05-01T18:56:18Z)
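Several of the related papers above (model kinship, model merging at scale, dataless knowledge fusion, EMR-Merging) concern combining models in parameter space. The sketch below shows the simplest such operation, uniform weight averaging of checkpoints fine-tuned from a shared base, together with a cosine-similarity proxy one might use to quantify "kinship" between two derivatives. Both functions are generic illustrations under those assumptions, not the specific algorithms proposed in the papers listed.

```python
# Generic parameter-space merging sketch: uniform averaging of checkpoints that
# share an architecture, plus a cosine-similarity "kinship" proxy on deltas.
# Illustrative assumptions only, not the algorithms from the papers above.
import numpy as np


def merge_average(models):
    """Average parameters element-wise across models with identical shapes."""
    keys = models[0].keys()
    return {k: np.mean([m[k] for m in models], axis=0) for k in keys}


def kinship(model_a, model_b, base):
    """Cosine similarity between the two models' deltas from a shared base."""
    da = np.concatenate([(model_a[k] - base[k]).ravel() for k in base])
    db = np.concatenate([(model_b[k] - base[k]).ravel() for k in base])
    return float(np.dot(da, db) / (np.linalg.norm(da) * np.linalg.norm(db) + 1e-12))


# Usage with toy checkpoints derived from one base model.
base = {"w": np.zeros((2, 2))}
task_a = {"w": base["w"] + np.array([[0.1, 0.0], [0.0, 0.1]])}
task_b = {"w": base["w"] + np.array([[0.0, 0.1], [0.1, 0.0]])}
merged = merge_average([task_a, task_b])
print(merged["w"])                    # element-wise mean of the two checkpoints
print(kinship(task_a, task_b, base))  # 0.0 for these orthogonal deltas
```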
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.