ModelDiff: A Framework for Comparing Learning Algorithms
- URL: http://arxiv.org/abs/2211.12491v1
- Date: Tue, 22 Nov 2022 18:56:52 GMT
- Title: ModelDiff: A Framework for Comparing Learning Algorithms
- Authors: Harshay Shah, Sung Min Park, Andrew Ilyas, Aleksander Madry
- Abstract summary: We study the problem of (learning) algorithm comparison, where the goal is to find differences between models trained with two different learning algorithms.
We present ModelDiff, a method that leverages the datamodels framework to compare learning algorithms based on how they use their training data.
- Score: 86.19580801269036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of (learning) algorithm comparison, where the goal is to
find differences between models trained with two different learning algorithms.
We begin by formalizing this goal as one of finding distinguishing feature
transformations, i.e., input transformations that change the predictions of
models trained with one learning algorithm but not the other. We then present
ModelDiff, a method that leverages the datamodels framework (Ilyas et al.,
2022) to compare learning algorithms based on how they use their training data.
We demonstrate ModelDiff through three case studies, comparing models trained
with/without data augmentation, with/without pre-training, and with different
SGD hyperparameters. Our code is available at
https://github.com/MadryLab/modeldiff .
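As a rough illustration of the comparison idea (not the repository's actual API), the sketch below assumes precomputed datamodel weight matrices theta_a and theta_b for the two algorithms, removes from one the component explained by the other, and reads off the top principal directions of the residual as candidate "distinguishing" directions over the training set. The function name and the row-wise projection are simplifying assumptions.

```python
import numpy as np

def distinguishing_directions(theta_a, theta_b, k=3):
    """Sketch: directions over training examples that algorithm A's
    datamodels emphasize but algorithm B's do not.

    theta_a, theta_b: (n_test, n_train) datamodel weight matrices,
    assumed precomputed (e.g. via the datamodels framework).
    """
    # Normalize each test example's datamodel vector.
    a = theta_a / (np.linalg.norm(theta_a, axis=1, keepdims=True) + 1e-12)
    b = theta_b / (np.linalg.norm(theta_b, axis=1, keepdims=True) + 1e-12)

    # Simplification: remove, row by row, the part of A's datamodel that
    # lies along B's datamodel for the same test example.
    residual = a - np.sum(a * b, axis=1, keepdims=True) * b

    # Top right-singular vectors of the residual matrix are directions
    # over training examples where the two algorithms differ most.
    _, _, vt = np.linalg.svd(residual, full_matrices=False)
    return vt[:k]
```

Inspecting the highest-magnitude training examples in each returned direction is one way to surface subpopulations on which the two algorithms rely differently.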
Related papers
- MUSO: Achieving Exact Machine Unlearning in Over-Parameterized Regimes [19.664090734076712]
Machine unlearning (MU) makes a well-trained model behave as if it had never been trained on specific data.
We propose an alternating optimization algorithm that unifies the tasks of unlearning and relabeling.
Numerical experiments confirm the algorithm's effectiveness and show strong unlearning performance across a range of scenarios.
arXiv Detail & Related papers (2024-10-11T06:17:17Z)
- Interpretable Differencing of Machine Learning Models [20.99877540751412]
We formalize the problem of model differencing as one of predicting a dissimilarity function of two ML models' outputs.
A Joint Surrogate Tree (JST) is composed of two conjoined decision tree surrogates for the two models.
A JST provides an intuitive representation of differences and places the changes in the context of the models' decision logic.
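A minimal stand-in for this idea (not the paper's JST construction) is to fit a single interpretable surrogate that predicts where two models' outputs disagree; the model and data names below are placeholders.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

def disagreement_surrogate(model_a, model_b, X, max_depth=3):
    """Fit a shallow tree that predicts where the two models disagree,
    giving an interpretable description of the difference region.
    (The paper instead conjoins two tree surrogates into a JST.)"""
    disagree = (model_a.predict(X) != model_b.predict(X)).astype(int)
    return DecisionTreeClassifier(max_depth=max_depth).fit(X, disagree)

# e.g. print(export_text(disagreement_surrogate(m1, m2, X_val)))
```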
arXiv Detail & Related papers (2023-06-10T16:15:55Z)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we examine the claim of a "free lunch" hypothesis.
For data distributions, we study how mixing programming and natural languages and training for multiple epochs affect model performance.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Their training data, however, is often unavailable, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
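The simplest instance of parameter-space merging is a uniform average of matching parameters across checkpoints; the sketch below shows only that baseline, not the paper's more careful data-free weighted merging rule.

```python
import torch

def average_state_dicts(state_dicts):
    """Uniform parameter-space merge: average each parameter tensor
    across a list of checkpoints with identical architectures."""
    return {
        name: torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
        for name in state_dicts[0]
    }
```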
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Tree-Based Adaptive Model Learning [62.997667081978825]
We extend the Kearns-Vazirani learning algorithm to handle systems that change over time.
We present a new learning algorithm that can reuse and update previously learned behavior, implement it in the LearnLib library, and evaluate it on large examples.
arXiv Detail & Related papers (2022-08-31T21:24:22Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
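For reference, classical linear CCA (not the paper's dynamically-scaled deep variant) can be run directly; the toy data below is synthetic.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 2))  # latent signal shared by both views
X = shared @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(200, 10))
Y = shared @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))

cca = CCA(n_components=2).fit(X, Y)
X_c, Y_c = cca.transform(X, Y)
# Each pair of columns is a maximally correlated pair of linear projections.
print([round(np.corrcoef(X_c[:, i], Y_c[:, i])[0, 1], 3) for i in range(2)])
```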
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Merging Models with Fisher-Weighted Averaging [24.698591753644077]
We introduce a fundamentally different method for transferring knowledge across models that amounts to "merging" multiple models into one.
Our approach effectively involves computing a weighted average of the models' parameters.
We show that our merging procedure makes it possible to combine models in previously unexplored ways.
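A hedged sketch of the weighted average: each parameter is averaged across models with per-parameter weights given by diagonal Fisher information estimates, which are assumed to be precomputed (e.g. as averaged squared gradients of the log-likelihood).

```python
import torch

def fisher_weighted_merge(state_dicts, fishers, eps=1e-8):
    """Per-parameter weighted average of checkpoints, where each model's
    weight for a parameter is its diagonal Fisher estimate (precomputed)."""
    merged = {}
    for name in state_dicts[0]:
        num = sum(f[name] * sd[name] for sd, f in zip(state_dicts, fishers))
        den = sum(f[name] for f in fishers) + eps
        merged[name] = num / den
    return merged
```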
arXiv Detail & Related papers (2021-11-18T17:59:35Z)
- ModelDiff: Testing-Based DNN Similarity Comparison for Model Reuse Detection [9.106864924968251]
ModelDiff is a testing-based approach to deep learning model similarity comparison.
A study on mobile deep learning apps has shown the feasibility of ModelDiff on real-world models.
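A simplified sketch of the testing-based comparison: characterize each model by the pairwise distances between its outputs on a shared probe set, then compare the resulting profiles with cosine similarity (the paper generates its probe inputs more carefully; the names here are illustrative).

```python
import numpy as np

def behavioral_similarity(outputs_a, outputs_b):
    """Cosine similarity between the two models' pairwise output-distance
    profiles on the same probe inputs; outputs_* are (n_probes, n_classes)."""
    def profile(out):
        d = np.linalg.norm(out[:, None, :] - out[None, :, :], axis=-1)
        return d[np.triu_indices(len(out), k=1)]
    pa, pb = profile(outputs_a), profile(outputs_b)
    return float(pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb) + 1e-12))
```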
arXiv Detail & Related papers (2021-06-11T15:16:18Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
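Only the core multiplicative-weight update is sketched below (the full graphical-model learner wraps this primitive around node-wise regressions); losses are assumed to lie in [0, 1].

```python
import numpy as np

def multiplicative_weights(loss_rounds, eta=0.1):
    """Maintain a distribution over n experts; after each round,
    downweight every expert exponentially in its observed loss."""
    n = loss_rounds.shape[1]
    w = np.ones(n) / n
    for losses in loss_rounds:  # loss_rounds: (T, n) array of losses in [0, 1]
        w *= np.exp(-eta * losses)
        w /= w.sum()
    return w
```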
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.