Machine Learning Model Attribution Challenge
- URL: http://arxiv.org/abs/2302.06716v2
- Date: Wed, 15 Feb 2023 14:43:59 GMT
- Title: Machine Learning Model Attribution Challenge
- Authors: Elizabeth Merkhofer, Deepesh Chaudhari, Hyrum S. Anderson, Keith
Manville, Lily Wong, João Gante
- Abstract summary: Fine-tuned machine learning models may derive from other trained models without obvious attribution characteristics.
In this challenge, participants identify the publicly-available base models that underlie a set of anonymous, fine-tuned large language models.
- Score: 2.6532805035238747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the findings of the Machine Learning Model Attribution Challenge
https://mlmac.io. Fine-tuned machine learning models may derive from other
trained models without obvious attribution characteristics. In this challenge,
participants identify the publicly-available base models that underlie a set of
anonymous, fine-tuned large language models (LLMs) using only textual output of
the models. Contestants aim to correctly attribute the most fine-tuned models,
with ties broken in favor of contestants whose solutions use fewer calls to
the fine-tuned models' API. The most successful approaches were manual, as
participants observed similarities between model outputs and developed
attribution heuristics based on public documentation of the base models, though
several teams also submitted automated, statistical solutions.
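As an illustration of the automated, statistical solutions mentioned above, the following Python sketch attributes each anonymous fine-tuned model to the candidate base model whose outputs it most resembles on a shared set of prompts. The character n-gram Jaccard similarity and the toy data are illustrative assumptions, not the method of any particular contestant:

def char_ngrams(text, n=4):
    """Return the set of character n-grams in a text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def attribute(finetuned_outputs, base_outputs):
    """Map each fine-tuned model name to the most similar base model.

    Both arguments map a model name to a list of generated continuations
    for the same fixed list of prompts.
    """
    attribution = {}
    for ft_name, ft_texts in finetuned_outputs.items():
        ft_grams = char_ngrams(" ".join(ft_texts))
        # Score every candidate base model and keep the best match.
        scores = {
            base_name: jaccard(ft_grams, char_ngrams(" ".join(base_texts)))
            for base_name, base_texts in base_outputs.items()
        }
        attribution[ft_name] = max(scores, key=scores.get)
    return attribution

# Toy usage (real outputs would come from the models' APIs):
finetuned = {"model_0": ["The capital of France is Paris, of course."]}
bases = {
    "gpt2": ["The capital of France is Paris."],
    "opt": ["France's capital city is called Paris."],
}
print(attribute(finetuned, bases))  # -> {'model_0': 'gpt2'} on this toy data

Since ties in the challenge are broken by the number of API calls, a real contestant would also keep the shared prompt list as small as possible.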
Related papers
- Enabling Small Models for Zero-Shot Classification through Model Label Learning [50.68074833512999]
We introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap between models and their functionalities.
Experiments on seven real-world datasets validate the effectiveness and efficiency of MLL.
arXiv Detail & Related papers (2024-08-21T09:08:26Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space (a minimal merging sketch follows this entry).
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
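The parameter-space merging idea above can be illustrated with a short Python sketch; uniform weight averaging stands in for the paper's actual fusion method, which weights parameters more carefully:

import numpy as np

def merge_state_dicts(state_dicts, weights=None):
    """Average several aligned parameter dictionaries into one."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for name in state_dicts[0]:
        # Models must share an architecture, hence identical keys and shapes.
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# Toy usage with two "models" of a single weight matrix each:
a = {"linear.weight": np.ones((2, 2))}
b = {"linear.weight": 3 * np.ones((2, 2))}
print(merge_state_dicts([a, b])["linear.weight"])  # every entry is 2.0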
- Foundation models in brief: A historical, socio-technical focus [2.5991265608180396]
Foundation models can be disruptive for future AI development by scaling up deep learning.
Models achieve state-of-the-art performance on a variety of tasks in domains such as natural language processing and computer vision.
arXiv Detail & Related papers (2022-12-17T22:11:33Z)
- Artificial Interrogation for Attributing Language Models [0.0]
The challenge provides twelve open-sourced base versions of popular language models and twelve fine-tuned language models for text generation.
The goal of the contest is to identify which fine-tuned models originated from which base model.
We employ four distinct approaches to measure the resemblance between the responses generated by the models of the two sets (one illustrative measure is sketched after this entry).
arXiv Detail & Related papers (2022-11-20T05:46:29Z)
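A rough Python sketch of this interrogation setup, assuming a bag-of-words cosine as one stand-in resemblance measure and a greedy one-to-one matching between the two model sets (the paper's four actual measures may differ):

import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def match_models(ft_responses, base_responses):
    """Greedily assign each fine-tuned model to a distinct base model.

    Both arguments map a model name to the concatenated text of its responses.
    """
    ft_bags = {k: Counter(v.lower().split()) for k, v in ft_responses.items()}
    base_bags = {k: Counter(v.lower().split()) for k, v in base_responses.items()}
    # Rank all (fine-tuned, base) pairs by similarity, best first.
    pairs = sorted(((cosine(fb, bb), f, b)
                    for f, fb in ft_bags.items()
                    for b, bb in base_bags.items()), reverse=True)
    assignment, used = {}, set()
    for _, f, b in pairs:
        if f not in assignment and b not in used:
            assignment[f] = b
            used.add(b)
    return assignment

# Usage: match_models({"ft_0": "text ..."}, {"gpt2": "text ...", "bloom": "text ..."})

The greedy assignment exploits the contest structure: with twelve fine-tuned and twelve base models, each base should be used at most once.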
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that no single model works best for all cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models.
We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization.
We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer (a minimal consensus loop is sketched after this entry).
arXiv Detail & Related papers (2022-10-20T18:46:31Z)
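A toy Python sketch of the closed-loop generate-score-refine pattern described above; the generator and scorers below are simple numeric stand-ins for pre-trained models:

import random

def iterative_consensus(generate, scorers, rounds=10, beam=8):
    """Return the candidate with the highest average score across scorers."""
    best, best_score = None, float("-inf")
    seed = None
    for _ in range(rounds):
        candidates = [generate(seed) for _ in range(beam)]
        # Consensus score: average the scorers' judgments of each candidate.
        scored = [(sum(s(c) for s in scorers) / len(scorers), c) for c in candidates]
        score, cand = max(scored)
        if score > best_score:
            best, best_score = cand, score
        seed = best  # close the loop: feed the current best back to the generator
    return best, best_score

# Toy problem: the generator proposes numbers near the seed, and each scorer
# prefers numbers close to its own target; consensus lands between the targets.
gen = lambda seed: (seed if seed is not None else 0.0) + random.uniform(-1, 1)
scorers = [lambda x: -abs(x - 3.0), lambda x: -abs(x - 5.0)]
print(iterative_consensus(gen, scorers))  # drifts toward ~4.0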
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
You are given access to a set of expert models and their predictions, alongside some limited information about the data used to train them (an instance-wise weighting sketch follows this entry).
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
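A minimal Python sketch of instance-wise unsupervised ensembling in this spirit: each expert is weighted per test point by how close the point lies to that expert's training region. Summarizing each training set by its mean input and using a Gaussian kernel are simplifying assumptions, not the paper's actual representation:

import numpy as np

def instance_wise_ensemble(x, experts, train_means, bandwidth=1.0):
    """Combine expert predictions for one test point x.

    experts     -- list of callables mapping an input vector to a prediction
    train_means -- mean training input of each expert (the "limited
                   information" available about their training sets)
    """
    dists = np.array([np.linalg.norm(x - m) for m in train_means])
    logits = -dists ** 2 / (2 * bandwidth ** 2)
    weights = np.exp(logits - logits.max())  # softmax over proximity
    weights /= weights.sum()
    preds = np.array([f(x) for f in experts])
    return float(weights @ preds)

# Toy usage: two experts, each reliable near its own training region.
experts = [lambda x: 0.0, lambda x: 1.0]
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
print(instance_wise_ensemble(np.array([3.5, 4.2]), experts, means))  # ~1.0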
- Revealing Secrets From Pre-trained Models [2.0249686991196123]
Transfer learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and the models fine-tuned from them have highly similar weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by a black-box victim model (a weight-similarity fingerprinting sketch follows this entry).
arXiv Detail & Related papers (2022-07-19T20:19:03Z)
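The weight-similarity observation above suggests a simple fingerprinting check, sketched below in Python. Direct access to the victim's weights, and matching architectures across all candidates, are assumed here for illustration; the paper recovers such information in a black-box setting instead:

import numpy as np

def flatten(state_dict):
    """Concatenate all parameter arrays into one vector (fixed key order)."""
    return np.concatenate([state_dict[k].ravel() for k in sorted(state_dict)])

def identify_base(victim, candidates):
    """Return the candidate whose weights are most similar to the victim's."""
    v = flatten(victim)
    def cos(name):
        c = flatten(candidates[name])
        return float(v @ c / (np.linalg.norm(v) * np.linalg.norm(c)))
    return max(candidates, key=cos)

# Toy usage: the victim was "fine-tuned" from base A with a tiny perturbation.
base_a = {"w": np.ones(4)}
base_b = {"w": np.array([1.0, -1.0, 1.0, -1.0])}
victim = {"w": np.array([1.01, 0.99, 1.00, 1.02])}
print(identify_base(victim, {"A": base_a, "B": base_b}))  # -> 'A'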
- What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top-performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs (a minimal perturbation test is sketched after this entry).
arXiv Detail & Related papers (2020-11-20T21:27:10Z)
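A minimal Python sketch of one zero-information perturbation test of the kind described above: blank out every question and check whether accuracy stays above chance, which would indicate reliance on answer-option artifacts rather than the question itself. The model interface is a hypothetical stand-in:

import random

def zero_information_accuracy(model, dataset):
    """Accuracy when every question is blanked out.

    dataset -- list of (question, options, correct_index) triples
    model   -- callable (question, options) -> predicted option index
    """
    correct = 0
    for _question, options, answer in dataset:
        if model("", options) == answer:  # the question carries no information
            correct += 1
    return correct / len(dataset)

# A model that truly needs the question should score near chance (1/num_options).
dummy = lambda q, opts: random.randrange(len(opts))
data = [("Q?", ["a", "b", "c", "d"], 0)] * 1000
print(zero_information_accuracy(dummy, data))  # ~0.25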