Related papers: Match & Choose: Model Selection Framework for Fine-tuning Text-to-Image Diffusion Models

URL: http://arxiv.org/abs/2508.10993v1
Date: Thu, 14 Aug 2025 18:00:50 GMT
Title: Match & Choose: Model Selection Framework for Fine-tuning Text-to-Image Diffusion Models
Authors: Basile Lewandowski, Robert Birke, Lydia Y. Chen
Abstract summary: Text-to-image (T2I) models are often pretrained on large corpora, and openly shared on a model platform, such as HuggingFace. Users can then build up AI applications, e.g., generating media contents, by adopting pretrained T2I models and fine-tuning them on the target dataset. We propose the first model selection framework, M&C, which enables users to efficiently choose a pretrained T2I model from a model platform without exhaustively fine-tuning them all on the target dataset.
Score: 8.509633852396062
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Text-to-image (T2I) models based on diffusion and transformer architectures advance rapidly. They are often pretrained on large corpora, and openly shared on a model platform, such as HuggingFace. Users can then build up AI applications, e.g., generating media contents, by adopting pretrained T2I models and fine-tuning them on the target dataset. While public pretrained T2I models facilitate the democratization of the models, users face a new challenge: which model can be best fine-tuned based on the target data domain? Model selection is well addressed in classification tasks, but little is known in (pretrained) T2I models and their performance indication on the target domain. In this paper, we propose the first model selection framework, M&C, which enables users to efficiently choose a pretrained T2I model from a model platform without exhaustively fine-tuning them all on the target dataset. The core of M&C is a matching graph, which consists of: (i) nodes of available models and profiled datasets, and (ii) edges of model-data and data-data pairs capturing the fine-tuning performance and data similarity, respectively. We then build a model that, based on the inputs of model/data feature, and, critically, the graph embedding feature, extracted from the matching graph, predicts the model achieving the best quality after fine-tuning for the target domain. We evaluate M&C on choosing across ten T2I models for 32 datasets against three baselines. Our results show that M&C successfully predicts the best model for fine-tuning in 61.3% of the cases and a closely performing model for the rest.
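The matching graph described in the abstract can be illustrated with a small sketch. This is not the M&C method itself: the paper trains a predictor on model/data features and graph embeddings, whereas the code below only builds a toy graph and ranks models with a similarity-weighted average as a simple stand-in. All node names, edge weights, and the helper functions `build_graph` and `rank_models` are hypothetical, and the quality score is assumed to be "higher is better".

```python
from collections import defaultdict

# Hypothetical model-data edges: fine-tuning quality of each model on each
# profiled dataset (assumption: higher is better).
model_data_edges = {
    ("model_a", "data_1"): 0.80,
    ("model_a", "data_2"): 0.40,
    ("model_b", "data_1"): 0.50,
    ("model_b", "data_2"): 0.90,
}

# Hypothetical data-data edges: similarity in [0, 1] between the target
# dataset and each profiled dataset.
data_data_edges = {
    ("target", "data_1"): 0.9,
    ("target", "data_2"): 0.2,
}


def build_graph(model_data, data_data):
    """Store both edge types in one undirected weighted adjacency map."""
    graph = defaultdict(dict)
    for (u, v), weight in list(model_data.items()) + list(data_data.items()):
        graph[u][v] = weight
        graph[v][u] = weight
    return graph


def rank_models(graph, target, models):
    """Rank models by similarity-weighted average quality on profiled data.

    This is a simple neighbor-based stand-in, not the learned predictor
    from the paper.
    """
    scores = {}
    for model in models:
        weighted_sum = 0.0
        total_similarity = 0.0
        for dataset, quality in graph[model].items():
            similarity = graph[target].get(dataset)
            if similarity is None:
                continue  # no data-data edge between target and this dataset
            weighted_sum += similarity * quality
            total_similarity += similarity
        scores[model] = (
            weighted_sum / total_similarity if total_similarity else float("-inf")
        )
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


graph = build_graph(model_data_edges, data_data_edges)
ranking, scores = rank_models(graph, "target", ["model_a", "model_b"])
print(ranking[0], scores)
```

In this toy setup the target dataset is most similar to `data_1`, so the model that fine-tuned best on `data_1` is ranked first. A learned predictor over graph embeddings, as the abstract describes, would replace the weighted average.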

Related papers

Consensus-Driven Active Model Selection [29.150990754584978]
We propose a method for active model selection using predictions from candidate models to prioritize the labeling of test data points. Our method, CODA, performs consensus-driven active model selection by modeling relationships between categories, and data points. We validate our approach by curating a collection of 26 benchmark tasks capturing a range of model selection scenarios.
arXiv Detail & Related papers (2025-07-31T17:56:28Z)
Holmes: Towards Effective and Harmless Model Ownership Verification to Personalized Large Vision Models via Decoupling Common Features [54.63343151319368]
This paper proposes a harmless model ownership verification method for personalized models by decoupling similar common features. In the first stage, we create shadow models that retain common features of the victim model while disrupting dataset-specific features. After that, a meta-classifier is trained to identify stolen models by determining whether suspicious models contain the dataset-specific features of the victim.
arXiv Detail & Related papers (2025-06-24T15:40:11Z)
Approximating Language Model Training Data from Weights [70.08614275061689]
We formalize the problem of data approximation from model weights and propose several baselines and metrics. We develop a gradient-based approach that selects the highest-matching data from a large public text corpus. Even when none of the true training data is known, our method is able to locate a small subset of public Web documents.
arXiv Detail & Related papers (2025-06-18T15:26:43Z)
Anyprefer: An Agentic Framework for Preference Data Synthesis [62.3856754548222]
We propose Anyprefer, a framework designed to synthesize high-quality preference data for aligning the target model. External tools are introduced to assist the judge model in accurately rewarding the target model's responses. The synthesized data is compiled into a new preference dataset, Anyprefer-V1, consisting of 58K high-quality preference pairs.
arXiv Detail & Related papers (2025-04-27T15:21:59Z)
Model Selection with Model Zoo via Graph Learning [45.30615308692713]
We introduce TransferGraph, a novel framework that reformulates model selection as a graph learning problem. We demonstrate TransferGraph's effectiveness in capturing essential model-dataset relationships, yielding up to a 32% improvement in correlation between predicted performance and the actual fine-tuning results compared to the state-of-the-art methods.
arXiv Detail & Related papers (2024-04-05T09:50:00Z)
Transfer Learning with Point Transformers [3.678615604632945]
Point Transformers are state-of-the-art models for classification, segmentation, and detection on point cloud data. We explore two things: the classification performance of these attention-based networks on the ModelNet10 dataset, and the use of the trained model to classify the 3D MNIST dataset after fine-tuning.
arXiv Detail & Related papers (2024-04-01T01:23:58Z)
A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework. It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets. The proposed methodology has been shown to select a high-performing model about 3x faster than conventional baseline methods.
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
DsDm: Model-Aware Dataset Selection with Datamodels [81.01744199870043]
Standard practice is to filter for examples that match human notions of data quality. We find that selecting according to similarity with "high quality" data sources may not increase (and can even hurt) performance compared to randomly selecting data. Our framework avoids handpicked notions of data quality, and instead models explicitly how the learning process uses train datapoints to predict on the target tasks.
arXiv Detail & Related papers (2024-01-23T17:22:00Z)
Do the Frankenstein, or how to achieve better out-of-distribution performance with manifold mixing model soup [1.0878040851637998]
We show that the fused model gives significantly better out-of-distribution performance when fine-tuning a CLIP model for image classification. It also provides better accuracy on the original dataset on which the fine-tuning was done.
arXiv Detail & Related papers (2023-08-28T06:13:32Z)
Knowledge is a Region in Weight Space for Fine-tuned Language Models [48.589822853418404]
We study how the weight space and the underlying loss landscape of different models are interconnected. We show that language models that have been finetuned on the same dataset form a tight cluster in the weight space, while models finetuned on different datasets from the same underlying task form a looser cluster.
arXiv Detail & Related papers (2023-02-09T18:59:18Z)
Dataless Knowledge Fusion by Merging Weights of Language Models [47.432215933099016]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models. This creates a barrier to fusing knowledge across individual models to yield a better single model. We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
A linearized framework and a new benchmark for model selection for fine-tuning [112.20527122513668]
Fine-tuning from a collection of models pre-trained on different domains is emerging as a technique to improve test accuracy in the low-data regime. We introduce two new baselines for model selection -- Label-Gradient and Label-Feature Correlation. Our benchmark highlights accuracy gain with model zoo compared to fine-tuning Imagenet models.
arXiv Detail & Related papers (2021-01-29T21:57:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.