Related papers: Learning Model Representations Using Publicly Available Model Hubs

URL: http://arxiv.org/abs/2510.02096v1
Date: Thu, 02 Oct 2025 15:04:31 GMT
Title: Learning Model Representations Using Publicly Available Model Hubs
Authors: Damian Falk, Konstantin Schürholt, Konstantinos Tzevelekakis, Léo Meynent, Damian Borth
Abstract summary: We propose a new weight space backbone designed to handle unstructured model populations. We demonstrate that weight space representations trained on models from Hugging Face achieve strong performance. We show that high-quality weight space representations can be learned in the wild.
Score: 10.787107620883946
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The weights of neural networks have emerged as a novel data modality, giving rise to the field of weight space learning. A central challenge in this area is that learning meaningful representations of weights typically requires large, carefully constructed collections of trained models, typically referred to as model zoos. These model zoos are often trained ad-hoc, requiring large computational resources, constraining the learned weight space representations in scale and flexibility. In this work, we drop this requirement by training a weight space learning backbone on arbitrary models downloaded from large, unstructured model repositories such as Hugging Face. Unlike curated model zoos, these repositories contain highly heterogeneous models: they vary in architecture and dataset, and are largely undocumented. To address the methodological challenges posed by such heterogeneity, we propose a new weight space backbone designed to handle unstructured model populations. We demonstrate that weight space representations trained on models from Hugging Face achieve strong performance, often outperforming backbones trained on laboratory-generated model zoos. Finally, we show that the diversity of the model weights in our training set allows our weight space model to generalize to unseen data modalities. By demonstrating that high-quality weight space representations can be learned in the wild, we show that curated model zoos are not indispensable, thereby overcoming a strong limitation currently faced by the weight space learning community.

Related papers

LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence [61.46575527504109]
LimiX-16M and LimiX-2M treat structured data as a joint distribution over variables and missingness. We evaluate LimiX models across 11 large structured-data benchmarks with broad regimes of sample size, feature dimensionality, class number, categorical-to-numerical feature ratio, missingness, and sample-to-feature ratios.
arXiv Detail & Related papers (2025-09-03T17:39:08Z)
Evolution without Large Models: Training Language Model with Task Principles [52.44569608690695]
A common training approach for language models involves using a large-scale language model to expand a human-provided dataset. This method significantly reduces training costs by eliminating the need for extensive human data annotation. However, it still faces challenges such as high carbon emissions during data augmentation and the risk of data leakage.
arXiv Detail & Related papers (2025-07-08T13:52:45Z)
GRAM: A Generative Foundation Reward Model for Reward Generalization [48.63394690265176]
We develop a generative reward model that is first trained via large-scale unsupervised learning and then fine-tuned via supervised learning. This model generalizes well across several tasks, including response ranking, reinforcement learning from human feedback, and task adaptation with fine-tuning.
arXiv Detail & Related papers (2025-06-17T04:34:27Z)
A Model Zoo of Vision Transformers [6.926413609535758]
We introduce the first model zoo of vision transformers (ViT). To better represent recent training approaches, we develop a new blueprint for model zoo generation that encompasses both pre-training and fine-tuning steps. They are carefully generated with a large span of generating factors, and their diversity is validated using a thorough choice of weight-space and behavioral metrics.
arXiv Detail & Related papers (2025-04-14T13:52:26Z)
The Impact of Model Zoo Size and Composition on Weight Space Learning [8.11780615053558]
Re-using trained neural network models is a common strategy to reduce training cost and transfer knowledge. Weight space learning is a promising new field to re-use populations of pre-trained models for future tasks. We propose a modification to a common weight space learning method to accommodate training on heterogeneous populations of models.
arXiv Detail & Related papers (2025-04-14T11:54:06Z)
Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset. We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding. Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
Learning the 3D Fauna of the Web [70.01196719128912]
We develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly. One crucial bottleneck of modeling animals is the limited availability of training data. We show that prior category-specific attempts fail to generalize to rare species with limited training images.
arXiv Detail & Related papers (2024-01-04T18:32:48Z)
Model Zoos: A Dataset of Diverse Populations of Neural Network Models [2.7167743929103363]
We publish a novel dataset of model zoos containing systematically generated and diverse populations of neural network models. The dataset can be found at www.modelzoos.cc.
arXiv Detail & Related papers (2022-09-29T13:20:42Z)
Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights [2.9678808525128813]
We extend hyper-representations for generative use to sample new model weights. Our results indicate the potential of knowledge aggregation from model zoos to new models via hyper-representations.
arXiv Detail & Related papers (2022-09-29T12:53:58Z)
Hyper-Representations for Pre-Training and Transfer Learning [2.9678808525128813]
We extend hyper-representations for generative use to sample new model weights as pre-training. Our results indicate the potential of knowledge aggregation from model zoos to new models via hyper-representations.
arXiv Detail & Related papers (2022-07-22T09:01:21Z)
S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic models that are capable of simultaneously exploiting both modular and temporal structures. We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.