Simplifying Knowledge Transfer in Pretrained Models
- URL: http://arxiv.org/abs/2510.22208v1
- Date: Sat, 25 Oct 2025 08:18:41 GMT
- Title: Simplifying Knowledge Transfer in Pretrained Models
- Authors: Siddharth Jain, Shyamgopal Karthik, Vineet Gandhi
- Abstract summary: We propose to leverage large publicly available model repositories as an auxiliary source of model improvements. We introduce a data partitioning strategy where pretrained models autonomously adopt either the role of a student, seeking knowledge, or that of a teacher, imparting knowledge.
- Score: 15.328214419664748
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pretrained models are ubiquitous in the current deep learning landscape, offering strong results on a broad range of tasks. Recent works have shown that models differing in various design choices exhibit categorically diverse generalization behavior, resulting in one model grasping distinct data-specific insights unavailable to the other. In this paper, we propose to leverage large publicly available model repositories as an auxiliary source of model improvements. We introduce a data partitioning strategy where pretrained models autonomously adopt either the role of a student, seeking knowledge, or that of a teacher, imparting knowledge. Experiments across various tasks demonstrate the effectiveness of our proposed approach. In image classification, we improved the performance of ViT-B by approximately 1.4% through bidirectional knowledge transfer with ViT-T. For semantic segmentation, our method boosted all evaluation metrics by enabling knowledge transfer both within and across backbone architectures. In video saliency prediction, our approach achieved a new state-of-the-art. We further extend our approach to knowledge transfer between multiple models, leading to considerable performance improvements for all model participants.
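The abstract describes a data-partitioning strategy in which each pretrained model acts as a teacher on some samples and as a student on others. The paper's exact partitioning criterion is not given here; the sketch below is a minimal NumPy illustration of one plausible reading, where per-sample role assignment is decided by comparing the two models' cross-entropy losses (the loss-comparison rule and all function names are illustrative assumptions, not the authors' method).

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(logits, labels):
    # Per-sample cross-entropy loss.
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels] + 1e-12)

def partition_roles(logits_a, logits_b, labels):
    """Assign per-sample roles: on samples where model A has the lower
    loss, A acts as the teacher for B, and vice versa (ties go to B)."""
    loss_a = cross_entropy(logits_a, labels)
    loss_b = cross_entropy(logits_b, labels)
    a_teaches = loss_a < loss_b  # boolean mask over samples
    return a_teaches, ~a_teaches

# Toy example: 4 samples, 3 classes.
rng = np.random.default_rng(0)
labels = np.array([0, 1, 2, 1])
logits_a = rng.normal(size=(4, 3))
logits_b = rng.normal(size=(4, 3))
a_teaches, b_teaches = partition_roles(logits_a, logits_b, labels)

# During training, the student on each partition would minimize a
# distillation loss (e.g. KL divergence) against the teacher's
# softened distribution on those samples.
targets_for_b = softmax(logits_a[a_teaches])
targets_for_a = softmax(logits_b[b_teaches])
```

Because the two masks are complementary, every sample gets exactly one teacher, so knowledge flows in both directions over the dataset.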
Related papers
- SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model [49.65930977591188]
Multimodal embedding models aim to yield informative unified representations that empower diverse cross-modal tasks. We introduce SAIL-Embedding, an omni-modal embedding foundation model that addresses these issues through tailored training strategies and architectural design. Specifically, the content-aware progressive training aims to enhance the model's adaptability to diverse downstream tasks and master enriched cross-modal proficiency. The collaboration-aware recommendation enhancement training further adapts multimodal representations for recommendation scenarios by distilling knowledge from sequence-to-item and ID-to-item embeddings.
arXiv Detail & Related papers (2025-10-14T16:43:22Z) - UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models [62.76435672183968]
We introduce a novel framework, namely UNIFORM, for knowledge transfer from a diverse set of off-the-shelf models into one student model. We propose a dedicated voting mechanism to capture the consensus of knowledge both at the logit level and at the feature level. Experiments demonstrate that UNIFORM effectively enhances unsupervised object recognition performance compared to strong knowledge transfer baselines.
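UNIFORM's summary mentions a voting mechanism that captures consensus at the logit level. The sketch below shows one generic way such a consensus target could be built (majority vote over teacher predictions, averaging the logits of agreeing teachers); it is an illustrative assumption, not the paper's actual mechanism, and the function name and threshold are hypothetical.

```python
import numpy as np

def logit_consensus(teacher_logits, agree_threshold=0.5):
    """Majority vote over teacher argmax predictions. Returns the averaged
    logits of agreeing teachers as a distillation target, plus a mask of
    samples where a consensus (strict majority) was reached.

    teacher_logits: array of shape (num_teachers, num_samples, num_classes)
    """
    preds = teacher_logits.argmax(axis=-1)  # (T, N)
    T, N = preds.shape
    targets = np.zeros(teacher_logits.shape[1:])
    mask = np.zeros(N, dtype=bool)
    for i in range(N):
        votes, counts = np.unique(preds[:, i], return_counts=True)
        winner = votes[counts.argmax()]
        if counts.max() / T > agree_threshold:
            agreeing = preds[:, i] == winner  # teachers voting for the winner
            targets[i] = teacher_logits[agreeing, i].mean(axis=0)
            mask[i] = True
    return targets, mask
```

Samples without a majority could simply be excluded from the distillation loss via the returned mask.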
arXiv Detail & Related papers (2025-08-27T00:56:11Z) - Seeing Further on the Shoulders of Giants: Knowledge Inheritance for Vision Foundation Models [54.517276878748305]
Vision foundation models (VFMs) are predominantly developed using data-centric methods. Many open-source vision models have been pretrained on domain-specific data. We present a new model-driven approach for training VFMs through joint knowledge transfer and preservation.
arXiv Detail & Related papers (2025-08-20T13:30:23Z) - Incrementally Learning Multiple Diverse Data Domains via Multi-Source Dynamic Expansion Model [16.035374682124846]
Continual Learning seeks to develop a model capable of incrementally assimilating new information while retaining prior knowledge. This paper shifts focus to a more complex and realistic learning environment, characterized by data samples sourced from multiple distinct domains.
arXiv Detail & Related papers (2025-01-15T15:49:46Z) - An Active Learning Framework for Inclusive Generation by Large Language Models [32.16984263644299]
Large Language Models (LLMs) generate text representative of diverse sub-populations. We propose a novel clustering-based active learning framework, enhanced with knowledge distillation. We construct two new datasets in tandem with model training, showing a performance improvement of 2%-10% over baseline models.
arXiv Detail & Related papers (2024-10-17T15:09:35Z) - Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z) - Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU depth V2 and KITTI, and in the semantic segmentation task on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
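The "complementary" knowledge this summary refers to can be quantified with a simple proxy: the fraction of samples that exactly one of the two models classifies correctly. The sketch below is a hypothetical helper illustrating that measurement, not the paper's own metric.

```python
import numpy as np

def complementary_fraction(preds_a, preds_b, labels):
    """Return the fraction of samples only model A gets right and the
    fraction only model B gets right -- a simple proxy for the
    'complementary' knowledge each model could transfer to the other."""
    correct_a = preds_a == labels
    correct_b = preds_b == labels
    only_a = correct_a & ~correct_b  # A right, B wrong: A could teach B
    only_b = correct_b & ~correct_a  # B right, A wrong: B could teach A
    return only_a.mean(), only_b.mean()
```

Non-zero values in both directions would indicate that each model holds data context unavailable to the other, motivating bidirectional transfer.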
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models [89.44031286278347]
We propose a Hub-Pathway framework to enable knowledge transfer from a model hub.
The proposed framework can be trained end-to-end with the target task-specific loss.
Experiment results on computer vision and reinforcement learning tasks demonstrate that the framework achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-06-08T08:00:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.