XTransfer: Modality-Agnostic Few-Shot Model Transfer for Human Sensing at the Edge
- URL: http://arxiv.org/abs/2506.22726v2
- Date: Sun, 28 Sep 2025 12:19:55 GMT
- Title: XTransfer: Modality-Agnostic Few-Shot Model Transfer for Human Sensing at the Edge
- Authors: Yu Zhang, Xi Zhang, Hualin Zhou, Xinyuan Chen, Shang Gao, Hong Jia, Jianfei Yang, Yuankai Qi, Tao Gu
- Abstract summary: XTransfer is a first-of-its-kind method enabling modality-agnostic, few-shot model transfer with resource-efficient design.
It achieves state-of-the-art performance while significantly reducing the costs of sensor data collection, model training, and edge deployment.
- Score: 45.430391851892274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning for human sensing on edge systems presents significant potential for smart applications. However, its training and development are hindered by the limited availability of sensor data and resource constraints of edge systems. While transferring pre-trained models to different sensing applications is promising, existing methods often require extensive sensor data and computational resources, resulting in high costs and poor adaptability in practice. In this paper, we propose XTransfer, a first-of-its-kind method enabling modality-agnostic, few-shot model transfer with resource-efficient design. XTransfer flexibly uses single or multiple pre-trained models and transfers knowledge across different modalities by (i) model repairing that safely mitigates modality shift by adapting pre-trained layers with only few sensor data, and (ii) layer recombining that efficiently searches and recombines layers of interest from source models in a layer-wise manner to create compact models. We benchmark various baselines across diverse human sensing datasets spanning different modalities. Comprehensive results demonstrate that XTransfer achieves state-of-the-art performance while significantly reducing the costs of sensor data collection, model training, and edge deployment.
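The abstract describes two steps: (i) model repairing, which adapts pre-trained layers using only a few target-modality samples, and (ii) layer recombining, which searches source models layer-wise for layers to assemble into a compact model. Below is a deliberately minimal toy sketch of that two-step idea; the affine "layers", the greedy search, and the closed-form adapter are our own simplifications for illustration, not the paper's actual design.

```python
# Toy sketch of XTransfer's two steps (illustrative only; the affine-layer
# setup and the adapter-based "repair" are our own simplification).

class Layer:
    """A toy 'layer': an affine map y = w*x + b."""
    def __init__(self, w, b):
        self.w, self.b = w, b
    def __call__(self, x):
        return self.w * x + self.b

def forward(layers, x):
    for layer in layers:
        x = layer(x)
    return x

def mse(layers, data):
    return sum((forward(layers, x) - y) ** 2 for x, y in data) / len(data)

# Two pre-trained "source models" from other modalities.
source_a = [Layer(1.2, 0.1), Layer(0.9, -0.2)]
source_b = [Layer(0.8, 0.3), Layer(1.1, 0.4)]

# A handful of labelled target samples (few-shot): the target is y = 2x + 1.
few_shot = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0), (1.5, 4.0)]

# (ii) Layer recombining: greedy layer-wise search over the source layers,
# keeping whichever candidate minimises the few-shot loss at each depth.
compact = []
for depth in range(2):
    best = min((source_a[depth], source_b[depth]),
               key=lambda l: mse(compact + [l], few_shot))
    compact.append(best)

# (i) Model repairing: adapt the recombined stack to the shifted modality by
# fitting a closed-form affine adapter on the few-shot data.
h = [forward(compact, x) for x, _ in few_shot]
y = [t for _, t in few_shot]
mh, my = sum(h) / len(h), sum(y) / len(y)
a = sum((hi - mh) * (yi - my) for hi, yi in zip(h, y)) / \
    sum((hi - mh) ** 2 for hi in h)
compact.append(Layer(a, my - a * mh))

print(mse(compact, few_shot))  # near zero: repaired model fits the target
```

Because every map here is affine, the repair step fits the few-shot target exactly; the real method of course operates on deep nonlinear layers, where repair and search must trade off accuracy against model size.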
Related papers
- D-CAT: Decoupled Cross-Attention Transfer between Sensor Modalities for Unimodal Inference [3.6344649347926326]
Cross-modal transfer learning is used to improve multi-modal classification models.
Existing methods require paired sensor data at both training and inference.
We propose Decoupled Cross-Attention Transfer (D-CAT), a framework that aligns modality-specific representations without requiring joint sensor modality during inference.
arXiv Detail & Related papers (2025-09-11T10:54:07Z) - RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation [29.340362062804967]
Under constrained resources, training a smaller video generation model from scratch can outperform parameter-efficient tuning on larger models in downstream applications.
We propose a difficulty-adaptive curriculum learning method, which decomposes the sample entropy into static and adaptive components.
arXiv Detail & Related papers (2025-03-22T11:28:25Z) - Transfer Learning of Surrogate Models via Domain Affine Transformation Across Synthetic and Real-World Benchmarks [4.515998639772672]
Surrogate models are frequently employed as efficient substitutes for the costly execution of real-world processes.
This study focuses on transferring non-differentiable surrogate models from a source function to a target function.
We assume their domains are related by an unknown affine transformation, using only a limited amount of transfer data points evaluated on the target.
arXiv Detail & Related papers (2025-01-23T18:44:25Z) - SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications.
Current state-of-the-art methods focus on training innovative architectural designs on confined datasets.
We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z) - Transfer Learning on Multi-Dimensional Data: A Novel Approach to Neural Network-Based Surrogate Modeling [0.0]
Convolutional neural networks (CNNs) have gained popularity as the basis for such surrogate models.
We propose training a CNN surrogate model on a mixture of numerical solutions to both the $d$-dimensional problem and its ($d-1$)-dimensional approximation.
We demonstrate our approach on a multiphase flow test problem, using transfer learning to train a dense fully-convolutional encoder-decoder CNN on the two classes of data.
arXiv Detail & Related papers (2024-10-16T05:07:48Z) - Encapsulating Knowledge in One Prompt [56.31088116526825]
KiOP encapsulates knowledge from various models into a solitary prompt without altering the original models or requiring access to the training data.
From a practicality standpoint, this paradigm proves the effectiveness of Visual Prompt in data inaccessible contexts.
Experiments across various datasets and models demonstrate the efficacy of the proposed KiOP knowledge transfer paradigm.
arXiv Detail & Related papers (2024-07-16T16:35:23Z) - Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amount of unlabelled data.
In this paper, we re-visit transformers pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - On the Transferability of Learning Models for Semantic Segmentation for Remote Sensing Data [12.500746892824338]
Recent deep learning-based methods outperform traditional learning methods on remote sensing (RS) semantic segmentation/classification tasks.
Yet, there is no comprehensive analysis of their transferability, i.e., to which extent a model trained on a source domain can be readily applicable to a target domain.
This paper investigates the raw transferability of traditional and deep learning (DL) models, as well as the effectiveness of domain adaptation (DA) approaches.
arXiv Detail & Related papers (2023-10-16T15:13:36Z) - Informative Data Mining for One-Shot Cross-Domain Semantic Segmentation [84.82153655786183]
We propose a novel framework called Informative Data Mining (IDM) to enable efficient one-shot domain adaptation for semantic segmentation.
IDM provides an uncertainty-based selection criterion to identify the most informative samples, which facilitates quick adaptation and reduces redundant training.
Our approach outperforms existing methods and achieves a new state-of-the-art one-shot performance of 56.7%/55.4% on the GTA5/SYNTHIA to Cityscapes adaptation tasks.
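The IDM summary above hinges on an uncertainty-based selection criterion for picking the most informative samples. A minimal sketch of one common such criterion, predictive-entropy ranking, is shown below; this is our own hypothetical illustration of the general idea, not the authors' implementation.

```python
import math

# Hedged sketch of an uncertainty-based selection criterion in the spirit
# of IDM: rank samples by predictive entropy and keep the top-k most
# uncertain ones for adaptation. (Our own minimal version.)

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_informative(predictions, k):
    """Return indices of the k most uncertain (highest-entropy) samples."""
    ranked = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]),
                    reverse=True)
    return ranked[:k]

preds = [
    [0.98, 0.01, 0.01],   # confident  -> low entropy
    [0.34, 0.33, 0.33],   # uncertain  -> high entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.45, 0.05],
]
print(select_informative(preds, 2))  # -> [1, 3]
```

Selecting high-entropy samples concentrates the adaptation budget on inputs the model is least sure about, which is what lets one-shot adaptation avoid redundant training on samples the model already handles.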
arXiv Detail & Related papers (2023-09-25T15:56:01Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Modality-invariant Visual Odometry for Embodied Vision [1.7188280334580197]
Visual Odometry (VO) is a practical substitute for unreliable GPS and compass sensors.
Recent deep VO models limit themselves to a fixed set of input modalities, e.g., RGB and depth, while training on millions of samples.
We propose a Transformer-based modality-invariant VO approach that can deal with diverse or changing sensor suites of navigation agents.
arXiv Detail & Related papers (2023-04-29T21:47:12Z) - Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may stem from biases in data acquisition rather than from the underlying task.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.