Distilling Universal and Joint Knowledge for Cross-Domain Model
Compression on Time Series Data
- URL: http://arxiv.org/abs/2307.03347v1
- Date: Fri, 7 Jul 2023 01:48:02 GMT
- Title: Distilling Universal and Joint Knowledge for Cross-Domain Model
Compression on Time Series Data
- Authors: Qing Xu, Min Wu, Xiaoli Li, Kezhi Mao, Zhenghua Chen
- Abstract summary: We propose a novel end-to-end framework called Universal and joint knowledge distillation (UNI-KD) for cross-domain model compression.
In particular, we propose to transfer both the universal feature-level knowledge across source and target domains and the joint logit-level knowledge shared by both domains from the teacher to the student model via an adversarial learning scheme.
- Score: 18.41222232863567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For many real-world time series tasks, the computational complexity of
prevalent deep learning models often hinders their deployment in resource-limited
environments (e.g., smartphones). Moreover, due to the inevitable domain shift
between the model training (source) and deployment (target) stages, compressing
those deep models under cross-domain scenarios becomes more challenging.
Although some existing works have already explored cross-domain knowledge
distillation for model compression, they are either biased toward source data or
heavily entangled between source and target data. To this end, we design a novel
end-to-end framework called Universal and joint knowledge distillation (UNI-KD)
for cross-domain model compression. In particular, we propose to transfer both
the universal feature-level knowledge across source and target domains and the
joint logit-level knowledge shared by both domains from the teacher to the
student model via an adversarial learning scheme. More specifically, a
feature-domain discriminator is employed to align teacher's and student's
representations for universal knowledge transfer. A data-domain discriminator
is utilized to prioritize the domain-shared samples for joint knowledge
transfer. Extensive experimental results on four time series datasets
demonstrate the superiority of our proposed method over state-of-the-art (SOTA)
benchmarks.
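The abstract describes the two discriminators only at a high level. Below is a minimal PyTorch sketch of one possible reading of that scheme; the teacher/student interface returning (features, logits), the domain-shared weighting rule, and the way the three losses are combined are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Discriminator(nn.Module):
    """Small binary classifier reused for both roles: the feature-domain
    discriminator (teacher features vs. student features) and the
    data-domain discriminator (source samples vs. target samples)."""

    def __init__(self, in_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)  # raw logits


def unikd_student_step(teacher, student, feat_disc, data_disc,
                       x_src, y_src, x_tgt, temperature=4.0):
    """One illustrative student update combining the three loss terms
    suggested by the abstract; the discriminators themselves would be
    trained in separate, alternating steps of the adversarial game."""
    teacher.eval()
    x = torch.cat([x_src, x_tgt], dim=0)

    with torch.no_grad():
        t_feat, t_logit = teacher(x)   # assumed (features, logits) interface
    s_feat, s_logit = student(x)

    # (1) Universal feature-level knowledge: the student tries to fool the
    # feature-domain discriminator so its features become indistinguishable
    # from the teacher's on both source and target data.
    real_label = torch.ones(s_feat.size(0), device=s_feat.device)
    adv_loss = F.binary_cross_entropy_with_logits(feat_disc(s_feat), real_label)

    # (2) Joint logit-level knowledge: weight the KD loss toward samples the
    # data-domain discriminator cannot confidently assign to either domain,
    # i.e. the domain-shared samples (one possible weighting rule).
    with torch.no_grad():
        p_src = torch.sigmoid(data_disc(s_feat))
        weight = 1.0 - (2.0 * p_src - 1.0).abs()

    kd = F.kl_div(
        F.log_softmax(s_logit / temperature, dim=-1),
        F.softmax(t_logit / temperature, dim=-1),
        reduction="none",
    ).sum(-1) * temperature ** 2
    kd_loss = (weight * kd).mean()

    # (3) Standard supervised loss on the labeled source portion of the batch.
    ce_loss = F.cross_entropy(s_logit[: x_src.size(0)], y_src)

    return ce_loss + kd_loss + adv_loss
```

In a full training loop, the feature-domain and data-domain discriminators would be optimized in alternation with the student, as in standard adversarial training.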
Related papers
- Domain Expansion and Boundary Growth for Open-Set Single-Source Domain Generalization [70.02187124865627]
Open-set single-source domain generalization aims to use a single-source domain to learn a robust model that can be generalized to unknown target domains.
We propose a novel learning approach based on domain expansion and boundary growth to expand the scarce source samples.
Our approach can achieve significant improvements and reach state-of-the-art performance on several cross-domain image classification datasets.
arXiv Detail & Related papers (2024-11-05T09:08:46Z)
- xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing [21.37585797507323]
Cross-domain policy transfer methods mostly aim at learning domain correspondences or corrections to facilitate policy learning.
We propose the Cross-Domain Trajectory EDiting framework that employs a specially designed diffusion model for cross-domain trajectory adaptation.
Our proposed model architecture effectively captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within target data.
arXiv Detail & Related papers (2024-09-13T10:07:28Z)
- UDON: Universal Dynamic Online distillatioN for generic image representations [5.487134463783365]
Universal image representations are critical in enabling real-world fine-grained and instance-level recognition applications.
Existing methods fail to capture important domain-specific knowledge and ignore differences in data distribution across different domains.
We introduce a new learning technique, dubbed UDON (Universal Dynamic Online DistillatioN).
UDON employs multi-teacher distillation, where each teacher is specialized in one domain, to transfer detailed domain-specific knowledge into the student universal embedding.
arXiv Detail & Related papers (2024-06-12T15:36:30Z)
- StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain.
State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data.
We propose StyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z)
- Federated Domain Generalization: A Survey [12.84261944926547]
In machine learning, data is often distributed across different devices, organizations, or edge nodes.
In response to this challenge, there has been a surge of interest in federated domain generalization.
This paper presents the first survey of recent advances in this area.
arXiv Detail & Related papers (2023-06-02T07:55:42Z)
- TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification [115.31432027711202]
We argue that both domain-specific and domain-invariant features are crucial for improving the generalization ability of re-id models.
We propose two-stream adaptive learning (TAL) to simultaneously model these two kinds of information.
Our framework can be applied to both single-source and multi-source domain generalization tasks.
arXiv Detail & Related papers (2021-11-29T01:27:42Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- Generalizable Person Re-identification with Relevance-aware Mixture of Experts [45.13716166680772]
We propose a novel method called the relevance-aware mixture of experts (RaMoE).
RaMoE uses an effective voting-based mixture mechanism to dynamically leverage source domains' diverse characteristics to improve the model's generalization.
Considering the target domains' invisibility during training, we propose a novel learning-to-learn algorithm combined with our relation alignment loss to update the voting network.
arXiv Detail & Related papers (2021-05-19T14:19:34Z)
- FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space [63.43592895652803]
Federated learning allows distributed medical institutions to collaboratively learn a shared prediction model with privacy protection.
At clinical deployment, however, the models trained in federated learning can still suffer from a performance drop when applied to completely unseen hospitals outside the federation.
We present a novel approach, named as Episodic Learning in Continuous Frequency Space (ELCFS), for this problem.
The effectiveness of our method is demonstrated with superior performance over state-of-the-art methods and in-depth ablation experiments on two medical image segmentation tasks.
arXiv Detail & Related papers (2021-03-10T13:05:23Z)
- Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation [56.694330303488435]
We propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework.
In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize the information propagation among semantically adjacent representations.
Our approach outperforms existing methods by a remarkable margin.
arXiv Detail & Related papers (2020-07-17T07:52:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.