Embracing the Dark Knowledge: Domain Generalization Using Regularized
Knowledge Distillation
- URL: http://arxiv.org/abs/2107.02629v1
- Date: Tue, 6 Jul 2021 14:08:54 GMT
- Title: Embracing the Dark Knowledge: Domain Generalization Using Regularized
Knowledge Distillation
- Authors: Yufei Wang, Haoliang Li, Lap-pui Chau, Alex C. Kot
- Abstract summary: Lack of generalization capability in the absence of sufficient and representative data is one of the challenges that hinder the practical application of convolutional neural networks.
We propose a simple, effective, and plug-and-play training strategy named Knowledge Distillation for Domain Generalization (KDDG).
We find that both the "richer dark knowledge" from the teacher network and the gradient filter we propose can reduce the difficulty of learning the mapping.
- Score: 65.79387438988554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though convolutional neural networks are widely used in different tasks, lack
of generalization capability in the absence of sufficient and representative
data is one of the challenges that hinder their practical application. In this
paper, we propose a simple, effective, and plug-and-play training strategy
named Knowledge Distillation for Domain Generalization (KDDG) which is built
upon a knowledge distillation framework with the gradient filter as a novel
regularization term. We find that both the "richer dark knowledge" from the
teacher network and the gradient filter we propose can reduce the
difficulty of learning the mapping, which further improves the generalization
ability of the model. We also conduct extensive experiments to show that our
framework can significantly improve the generalization capability of deep
neural networks in different tasks, including image classification,
segmentation, and reinforcement learning, by comparing our method with existing
state-of-the-art domain generalization techniques. Last but not least, we
adopt two metrics to analyze the proposed method in order to better
understand how it benefits the generalization capability of deep neural
networks.
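To make the training recipe concrete, below is a minimal sketch of a distillation step with a gradient filter applied before the optimizer update. The temperature, the loss weighting, the `filter_gradients` helper, and its magnitude-based mask are illustrative assumptions; the abstract does not specify the exact form of the gradient filter used in KDDG.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard knowledge-distillation objective: temperature-scaled soft
    targets from the teacher ("dark knowledge") blended with the hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def filter_gradients(model, keep_ratio=0.7):
    """Illustrative gradient filter (an assumed stand-in, not KDDG's exact
    formulation): keep only the largest-magnitude gradient entries in each
    parameter tensor and zero out the rest."""
    for p in model.parameters():
        if p.grad is None:
            continue
        g = p.grad.abs().flatten()
        k = max(1, int(keep_ratio * g.numel()))
        threshold = torch.topk(g, k).values.min()
        p.grad.mul_((p.grad.abs() >= threshold).float())

def train_step(student, teacher, optimizer, x, y):
    """One hypothetical training step: distill from a frozen teacher,
    then filter the student's gradients before the optimizer update."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    loss = kd_loss(s_logits, t_logits, y)
    optimizer.zero_grad()
    loss.backward()
    filter_gradients(student)
    optimizer.step()
    return loss.item()
```

In this reading, the soft teacher targets smooth the learning target while the gradient mask acts as a regularizer on the update itself; both are intended to lower the difficulty of the mapping the student has to learn.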
Related papers
- Component-based Sketching for Deep ReLU Nets [55.404661149594375]
We develop a sketching scheme based on deep net components for various tasks.
We transform deep net training into a linear empirical risk minimization problem.
We show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions.
arXiv Detail & Related papers (2024-09-21T15:30:43Z)
- Towards Improving Robustness Against Common Corruptions using Mixture of Class Specific Experts [10.27974860479791]
This paper introduces a novel paradigm known as the Mixture of Class-Specific Expert Architecture.
The proposed architecture aims to mitigate vulnerabilities associated with common neural network structures.
arXiv Detail & Related papers (2023-11-16T20:09:47Z)
- Manifold Regularization for Memory-Efficient Training of Deep Neural Networks [18.554311679277212]
We propose a framework for improving memory efficiency when training traditional neural networks.
Using the framework yields improved absolute performance and a lower empirical generalization error relative to traditional learning techniques.
arXiv Detail & Related papers (2023-05-26T17:40:15Z)
- TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS), a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
arXiv Detail & Related papers (2023-03-09T18:57:13Z)
- Low-light Image Enhancement by Retinex Based Algorithm Unrolling and Adjustment [50.13230641857892]
We propose a new deep learning framework for the low-light image enhancement (LIE) problem.
The proposed framework contains a decomposition network inspired by algorithm unrolling, and adjustment networks considering both global brightness and local brightness sensitivity.
Experiments on a series of typical LIE datasets demonstrated the effectiveness of the proposed method, both quantitatively and visually, as compared with existing methods.
arXiv Detail & Related papers (2022-02-12T03:59:38Z)
- Explainability-aided Domain Generalization for Image Classification [0.0]
We show that applying methods and architectures from the explainability literature can achieve state-of-the-art performance for the challenging task of domain generalization.
We develop a set of novel algorithms including DivCAM, an approach where the network receives guidance during training via gradient based class activation maps to focus on a diverse set of discriminative features.
Since these methods offer competitive performance on top of explainability, we argue that the proposed methods can be used as a tool to improve the robustness of deep neural network architectures.
arXiv Detail & Related papers (2021-04-05T02:27:01Z)
- Sparsity Aware Normalization for GANs [32.76828505875087]
Generative adversarial networks (GANs) are known to benefit from regularization or normalization of their critic (discriminator) network during training.
In this paper, we analyze the popular spectral normalization scheme, find a significant drawback and introduce sparsity aware normalization (SAN), a new alternative approach for stabilizing GAN training.
arXiv Detail & Related papers (2021-03-03T15:05:18Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and offers adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Regularizing Meta-Learning via Gradient Dropout [102.29924160341572]
Meta-learning models are prone to overfitting when there are not enough training tasks for the meta-learners to generalize.
We introduce a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning; a brief illustrative sketch follows this list.
arXiv Detail & Related papers (2020-04-13T10:47:02Z)
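As referenced in the last entry, here is a minimal sketch of gradient dropout for gradient-based meta-learning. The `dropgrad` and `inner_loop_step` names, the dropout rate, the rescaling of surviving entries, and the single inner-loop step are assumptions made for illustration; the paper's exact formulation may differ.

```python
import torch

def dropgrad(grads, drop_rate=0.1):
    """Apply Bernoulli dropout to a list of gradient tensors, rescaling the
    surviving entries so the expected update magnitude is preserved
    (an assumed convention, mirroring standard dropout)."""
    kept = 1.0 - drop_rate
    return [g * ((torch.rand_like(g) < kept).float() / kept) for g in grads]

def inner_loop_step(model, loss, lr=0.01, drop_rate=0.1):
    """One inner-loop adaptation step with dropped-out gradients, returning
    fast weights for a subsequent meta-update."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    grads = dropgrad(grads, drop_rate)
    return [p - lr * g for p, g in zip(params, grads)]
```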