Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks
- URL: http://arxiv.org/abs/2210.13186v1
- Date: Fri, 21 Oct 2022 02:11:38 GMT
- Title: Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks
- Authors: Minsu Kim, Youngjoon Yu, Sungjune Park, Yong Man Ro
- Abstract summary: We introduce a novel approach that allows end-users to exploit pretrained DNN models in their own testing environment without modifying the models.
We present a meta input, which is an additional input transforming the distribution of testing data to be aligned with that of training data.
As a result, end-users can exploit well-trained models in their own testing environment which can differ from the training environment.
- Score: 29.975937981538664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep neural networks (DNNs) have achieved noticeable
progress in a wide range of research areas, they lack the adaptability to be
employed in real-world applications because of the environment discrepancy
problem. This problem originates from the difference between training and
testing environments, and it is widely known to cause serious performance
degradation when a pretrained DNN model is applied to a new testing
environment. Therefore, in this paper, we introduce a novel approach that
allows end-users to exploit pretrained DNN models in their own testing
environment without modifying the models. To this end, we present a
meta input, which is an additional input transforming the distribution
of testing data to be aligned with that of training data. The proposed meta
input can be optimized with only a small number of testing data by considering
the relation between testing input data and its output prediction. Also, it
does not require any knowledge of the network's internal architecture or
modification of its weight parameters. Then, the obtained meta input is added
to testing data in order to shift the distribution of testing data to that of
originally used training data. As a result, end-users can exploit well-trained
models in their own testing environment which can differ from the training
environment. We validate the effectiveness and versatility of the proposed meta
input by showing its robustness against the environment discrepancy through
comprehensive experiments on various tasks.
Related papers
- Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning [119.70303730341938]
We propose ePisode cUrriculum inveRsion (ECI) during data-free meta training and invErsion calibRation following inner loop (ICFIL) during meta testing.
ECI adaptively increases the difficulty level of pseudo episodes according to the real-time feedback of the meta model.
We formulate the optimization process of meta training with ECI as an adversarial form in an end-to-end manner.
arXiv Detail & Related papers (2023-03-20T15:10:41Z)
- Adversarial Learning Networks: Source-free Unsupervised Domain Incremental Learning [0.0]
In a non-stationary environment, updating a DNN model requires parameter re-training or model fine-tuning.
We propose an unsupervised source-free method to update DNN classification models.
Unlike existing methods, our approach can update a DNN model incrementally for non-stationary source and target tasks without storing past training data.
arXiv Detail & Related papers (2023-01-28T02:16:13Z)
- Boosted Dynamic Neural Networks [53.559833501288146]
A typical early-exiting dynamic neural network (EDNN) has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently at the two phases causes a mismatch between training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
arXiv Detail & Related papers (2022-11-30T04:23:12Z)
- A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models [11.818509522227565]
MetaData Normalization (MDN) estimates the linear relationship between the metadata and each feature based on a non-trainable closed-form solution.
We extend the MDN method by applying a Penalty approach (referred to as PMDN).
We show improvement in model accuracy and greater independence from confounders using PMDN over MDN in a synthetic experiment and a multi-label, multi-site dataset of magnetic resonance images (MRIs).
arXiv Detail & Related papers (2022-07-11T04:02:12Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test time adaptation; however, they each introduce additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Generalizing Neural Networks by Reflecting Deviating Data in Production [15.498447555957773]
We present a runtime approach that mitigates DNN mis-predictions caused by unexpected runtime inputs to the DNN.
We use a distribution analyzer based on the distance metric learned by a Siamese network to identify "unseen" semantically-preserving inputs.
Our approach transforms those unexpected inputs into inputs from the training set that are identified as having similar semantics.
arXiv Detail & Related papers (2021-10-06T13:05:45Z)
- EARLIN: Early Out-of-Distribution Detection for Resource-efficient Collaborative Inference [4.826988182025783]
Collaborative inference enables resource-constrained edge devices to make inferences by uploading inputs to a server.
While this setup works cost-effectively for successful inferences, it severely underperforms when the model faces input samples on which the model was not trained.
We propose a novel lightweight OOD detection approach that mines important features from the shallow layers of a pretrained CNN model.
arXiv Detail & Related papers (2021-06-25T18:43:23Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)