Related papers: Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks

Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks

URL: http://arxiv.org/abs/2210.13186v1
Date: Fri, 21 Oct 2022 02:11:38 GMT
Title: Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks
Authors: Minsu Kim, Youngjoon Yu, Sungjune Park, Yong Man Ro
Abstract summary: We introduce a novel approach that allows end-users to exploit pretrained DNN models in their own testing environment without modifying the models. We present a textitmeta input which is an additional input transforming the distribution of testing data to be aligned with that of training data. As a result, end-users can exploit well-trained models in their own testing environment which can differ from the training environment.
Score: 29.975937981538664
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: These days, although deep neural networks (DNNs) have achieved a noticeable progress in a wide range of research area, it lacks the adaptability to be employed in the real-world applications because of the environment discrepancy problem. Such a problem originates from the difference between training and testing environments, and it is widely known that it causes serious performance degradation, when a pretrained DNN model is applied to a new testing environment. Therefore, in this paper, we introduce a novel approach that allows end-users to exploit pretrained DNN models in their own testing environment without modifying the models. To this end, we present a \textit{meta input} which is an additional input transforming the distribution of testing data to be aligned with that of training data. The proposed meta input can be optimized with a small number of testing data only by considering the relation between testing input data and its output prediction. Also, it does not require any knowledge of the network's internal architecture and modification of its weight parameters. Then, the obtained meta input is added to testing data in order to shift the distribution of testing data to that of originally used training data. As a result, end-users can exploit well-trained models in their own testing environment which can differ from the training environment. We validate the effectiveness and versatility of the proposed meta input by showing the robustness against the environment discrepancy through the comprehensive experiments with various tasks.

Related papers

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning [119.70303730341938]
We propose ePisode cUrriculum inveRsion (ECI) during data-free meta training and invErsion calibRation following inner loop (ICFIL) during meta testing. ECI adaptively increases the difficulty level of pseudo episodes according to the real-time feedback of the meta model. We formulate the optimization process of meta training with ECI as an adversarial form in an end-to-end manner.
arXiv Detail & Related papers (2023-03-20T15:10:41Z)
Adversarial Learning Networks: Source-free Unsupervised Domain Incremental Learning [0.0]
In a non-stationary environment, updating a DNN model requires parameter re-training or model fine-tuning. We propose an unsupervised source-free method to update DNN classification models. Unlike existing methods, our approach can update a DNN model incrementally for non-stationary source and target tasks without storing past training data.
arXiv Detail & Related papers (2023-01-28T02:16:13Z)
Boosted Dynamic Neural Networks [53.559833501288146]
A typical EDNN has multiple prediction heads at different layers of the network backbone. To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data. Treating training and testing inputs differently at the two phases will cause the mismatch between training and testing data distributions. We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
arXiv Detail & Related papers (2022-11-30T04:23:12Z)
A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models [11.818509522227565]
MetaData Normalization (MDN) estimates the linear relationship between the metadata and each feature based on a non-trainable closed-form solution. We extend the MDN method by applying a Penalty approach (referred to as PDMN) We show improvement in model accuracy and greater independence from confounders using PMDN over MDN in a synthetic experiment and a multi-label, multi-site dataset of magnetic resonance images (MRIs)
arXiv Detail & Related papers (2022-07-11T04:02:12Z)
CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time. We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness. Recent prior works have proposed methods for test time adaptation, however, they each introduce additional assumptions. We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
Generalizing Neural Networks by Reflecting Deviating Data in Production [15.498447555957773]
We present a runtime approach that mitigates DNN mis-predictions caused by unexpected runtime inputs to the DNN. We use a distribution analyzer based on the distance metric learned by a Siamese network to identify "unseen" semantically-preserving inputs. Our approach transforms those unexpected inputs into inputs from the training set that are identified as having similar semantics.
arXiv Detail & Related papers (2021-10-06T13:05:45Z)
EARLIN: Early Out-of-Distribution Detection for Resource-efficient Collaborative Inference [4.826988182025783]
Collaborative inference enables resource-constrained edge devices to make inferences by uploading inputs to a server. While this setup works cost-effectively for successful inferences, it severely underperforms when the model faces input samples on which the model was not trained. We propose a novel lightweight OOD detection approach that mines important features from the shallow layers of a pretrained CNN model.
arXiv Detail & Related papers (2021-06-25T18:43:23Z)
ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data. The idea is to estimate the metrics of interest for a model-under-test using Bayesian neural network (BNN)
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples. We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries. We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.