Sequential Transfer Machine Learning in Networks: Measuring the Impact
of Data and Neural Net Similarity on Transferability
- URL: http://arxiv.org/abs/2003.13070v1
- Date: Sun, 29 Mar 2020 16:41:15 GMT
- Title: Sequential Transfer Machine Learning in Networks: Measuring the Impact
of Data and Neural Net Similarity on Transferability
- Authors: Robin Hirt, Akash Srivastava, Carlos Berg and Niklas Kühl
- Abstract summary: In networks of independent entities that face similar predictive tasks, transfer machine learning enables the re-use and improvement of neural nets.
We perform an empirical study on a real-world use case comprising sales data from six different restaurants.
We calculate potential indicators for transferability based on divergences of data, data projections and a novel metric for neural net similarity.
- Score: 4.626261940793027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In networks of independent entities that face similar predictive tasks,
transfer machine learning enables the re-use and improvement of neural nets using
distributed data sets without exposing raw data. As the number of data sets in
business networks grows and not every neural net transfer is successful, indicators
are needed for the impact of a transfer on the target performance, that is, its
transferability. We perform an empirical study on a unique real-world use case
comprising sales data from six different restaurants. We train and transfer neural
nets across these restaurant sales data and measure their transferability. Moreover,
we calculate potential indicators for transferability based on divergences of the
data, of data projections, and on a novel metric for neural net similarity. We obtain
significant negative correlations between transferability and the tested indicators.
Our findings make it possible to choose the transfer path based on these indicators,
which improves model performance while requiring fewer model transfers.
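To make the correlation analysis described above concrete, here is a minimal, hedged sketch in Python. The sales series, the Jensen-Shannon divergence as the data-divergence indicator, the cosine-based net similarity, and the transferability numbers are all hypothetical stand-ins, not the paper's actual metrics or results:

```python
# Minimal, self-contained sketch (assumed code, not the paper's implementation):
# compute a divergence-based transferability indicator between two entities'
# data and correlate indicators with measured transferability.
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical daily sales series for six restaurants (stand-ins for the
# non-public data used in the study).
sales = {f"restaurant_{i}": rng.gamma(shape=2.0 + i, scale=100.0, size=365)
         for i in range(6)}

def data_divergence(a, b, bins=30):
    """Jensen-Shannon divergence between histograms of two sales series."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(b, bins=bins, range=(lo, hi), density=True)
    return float(jensenshannon(p, q) ** 2)  # squared JS distance = divergence

def net_similarity(weights_a, weights_b):
    """Toy stand-in for a neural net similarity metric: cosine similarity of
    flattened weight vectors (not the paper's exact metric)."""
    wa = np.concatenate([w.ravel() for w in weights_a])
    wb = np.concatenate([w.ravel() for w in weights_b])
    return float(wa @ wb / (np.linalg.norm(wa) * np.linalg.norm(wb)))

# Suppose transferability is measured as the change in target performance
# after transferring a source net (placeholder numbers, not paper results).
pairs = [("restaurant_0", "restaurant_1"), ("restaurant_0", "restaurant_2"),
         ("restaurant_1", "restaurant_3"), ("restaurant_2", "restaurant_4")]
indicators = [data_divergence(sales[s], sales[t]) for s, t in pairs]
transferability = [0.12, 0.07, 0.03, 0.01]  # hypothetical values

r, p_value = pearsonr(indicators, transferability)
print(f"correlation between data divergence and transferability: {r:.2f} (p={p_value:.2f})")

# The same correlation analysis could be repeated for the projection-based
# and net-similarity indicators, e.g.:
w_a = [rng.normal(size=(8, 4)), rng.normal(size=(4,))]
w_b = [rng.normal(size=(8, 4)), rng.normal(size=(4,))]
print("toy net similarity:", round(net_similarity(w_a, w_b), 3))
```

In the paper's setting, the synthetic series would be replaced by the six restaurants' real sales data, and the placeholder transferability values by the measured change in target performance after each transfer.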
Related papers
- Features are fate: a theory of transfer learning in high-dimensional regression [23.840251319669907]
We show that when the target task is well represented by the feature space of the pre-trained model, transfer learning outperforms training from scratch.
For this model, we establish rigorously that when the feature space overlap between the source and target tasks is sufficiently strong, both linear transfer and fine-tuning improve performance.
arXiv Detail & Related papers (2024-10-10T17:58:26Z)
- Model-Based Inference and Experimental Design for Interference Using Partial Network Data [4.76518127830168]
We present a framework for the estimation and inference of treatment effect adjustments using partial network data.
We illustrate procedures to assign treatments using only partial network data.
We validate our approach using simulated experiments on observed graphs with applications to information diffusion in India and Malawi.
arXiv Detail & Related papers (2024-06-17T17:27:18Z)
- MAGDiff: Covariate Data Set Shift Detection via Activation Graphs of Deep Neural Networks [8.887179103071388]
We propose a new family of representations, called MAGDiff, that we extract from any given neural network classifier.
These representations are computed by comparing the activation graphs of the neural network for samples belonging to the training distribution and to the target distribution.
We show that our novel representations induce significant improvements over a state-of-the-art baseline relying on the network output.
arXiv Detail & Related papers (2023-05-22T17:34:47Z)
- Decomposing neural networks as mappings of correlation functions [57.52754806616669]
We study the mapping between probability distributions implemented by a deep feed-forward network.
We identify essential statistics in the data, as well as different information representations that can be used by neural networks.
arXiv Detail & Related papers (2022-02-10T09:30:31Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
- Probing transfer learning with a model of synthetic correlated datasets [11.53207294639557]
Transfer learning can significantly improve the sample efficiency of neural networks.
We re-think a solvable model of synthetic data as a framework for modeling correlation between data-sets.
We show that our model can capture a range of salient features of transfer learning with real data.
arXiv Detail & Related papers (2021-06-09T22:15:41Z)
- What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin in the loss landscape, and that different instances of such a model are similar in feature space and close in parameter space.
arXiv Detail & Related papers (2020-08-26T17:23:40Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
- The Utility of Feature Reuse: Transfer Learning in Data-Starved Regimes [6.419457653976053]
We describe a transfer learning use case for a domain with a data-starved regime.
We evaluate the effectiveness of convolutional feature extraction and fine-tuning (a sketch of these two strategies follows this entry).
We conclude that transfer learning enhances the performance of CNN architectures in data-starved regimes.
arXiv Detail & Related papers (2020-02-29T18:48:58Z)
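The last entry above compares two standard CNN transfer strategies. The following is a minimal sketch of both, assuming a torchvision ResNet-18 backbone, a hypothetical five-class target task, and generic optimizer settings; none of these details come from that paper:

```python
# Hedged sketch of convolutional feature extraction vs. fine-tuning
# (assumed setup, not the referenced paper's exact configuration).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # hypothetical number of target-task classes

def build_feature_extractor():
    """Freeze the pretrained backbone; train only a new classification head."""
    net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in net.parameters():
        p.requires_grad = False
    net.fc = nn.Linear(net.fc.in_features, NUM_CLASSES)  # new head stays trainable
    return net

def build_fine_tuned():
    """Keep the whole pretrained network trainable, typically at a lower learning rate."""
    net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    net.fc = nn.Linear(net.fc.in_features, NUM_CLASSES)
    return net

feat = build_feature_extractor()
trainable = [p for p in feat.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)  # only the new head is updated
print(sum(p.numel() for p in trainable), "trainable parameters (feature extraction)")
```

In a data-starved regime, freezing the backbone keeps the number of trainable parameters small, which is the usual motivation for preferring feature extraction when target data are scarce.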