Learning Curves for Analysis of Deep Networks
- URL: http://arxiv.org/abs/2010.11029v2
- Date: Mon, 5 Apr 2021 17:01:02 GMT
- Title: Learning Curves for Analysis of Deep Networks
- Authors: Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal M. Shlapentokh-Rothman
- Abstract summary: Learning curves can be used to select model parameters and extrapolate performance.
We propose a method to robustly estimate learning curves, abstract their parameters into error and data-reliance, and evaluate the effectiveness of different parameterizations.
- Score: 23.968036672913392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning curves model a classifier's test error as a function of the number
of training samples. Prior works show that learning curves can be used to
select model parameters and extrapolate performance. We investigate how to use
learning curves to evaluate design choices, such as pretraining, architecture,
and data augmentation. We propose a method to robustly estimate learning
curves, abstract their parameters into error and data-reliance, and evaluate
the effectiveness of different parameterizations. Our experiments exemplify the
use of learning curves for analysis and yield several interesting observations.
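As a concrete illustration, here is a minimal sketch of the kind of fit involved, assuming a power-law parameterization and toy error measurements; the "data-reliance" proxy below is a simplified reading of the abstract, not the paper's exact definition.

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch: fit err(n) = e_inf + a * n**(-b) to (training size, test error)
# pairs. The toy measurements below are assumptions for illustration.
def err_curve(n, e_inf, a, b):
    return e_inf + a * np.power(n, -b)

n = np.array([250, 500, 1000, 2000, 4000, 8000], dtype=float)
e = np.array([0.51, 0.44, 0.38, 0.33, 0.30, 0.27])

(e_inf, a, b), _ = curve_fit(err_curve, n, e, p0=(0.1, 2.0, 0.5),
                             bounds=([0, 0, 0], [1, np.inf, 2]))

print(f"estimated asymptotic error: {e_inf:.3f}")
# Crude "data-reliance" proxy: error reduction from 1k to 8k samples.
print(f"data-reliance proxy: "
      f"{err_curve(1000, e_inf, a, b) - err_curve(8000, e_inf, a, b):.3f}")
```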
Related papers
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
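A minimal sketch of the gradient-projection idea, assuming the protected directions are available as an orthonormal basis (names and shapes below are illustrative, not the paper's API):

```python
import numpy as np

# Gradients computed on the forget set are projected onto the subspace
# orthogonal to directions that matter for the retained data, so the
# update removes forget-set knowledge while approximately preserving
# the rest.

def project_out(grad: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the components of `grad` that lie in span(basis).

    basis: (d, k) matrix whose orthonormal columns span directions
           important for the retained data (e.g., top singular vectors
           of retained-data gradients).
    """
    return grad - basis @ (basis.T @ grad)

rng = np.random.default_rng(0)
d, k = 10, 3
# Orthonormal basis for the "protected" subspace (via QR for the demo).
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))
grad_forget = rng.normal(size=d)

update = project_out(grad_forget, basis)
# The projected update is orthogonal to every protected direction.
print(np.allclose(basis.T @ update, 0.0))  # True
```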
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Strategies and impact of learning curve estimation for CNN-based image classification [0.2678472239880052]
Learning curves are a measure of how the performance of machine learning models improves with a given volume of training data.
Across a wide variety of applications and models, learning curves have been observed to largely follow a power-law behavior.
By estimating a model's learning curve from training on small subsets of the data, only the best models need to be considered for training on the full dataset.
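A hedged sketch of that selection strategy, assuming a power-law curve form and hypothetical subset-error measurements: fit each candidate's curve on small subsets, then rank candidates by their extrapolated error at the full dataset size.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, e_inf, a, b):
    # error(n) = asymptotic error + a * n^(-b)
    return e_inf + a * np.power(n, -b)

subset_sizes = np.array([500, 1000, 2000, 4000, 8000], dtype=float)
# Hypothetical test errors measured for two candidate models.
measured = {
    "model_A": np.array([0.42, 0.36, 0.31, 0.27, 0.24]),
    "model_B": np.array([0.38, 0.34, 0.31, 0.29, 0.27]),
}

full_size = 100_000
for name, errs in measured.items():
    params, _ = curve_fit(power_law, subset_sizes, errs,
                          p0=(0.1, 1.0, 0.5),
                          bounds=([0, 0, 0], [1, np.inf, 2]))
    print(name, "predicted error at full size:",
          round(power_law(full_size, *params), 3))
```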
arXiv Detail & Related papers (2023-10-12T16:28:25Z)
- Leveraging Angular Information Between Feature and Classifier for Long-tailed Learning: A Prediction Reformulation Approach [90.77858044524544]
We reformulate the recognition probabilities through included angles without re-balancing the classifier weights.
Inspired by the performance improvement from this predictive reformulation, we explore the properties of angular prediction.
Our method achieves the best performance among peer methods on CIFAR10/100-LT and ImageNet-LT without pretraining.
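A generic sketch of angle-based prediction (a standard cosine classifier, not necessarily the paper's exact formulation): class scores come from the included angle between the feature and each classifier weight, so score magnitudes are not skewed by imbalanced weight norms.

```python
import numpy as np

def angular_logits(feat: np.ndarray, W: np.ndarray, scale: float = 16.0):
    # Normalize both the feature and each class weight, then score each
    # class by scale * cos(angle between feature and weight).
    f = feat / np.linalg.norm(feat)
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    return scale * (Wn @ f)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 64))   # 5 classes, 64-d features (toy values)
feat = rng.normal(size=64)

logits = angular_logits(feat, W)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.round(3))
```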
arXiv Detail & Related papers (2022-12-03T07:52:48Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
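A toy sketch of the checkpoint-dataset construction, with simulated training runs standing in for real ones: each saved parameter set is flattened into a vector and paired with the loss it achieved.

```python
import numpy as np

def flatten_params(params: list) -> np.ndarray:
    # Concatenate all parameter tensors into one flat vector.
    return np.concatenate([p.ravel() for p in params])

rng = np.random.default_rng(0)
dataset = []
for run in range(3):                  # a few simulated training runs
    params = [rng.normal(size=(8, 4)), rng.normal(size=4)]
    for step in range(5):             # checkpoints along each run
        params = [p - 0.1 * rng.normal(size=p.shape) for p in params]
        loss = float(np.exp(-0.3 * step) + 0.05 * rng.random())
        dataset.append((flatten_params(params), loss))

# (parameter vector, achieved loss) pairs for a generative model.
print(len(dataset), "checkpoints,", dataset[0][0].shape, "params each")
```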
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves [0.0]
Learn2Evaluate is based on learning curves: it fits a smooth monotone curve depicting test performance as a function of the sample size.
The benefits of Learn2Evaluate are illustrated by a simulation study and applications to omics data.
arXiv Detail & Related papers (2022-06-08T11:48:01Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
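For background, a sketch of plain linear CCA on synthetic two-view data (sklearn's CCA here is the classical linear method, not the paper's deep, input-dependent variant):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))            # shared signal across views
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
Y = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))

# Find linear projections of X and Y that are maximally correlated.
cca = CCA(n_components=2)
Xc, Yc = cca.fit_transform(X, Y)

# Correlation of the paired canonical variates should be high.
for i in range(2):
    r = np.corrcoef(Xc[:, i], Yc[:, i])[0, 1]
    print(f"canonical correlation {i}: {r:.2f}")
```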
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Learning Curves for Decision Making in Supervised Machine Learning -- A Survey [9.994200032442413]
Learning curves are a concept from the social sciences that has been adopted in the context of machine learning.
We contribute a framework that categorizes learning curve approaches using three criteria: the decision situation that they address, the intrinsic learning curve question that they answer and the type of resources that they use.
arXiv Detail & Related papers (2022-01-28T14:34:32Z)
- Learning to Refit for Convex Learning Problems [11.464758257681197]
We propose a framework to learn to estimate optimized model parameters for different training sets using neural networks.
We rigorously characterize the power of neural networks to approximate convex problems.
arXiv Detail & Related papers (2021-11-24T15:28:50Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model that takes features as input and outputs predicted labels; 2) a graph neural network as an upper model that learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
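A hedged sketch of the feature-data graph idea: with data points as rows and features as columns, one mean-aggregation message-passing step lets even a previously unseen feature inherit an embedding from the data points in which it appears. The incidence matrix and aggregation rule below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_feat, dim = 6, 4, 8
# Bipartite incidence matrix: which features appear in which data points.
X = (rng.random((n_data, n_feat)) > 0.5).astype(float)
data_emb = rng.normal(size=(n_data, dim))     # embeddings of data nodes

# Feature -> data adjacency is X.T; normalize by each feature's degree.
deg = X.sum(axis=0, keepdims=True).T.clip(min=1.0)
feat_emb = (X.T @ data_emb) / deg             # one message-passing step

# A new feature observed in data points {0, 2} gets an embedding the same way.
new_col = np.zeros(n_data)
new_col[[0, 2]] = 1.0
new_emb = (new_col @ data_emb) / new_col.sum()
print(feat_emb.shape, new_emb.shape)
```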
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- The Shape of Learning Curves: a Review [14.764764847928259]
This review recounts the origins of the term, provides a formal definition of the learning curve, and briefly covers basics such as its estimation.
We discuss empirical and theoretical evidence that supports well-behaved curves that often have the shape of a power law or an exponential.
We draw specific attention to examples of learning curves that are ill-behaved, showing worse learning performance with more training data.
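A small sketch contrasting the two well-behaved shapes the review highlights, fitting a power law and an exponential to the same toy error measurements; which family fits better is an empirical question.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, c, a, b):
    return c + a * np.power(n, -b)

def exponential(n, c, a, b):
    return c + a * np.exp(-b * n)

n = np.array([100., 200., 400., 800., 1600., 3200.])
err = np.array([0.40, 0.33, 0.28, 0.24, 0.21, 0.19])  # toy data

fits = [
    ("power law", power_law, (0.1, 2.0, 0.5), ([0, 0, 0], [1, 10, 2])),
    ("exponential", exponential, (0.1, 0.5, 0.001), ([0, 0, 0], [1, 10, 1])),
]
for name, f, p0, bounds in fits:
    params, _ = curve_fit(f, n, err, p0=p0, bounds=bounds, maxfev=20000)
    rss = float(np.sum((f(n, *params) - err) ** 2))
    print(f"{name}: residual sum of squares = {rss:.5f}")
```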
arXiv Detail & Related papers (2021-03-19T17:56:33Z)
- Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
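A hedged sketch of the distillation idea in miniature: approximate a model's one-dimensional response with a piecewise-linear curve and emit it as plain, human-readable Python. Fixed, evenly spaced breakpoints are a simplifying assumption; the paper's algorithm chooses breakpoints for quality and efficiency.

```python
import numpy as np

def emit_code(knots, values):
    # Render the piecewise-linear fit as a plain Python function.
    lines = ["def model(x):"]
    for (k0, k1), (v0, v1) in zip(zip(knots, knots[1:]),
                                  zip(values, values[1:])):
        slope = (v1 - v0) / (k1 - k0)
        lines.append(f"    if x <= {k1:.2f}: "
                     f"return {v0:.4f} + {slope:.4f} * (x - {k0:.2f})")
    lines.append(f"    return {values[-1]:.4f}")
    return "\n".join(lines)

x = np.linspace(0.0, 10.0, 200)
y = np.log1p(x)                       # stand-in for a black-box response
knots = np.linspace(0.0, 10.0, 5)     # 4 linear segments
values = np.interp(knots, x, y)       # response sampled at each breakpoint

print(emit_code(knots, values))
```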
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.