Learning Curves for Analysis of Deep Networks
- URL: http://arxiv.org/abs/2010.11029v2
- Date: Mon, 5 Apr 2021 17:01:02 GMT
- Title: Learning Curves for Analysis of Deep Networks
- Authors: Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal M. Shlapentokh-Rothman
- Abstract summary: Learning curves can be used to select model parameters and extrapolate performance.
We propose a method to robustly estimate learning curves, abstract their parameters into error and data-reliance, and evaluate the effectiveness of different parameterizations.
- Score: 23.968036672913392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning curves model a classifier's test error as a function of the number
of training samples. Prior works show that learning curves can be used to
select model parameters and extrapolate performance. We investigate how to use
learning curves to evaluate design choices, such as pretraining, architecture,
and data augmentation. We propose a method to robustly estimate learning
curves, abstract their parameters into error and data-reliance, and evaluate
the effectiveness of different parameterizations. Our experiments exemplify the
use of learning curves for analysis and yield several interesting observations.
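As a concrete illustration, here is a minimal sketch of the kind of fit involved, assuming a power-law parameterization and toy error measurements; the "data-reliance" proxy below is a simplified reading of the abstract, not the paper's exact definition.

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch: fit err(n) = e_inf + a * n**(-b) to (training size, test error)
# pairs. The toy measurements below are assumptions for illustration.
def err_curve(n, e_inf, a, b):
    return e_inf + a * np.power(n, -b)

n = np.array([250, 500, 1000, 2000, 4000, 8000], dtype=float)
e = np.array([0.51, 0.44, 0.38, 0.33, 0.30, 0.27])

(e_inf, a, b), _ = curve_fit(err_curve, n, e, p0=(0.1, 2.0, 0.5),
                             bounds=([0, 0, 0], [1, np.inf, 2]))

print(f"estimated asymptotic error: {e_inf:.3f}")
# Crude "data-reliance" proxy: error reduction from 1k to 8k samples.
print(f"data-reliance proxy: "
      f"{err_curve(1000, e_inf, a, b) - err_curve(8000, e_inf, a, b):.3f}")
```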
Related papers
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
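A minimal sketch of the gradient-projection idea, assuming the protected directions are available as an orthonormal basis (names and shapes below are illustrative, not the paper's API):

```python
import numpy as np

# Gradients computed on the forget set are projected onto the subspace
# orthogonal to directions that matter for the retained data, so the
# update removes forget-set knowledge while approximately preserving
# the rest.

def project_out(grad: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the components of `grad` that lie in span(basis).

    basis: (d, k) matrix whose orthonormal columns span directions
           important for the retained data (e.g., top singular vectors
           of retained-data gradients).
    """
    return grad - basis @ (basis.T @ grad)

rng = np.random.default_rng(0)
d, k = 10, 3
# Orthonormal basis for the "protected" subspace (via QR for the demo).
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))
grad_forget = rng.normal(size=d)

update = project_out(grad_forget, basis)
# The projected update is orthogonal to every protected direction.
print(np.allclose(basis.T @ update, 0.0))  # True
```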
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Strategies and impact of learning curve estimation for CNN-based image classification [0.2678472239880052]
Learning curves are a measure of how the performance of machine learning models improves with a given volume of training data.
Across a wide variety of applications and models, learning curves have been observed to largely follow a power-law behavior.
By estimating a model's learning curve from training on small subsets of the data, only the best models need to be considered for training on the full dataset.
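A hedged sketch of that selection strategy, assuming a power-law curve form and hypothetical subset-error measurements: fit each candidate's curve on small subsets, then rank candidates by their extrapolated error at the full dataset size.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, e_inf, a, b):
    # error(n) = asymptotic error + a * n^(-b)
    return e_inf + a * np.power(n, -b)

subset_sizes = np.array([500, 1000, 2000, 4000, 8000], dtype=float)
# Hypothetical test errors measured for two candidate models.
measured = {
    "model_A": np.array([0.42, 0.36, 0.31, 0.27, 0.24]),
    "model_B": np.array([0.38, 0.34, 0.31, 0.29, 0.27]),
}

full_size = 100_000
for name, errs in measured.items():
    params, _ = curve_fit(power_law, subset_sizes, errs,
                          p0=(0.1, 1.0, 0.5),
                          bounds=([0, 0, 0], [1, np.inf, 2]))
    print(name, "predicted error at full size:",
          round(power_law(full_size, *params), 3))
```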
arXiv Detail & Related papers (2023-10-12T16:28:25Z)
- Leveraging Angular Information Between Feature and Classifier for Long-tailed Learning: A Prediction Reformulation Approach [90.77858044524544]
We reformulate the recognition probabilities through included angles without re-balancing the classifier weights.
Inspired by the performance improvement from this predictive reformulation, we explore the properties of angular prediction.
Our method achieves the best performance among peer methods on CIFAR10/100-LT and ImageNet-LT without pretraining.
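A generic sketch of angle-based prediction (a standard cosine classifier, not necessarily the paper's exact formulation): class scores come from the included angle between the feature and each classifier weight, so score magnitudes are not skewed by imbalanced weight norms.

```python
import numpy as np

def angular_logits(feat: np.ndarray, W: np.ndarray, scale: float = 16.0):
    # Normalize both the feature and each class weight, then score each
    # class by scale * cos(angle between feature and weight).
    f = feat / np.linalg.norm(feat)
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    return scale * (Wn @ f)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 64))   # 5 classes, 64-d features (toy values)
feat = rng.normal(size=64)

logits = angular_logits(feat, W)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.round(3))
```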
arXiv Detail & Related papers (2022-12-03T07:52:48Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
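A toy sketch of the checkpoint-dataset construction, with simulated training runs standing in for real ones: each saved parameter set is flattened into a vector and paired with the loss it achieved.

```python
import numpy as np

def flatten_params(params: list) -> np.ndarray:
    # Concatenate all parameter tensors into one flat vector.
    return np.concatenate([p.ravel() for p in params])

rng = np.random.default_rng(0)
dataset = []
for run in range(3):                  # a few simulated training runs
    params = [rng.normal(size=(8, 4)), rng.normal(size=4)]
    for step in range(5):             # checkpoints along each run
        params = [p - 0.1 * rng.normal(size=p.shape) for p in params]
        loss = float(np.exp(-0.3 * step) + 0.05 * rng.random())
        dataset.append((flatten_params(params), loss))

# (parameter vector, achieved loss) pairs for a generative model.
print(len(dataset), "checkpoints,", dataset[0][0].shape, "params each")
```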
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves [0.0]
Learn2Evaluate is based on learning curves: it fits a smooth monotone curve depicting test performance as a function of the sample size.
The benefits of Learn2Evaluate are illustrated by a simulation study and applications to omics data.
arXiv Detail & Related papers (2022-06-08T11:48:01Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
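For background, a sketch of plain linear CCA on synthetic two-view data (sklearn's CCA here is the classical linear method, not the paper's deep, input-dependent variant):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))            # shared signal across views
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
Y = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))

# Find linear projections of X and Y that are maximally correlated.
cca = CCA(n_components=2)
Xc, Yc = cca.fit_transform(X, Y)

# Correlation of the paired canonical variates should be high.
for i in range(2):
    r = np.corrcoef(Xc[:, i], Yc[:, i])[0, 1]
    print(f"canonical correlation {i}: {r:.2f}")
```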
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Learning Curves for Decision Making in Supervised Machine Learning -- A Survey [9.994200032442413]
Learning curves are a concept from the social sciences that has been adopted in the context of machine learning.
We contribute a framework that categorizes learning curve approaches using three criteria: the decision situation that they address, the intrinsic learning curve question that they answer and the type of resources that they use.
arXiv Detail & Related papers (2022-01-28T14:34:32Z)
- Learning to Refit for Convex Learning Problems [11.464758257681197]
We propose a framework to learn to estimate optimized model parameters for different training sets using neural networks.
We rigorously characterize the power of neural networks to approximate convex problems.
arXiv Detail & Related papers (2021-11-24T15:28:50Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model that takes features as input and outputs predicted labels; 2) a graph neural network as an upper model that learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
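A hedged sketch of the feature-data graph idea: with data points as rows and features as columns, one mean-aggregation message-passing step lets even a previously unseen feature inherit an embedding from the data points in which it appears. The incidence matrix and aggregation rule below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_feat, dim = 6, 4, 8
# Bipartite incidence matrix: which features appear in which data points.
X = (rng.random((n_data, n_feat)) > 0.5).astype(float)
data_emb = rng.normal(size=(n_data, dim))     # embeddings of data nodes

# Feature -> data adjacency is X.T; normalize by each feature's degree.
deg = X.sum(axis=0, keepdims=True).T.clip(min=1.0)
feat_emb = (X.T @ data_emb) / deg             # one message-passing step

# A new feature observed in data points {0, 2} gets an embedding the same way.
new_col = np.zeros(n_data)
new_col[[0, 2]] = 1.0
new_emb = (new_col @ data_emb) / new_col.sum()
print(feat_emb.shape, new_emb.shape)
```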
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- The Shape of Learning Curves: a Review [14.764764847928259]
This review recounts the origins of the term, provides a formal definition of the learning curve, and briefly covers basics such as its estimation.
We discuss empirical and theoretical evidence that supports well-behaved curves that often have the shape of a power law or an exponential.
We draw specific attention to examples of learning curves that are ill-behaved, showing worse learning performance with more training data.
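A small sketch contrasting the two well-behaved shapes the review highlights, fitting a power law and an exponential to the same toy error measurements; which family fits better is an empirical question.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, c, a, b):
    return c + a * np.power(n, -b)

def exponential(n, c, a, b):
    return c + a * np.exp(-b * n)

n = np.array([100., 200., 400., 800., 1600., 3200.])
err = np.array([0.40, 0.33, 0.28, 0.24, 0.21, 0.19])  # toy data

fits = [
    ("power law", power_law, (0.1, 2.0, 0.5), ([0, 0, 0], [1, 10, 2])),
    ("exponential", exponential, (0.1, 0.5, 0.001), ([0, 0, 0], [1, 10, 1])),
]
for name, f, p0, bounds in fits:
    params, _ = curve_fit(f, n, err, p0=p0, bounds=bounds, maxfev=20000)
    rss = float(np.sum((f(n, *params) - err) ** 2))
    print(f"{name}: residual sum of squares = {rss:.5f}")
```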
arXiv Detail & Related papers (2021-03-19T17:56:33Z)
- Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
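A hedged sketch of the distillation idea in miniature: approximate a model's one-dimensional response with a piecewise-linear curve and emit it as plain, human-readable Python. Fixed, evenly spaced breakpoints are a simplifying assumption; the paper's algorithm chooses breakpoints for quality and efficiency.

```python
import numpy as np

def emit_code(knots, values):
    # Render the piecewise-linear fit as a plain Python function.
    lines = ["def model(x):"]
    for (k0, k1), (v0, v1) in zip(zip(knots, knots[1:]),
                                  zip(values, values[1:])):
        slope = (v1 - v0) / (k1 - k0)
        lines.append(f"    if x <= {k1:.2f}: "
                     f"return {v0:.4f} + {slope:.4f} * (x - {k0:.2f})")
    lines.append(f"    return {values[-1]:.4f}")
    return "\n".join(lines)

x = np.linspace(0.0, 10.0, 200)
y = np.log1p(x)                       # stand-in for a black-box response
knots = np.linspace(0.0, 10.0, 5)     # 4 linear segments
values = np.interp(knots, x, y)       # response sampled at each breakpoint

print(emit_code(knots, values))
```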
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.