A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance
- URL: http://arxiv.org/abs/2211.14061v1
- Date: Fri, 25 Nov 2022 12:36:52 GMT
- Title: A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance
- Authors: Marco Loog and Tom Viering
- Abstract summary: Plotting a learner's generalization performance against the training set size results in a so-called learning curve.
We make the (ideal) learning curve concept precise and briefly discuss the aforementioned usages of such curves.
The larger part of this survey's focus is on learning curves that show that more data does not necessarily lead to better generalization performance.
- Score: 15.236871820889345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Plotting a learner's generalization performance against the training set size
results in a so-called learning curve. This tool, providing insight into the
behavior of the learner, is also practically valuable for model selection,
predicting the effect of more training data, and reducing the computational
complexity of training. We set out to make the (ideal) learning curve concept
precise and briefly discuss the aforementioned usages of such curves. The
larger part of this survey's focus, however, is on learning curves that show
that more data does not necessarily lead to better generalization performance,
a result that seems surprising to many researchers in the field of artificial
intelligence. We point out the significance of these findings and conclude our
survey with an overview and discussion of open problems in this area that
warrant further theoretical and empirical investigation.
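
To make the learning-curve concept concrete, the following is a minimal sketch of how such a curve is typically estimated in practice. The use of scikit-learn, the SVC learner, and the digits dataset are illustrative assumptions, not part of the survey; each point of the curve is a cross-validated estimate of generalization performance at that training set size.

```python
# Minimal sketch: estimating an empirical learning curve.
# scikit-learn, SVC, and the digits dataset are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Evaluate the learner at increasing training set sizes; each point is
# cross-validated, giving a noisy estimate of the ideal learning curve.
train_sizes, _, test_scores = learning_curve(
    SVC(kernel="rbf"), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8), cv=5,
)

for n, s in zip(train_sizes, test_scores.mean(axis=1)):
    print(f"n = {n:4d}  mean CV accuracy = {s:.3f}")
```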
Related papers
- Learning to Abstain From Uninformative Data [20.132146513548843]
We study the problem of learning and acting under a general noisy generative process.
In this problem, the data distribution has a significant proportion of uninformative samples with high noise in the label.
We propose a novel approach to learning under these conditions via a loss inspired by selective learning theory.
arXiv Detail & Related papers (2023-09-25T15:55:55Z)
- An Expert's Guide to Training Physics-informed Neural Networks [5.198985210238479]
Physics-informed neural networks (PINNs) have been popularized as a deep learning framework.
PINNs can seamlessly synthesize observational data and partial differential equation (PDE) constraints.
We present a series of best practices that can significantly improve the training efficiency and overall accuracy of PINNs.
arXiv Detail & Related papers (2023-08-16T16:19:25Z)
- When Do Curricula Work in Federated Learning? [56.88941905240137]
We find that curriculum learning largely alleviates non-IIDness.
The more disparate the data distributions across clients, the more they benefit from curriculum learning.
We propose a novel client selection technique that benefits from the real-world disparity in the clients.
arXiv Detail & Related papers (2022-12-24T11:02:35Z)
- Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves [0.0]
Learn2Evaluate is based on learning curves: it fits a smooth monotone curve that depicts test performance as a function of the sample size.
The benefits of Learn2Evaluate are illustrated by a simulation study and applications to omics data; a generic sketch of the curve-fitting idea follows this entry.
arXiv Detail & Related papers (2022-06-08T11:48:01Z)
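
The following is a generic sketch of the curve-fitting idea behind Learn2Evaluate; the power-law curve family, the AUC numbers, and the use of scipy's curve_fit are illustrative assumptions, and the paper's exact curve family and fitting procedure may differ.

```python
# Generic illustration (assumed form, not Learn2Evaluate's exact procedure):
# fit a smooth monotone curve to test performance measured at a few sample sizes.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # Monotonically increasing in n for b, c > 0, saturating at a.
    return a - b * n ** (-c)

# Hypothetical measurements: test AUC at increasing training set sizes.
n_obs = np.array([25, 50, 100, 200, 400])
auc_obs = np.array([0.61, 0.68, 0.74, 0.78, 0.81])

params, _ = curve_fit(power_law, n_obs, auc_obs, p0=(0.9, 1.0, 0.5), maxfev=10000)
print("predicted AUC at n = 1000:", power_law(1000, *params))
```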
- Improved Fine-tuning by Leveraging Pre-training Data: Theory and Practice [52.11183787786718]
Fine-tuning a pre-trained model on the target data is widely used in many deep learning applications.
Recent studies have empirically shown that training from scratch can achieve final performance no worse than this pre-training strategy.
We propose a novel selection strategy to select a subset from pre-training data to help improve the generalization on the target task.
arXiv Detail & Related papers (2021-11-24T06:18:32Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- The Shape of Learning Curves: a Review [14.764764847928259]
This review recounts the origins of the term, provides a formal definition of the learning curve, and briefly covers basics such as its estimation.
We discuss empirical and theoretical evidence that supports well-behaved curves that often have the shape of a power law or an exponential.
We draw specific attention to examples of learning curves that are ill-behaved, showing worse learning performance with more training data; the two well-behaved parametric shapes are sketched after this entry.
arXiv Detail & Related papers (2021-03-19T17:56:33Z)
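
As a companion to the review's discussion, the following sketches the two well-behaved parametric shapes it refers to; the parameter values are illustrative assumptions. Ill-behaved curves are precisely those that such monotone decay models fail to describe.

```python
# The two well-behaved shapes from the review, as parametric forms.
# Parameter values (a, b, c) below are illustrative assumptions.
import numpy as np

def power_law_error(n, a, b, c):
    # Error decays polynomially with training set size n.
    return a + b * n ** (-c)

def exponential_error(n, a, b, c):
    # Error decays exponentially fast, observed in some settings.
    return a + b * np.exp(-c * n)

n = np.array([10, 100, 1000])
print(power_law_error(n, 0.05, 1.0, 0.5))     # slow, heavy-tailed decay
print(exponential_error(n, 0.05, 1.0, 0.01))  # rapid saturation
```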
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate observational data collected offline, which is often abundantly available in practice, to improve sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that, compared to data augmentation, feature averaging reduces generalization error when used with convex losses and tightens PAC-Bayes bounds; a minimal sketch of feature averaging follows this entry.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
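
The following is a minimal sketch of the feature-averaging operation that the paper compares against data augmentation; the toy model, the transformation set, and the averaging over model outputs are illustrative assumptions rather than the paper's construction.

```python
# Minimal sketch of feature averaging over a set of transformations.
# The model, transforms, and data are illustrative assumptions.
import numpy as np

def feature_average(model, x, transforms):
    """Average the model's output over transformed copies of the input."""
    outputs = [model(t(x)) for t in transforms]
    return np.mean(outputs, axis=0)

# Toy example: a linear "model" averaged over a vector and its reversal.
model = lambda v: v @ np.arange(4.0)
transforms = [lambda v: v, lambda v: v[::-1]]
x = np.array([1.0, 2.0, 3.0, 4.0])
print(feature_average(model, x, transforms))
```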