LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought
- URL: http://arxiv.org/abs/2505.15657v1
- Date: Wed, 21 May 2025 15:32:42 GMT
- Title: LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought
- Authors: Cheng Yan, Felix Mohr, Tom Viering,
- Abstract summary: We show that learning curves are less often well-behaved than previously thought.<n>Using statistically rigorous methods, we observe significant ill-behavior in approximately 14% of the learning curves.<n>We identify which learners are to blame and show that specific learners are more ill-behaved than others.
- Score: 11.282804463462165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e. improving with more data) and convex. By constructing the Learning Curves Database 1.1 (LCDB 1.1), a large-scale database with high-resolution learning curves, we show that learning curves are less often well-behaved than previously thought. Using statistically rigorous methods, we observe significant ill-behavior in approximately 14% of the learning curves, almost twice as much as in previous estimates. We also identify which learners are to blame and show that specific learners are more ill-behaved than others. Additionally, we demonstrate that different feature scalings rarely resolve ill-behavior. We evaluate the impact of ill-behavior on downstream tasks, such as learning curve fitting and model selection, and find it poses significant challenges, underscoring the relevance and potential of LCDB 1.1 as a challenging benchmark for future research.
Related papers
- Learning What Matters: Prioritized Concept Learning via Relative Error-driven Sample Selection [38.35524024887503]
We propose PRioritized cOncept learninG via Relative Error-driven Sample Selection (PROGRESS)<n>PROGRESS is a data- and compute-efficient framework that enables vision-language models to dynamically select what to learn next.<n>We show that PROGRESS consistently outperforms state-of-the-art baselines with much less data and supervision.
arXiv Detail & Related papers (2025-06-01T17:05:35Z) - Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning [69.64809103333839]
We investigate how explicitly modeling problem's difficulty prior information shapes the effectiveness of reinforcement learning based fine-tuning for multimodal reasoning.<n>Our approach demonstrates significant performances across various multi-modal mathematical reasoning benchmarks with only 2K+0.6K two-stage training data.
arXiv Detail & Related papers (2025-05-19T15:43:10Z) - What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [93.90047628101155]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.<n>To address this, some methods propose replaying data from previous tasks during new task learning.<n>However, it is not expected in practice due to memory constraints and data privacy issues.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Towards Causal Deep Learning for Vulnerability Detection [31.59558109518435]
We introduce do calculus based causal learning to software engineering models.
Our results show that CausalVul consistently improved the model accuracy, robustness and OOD performance.
arXiv Detail & Related papers (2023-10-12T00:51:06Z) - Primal Dual Continual Learning: Balancing Stability and Plasticity through Adaptive Memory Allocation [86.8475564814154]
We show that it is both possible and beneficial to undertake the constrained optimization problem directly.
We focus on memory-based methods, where a small subset of samples from previous tasks can be stored in a replay buffer.
We show that dual variables indicate the sensitivity of the optimal value of the continual learning problem with respect to constraint perturbations.
arXiv Detail & Related papers (2023-09-29T21:23:27Z) - Time Series Contrastive Learning with Information-Aware Augmentations [57.45139904366001]
A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples.
How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.
We propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning.
arXiv Detail & Related papers (2023-03-21T15:02:50Z) - Estimation of Predictive Performance in High-Dimensional Data Settings
using Learning Curves [0.0]
Learn2Evaluate is based on learning curves by fitting a smooth monotone curve depicting test performance as a function of the sample size.
The benefits of Learn2Evaluate are illustrated by a simulation study and applications to omics data.
arXiv Detail & Related papers (2022-06-08T11:48:01Z) - Mind Your Outliers! Investigating the Negative Impact of Outliers on
Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z) - The Shape of Learning Curves: a Review [14.764764847928259]
This review recounts the origins of the term, provides a formal definition of the learning curve, and briefly covers basics such as its estimation.
We discuss empirical and theoretical evidence that supports well-behaved curves that often have the shape of a power law or an exponential.
We draw specific attention to examples of learning curves that are ill-behaved, showing worse learning performance with more training data.
arXiv Detail & Related papers (2021-03-19T17:56:33Z) - Bayesian Meta-Prior Learning Using Empirical Bayes [3.666114237131823]
We propose a hierarchical Empirical Bayes approach that addresses the absence of informative priors, and the inability to control parameter learning rates.
Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features.
Our findings are promising, as optimizing over sparse data is often a challenge.
arXiv Detail & Related papers (2020-02-04T05:08:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.