A Survey on Stability of Learning with Limited Labelled Data and its Sensitivity to the Effects of Randomness
- URL: http://arxiv.org/abs/2312.01082v2
- Date: Tue, 3 Sep 2024 07:59:58 GMT
- Title: A Survey on Stability of Learning with Limited Labelled Data and its Sensitivity to the Effects of Randomness
- Authors: Branislav Pecher, Ivan Srba, Maria Bielikova
- Abstract summary: This survey provides a comprehensive overview of 415 papers addressing the effects of randomness on the stability of learning with limited labelled data.
We identify and discuss seven challenges and open problems together with possible directions to facilitate further research.
The ultimate goal of this survey is to emphasise the importance of this growing research area, which so far has not received an appropriate level of attention, and reveal impactful directions for future research.
- Score: 5.009377915313077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning with limited labelled data, such as prompting, in-context learning, fine-tuning, meta-learning or few-shot learning, aims to effectively train a model using only a small amount of labelled samples. However, these approaches have been observed to be excessively sensitive to the effects of uncontrolled randomness caused by non-determinism in the training process. The randomness negatively affects the stability of the models, leading to large variances in results across training runs. When such sensitivity is disregarded, it can unintentionally, but unfortunately also intentionally, create an imaginary perception of research progress. Recently, this area started to attract research attention and the number of relevant studies is continuously growing. In this survey, we provide a comprehensive overview of 415 papers addressing the effects of randomness on the stability of learning with limited labelled data. We distinguish between four main tasks addressed in the papers (investigate/evaluate; determine; mitigate; benchmark/compare/report randomness effects), providing findings for each one. Furthermore, we identify and discuss seven challenges and open problems together with possible directions to facilitate further research. The ultimate goal of this survey is to emphasise the importance of this growing research area, which so far has not received an appropriate level of attention, and reveal impactful directions for future research.
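To make the reported instability concrete, the following minimal sketch (illustrative only, not code from the survey or any surveyed paper; the toy dataset and classifier are stand-ins for any learning-with-limited-labels setup) trains the same model repeatedly on a small labelled subset, varying only the random seed, and reports the spread of test accuracy across runs.
```python
# A minimal, self-contained sketch (illustrative only, not code from the survey):
# the same limited-labelled-data setup is trained several times, varying only the
# random seed that governs which samples are labelled and how the model is
# initialised, and the spread of test accuracy across runs is reported.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy stand-in for a real task: a pool of labelled candidates plus a held-out test set.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10,
                           random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.5,
                                                  random_state=0)

N_LABELLED = 32          # the "limited labelled data" budget per run
scores = []
for seed in range(10):   # ten runs, identical except for the random seed
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X_pool), size=N_LABELLED, replace=False)
    model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                          random_state=seed)
    model.fit(X_pool[idx], y_pool[idx])
    scores.append(model.score(X_test, y_test))

scores = np.array(scores)
# A large std or range relative to the mean is the sensitivity to uncontrolled
# randomness that the surveyed papers investigate, mitigate and report.
print(f"accuracy: mean={scores.mean():.3f}  std={scores.std():.3f}  "
      f"range={scores.max() - scores.min():.3f}")
```
Reporting the spread across such repeated runs, rather than a single score, corresponds to the benchmark/compare/report task distinguished in the survey.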
Related papers
- Unsupervised Pairwise Causal Discovery on Heterogeneous Data using Mutual Information Measures [49.1574468325115]
Causal discovery is a technique that tackles the challenge of identifying cause-effect relations by analyzing the statistical properties of the constituent variables.
We question the current (possibly misleading) baseline results on the basis that they were obtained through supervised learning.
In consequence, we approach this problem in an unsupervised way, using robust Mutual Information measures.
arXiv Detail & Related papers (2024-08-01T09:11:08Z)
- On Sensitivity of Learning with Limited Labelled Data to the Effects of Randomness: Impact of Interactions and Systematic Choices [5.009377915313077]
We propose a method to investigate the effects of randomness factors while taking the interactions into consideration.
To measure the true effects of an individual randomness factor, our method mitigates the effects of other factors and observes how the performance varies across multiple runs.
Applying our method to multiple randomness factors across in-context learning and fine-tuning approaches on 7 representative text classification tasks, and to meta-learning on 3 tasks, we show that disregarding interactions between randomness factors in existing works caused inconsistent findings due to incorrect attribution of the effects of randomness factors, such as disproving the consistent sensitivity of in-context learning to sample order.
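A schematic sketch of the described procedure is given below (an interpretation, not the authors' implementation; the run_experiment function, the choice of factors and the seed ranges are placeholders): for each setting of the investigated factor, performance is averaged over several settings of the mitigated factors, and the remaining spread is attributed to the investigated factor.
```python
# A schematic sketch of the described idea (an interpretation, not the authors'
# code): the contribution of one randomness factor is estimated by averaging
# performance over several settings of the mitigated factors, then measuring the
# remaining variation across settings of the investigated factor.
import statistics

def run_experiment(sample_order_seed: int, init_seed: int) -> float:
    """Hypothetical training/evaluation run; replace with a real pipeline.
    Returns a performance score depending on both randomness factors."""
    # Deterministic toy stand-in so the sketch executes end to end.
    return 0.80 + 0.02 * ((sample_order_seed * 31 + init_seed * 7) % 5) / 5

investigated_seeds = range(5)   # factor under investigation, e.g. sample order
mitigated_seeds = range(5)      # other factors, e.g. model initialisation

# For each setting of the investigated factor, average out the mitigated factors.
per_setting = [
    statistics.mean(run_experiment(s, m) for m in mitigated_seeds)
    for s in investigated_seeds
]

# The spread of the averaged scores approximates the factor's own contribution,
# separated from its interactions with the mitigated factors.
print(f"std attributable to sample order ~ {statistics.pstdev(per_setting):.4f}")
```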
arXiv Detail & Related papers (2024-02-20T08:38:19Z)
- Assumption violations in causal discovery and the robustness of score matching [38.60630271550033]
This paper extensively benchmarks the empirical performance of recent causal discovery methods on observational i.i.d. data.
We show that score matching-based methods demonstrate surprising performance in terms of the false positive and false negative rates of the inferred graph.
We hope this paper will set a new standard for the evaluation of causal discovery methods.
arXiv Detail & Related papers (2023-10-20T09:56:07Z)
- Causal Discovery and Counterfactual Explanations for Personalized Student Learning [0.0]
The study's main contributions include using causal discovery to identify causal predictors of student performance.
The results reveal the identified causal relationships, such as the influence of earlier test grades and mathematical ability on final student performance.
A major challenge remains, which is the real-time implementation and validation of counterfactual recommendations.
arXiv Detail & Related papers (2023-09-18T10:32:47Z)
- Leveraging Unlabelled Data in Multiple-Instance Learning Problems for Improved Detection of Parkinsonian Tremor in Free-Living Conditions [80.88681952022479]
We introduce a new method for combining semi-supervised with multiple-instance learning.
We show that by leveraging the unlabelled data of 454 subjects we can achieve large performance gains in per-subject tremor detection.
arXiv Detail & Related papers (2023-04-29T12:25:10Z)
- Valid Inference After Causal Discovery [73.87055989355737]
We develop tools for valid post-causal-discovery inference.
We show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates.
arXiv Detail & Related papers (2022-08-11T17:40:45Z)
- ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
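A generic sketch of this kind of acquisition function follows (an approximation of the idea, not the ALLSH implementation; the Gaussian perturbation and the entropy-based hardness term are illustrative stand-ins for the paper's locally generated copies and hardness measure).
```python
# A generic sketch of the idea (an approximation, not the ALLSH implementation):
# unlabelled points are scored by how much the model's prediction shifts under a
# small local perturbation (local sensitivity) plus how uncertain the prediction
# is (a hardness proxy); the highest-scoring points are queried for labels.
# Gaussian noise stands in for the locally generated copies used for text data.
import numpy as np

def acquisition_scores(model, X_unlabelled, rng, noise=0.05):
    p = model.predict_proba(X_unlabelled)                      # original inputs
    p_perturbed = model.predict_proba(
        X_unlabelled + rng.normal(0.0, noise, size=X_unlabelled.shape))
    eps = 1e-12
    # Local sensitivity: KL divergence between original and perturbed predictions.
    sensitivity = np.sum(p * (np.log(p + eps) - np.log(p_perturbed + eps)), axis=1)
    # Hardness proxy: predictive entropy of the original prediction.
    hardness = -np.sum(p * np.log(p + eps), axis=1)
    return sensitivity + hardness

# Hypothetical usage with a fitted probabilistic classifier `clf`:
# rng = np.random.default_rng(0)
# query_idx = np.argsort(-acquisition_scores(clf, X_unlabelled_pool, rng))[:16]
```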
arXiv Detail & Related papers (2022-05-10T15:39:11Z)
- Partial Identification of Dose Responses with Hidden Confounders [25.468473751289036]
Inferring causal effects of continuous-valued treatments from observational data is a crucial task.
We present novel methodology to bound both average and conditional average continuous-valued treatment-effect estimates.
We apply our method to a real-world observational case study to demonstrate the value of identifying dose-dependent causal effects.
arXiv Detail & Related papers (2022-04-24T07:02:21Z)
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Bayesian Active Learning for Wearable Stress and Affect Detection [0.7106986689736827]
Stress detection using on-device deep learning algorithms has been on the rise owing to advancements in pervasive computing.
In this paper, we propose a framework with capabilities to represent model uncertainties through approximations in Bayesian Neural Networks.
Our proposed framework achieves a considerable efficiency boost during inference, with a substantially low number of acquired pool points.
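A minimal sketch of this flavour of Bayesian active learning is shown below (a generic illustration, not the paper's framework; it assumes Monte Carlo dropout as the Bayesian approximation and a hypothetical stochastic_predict helper): pool points are ranked by the BALD mutual-information score computed from several stochastic forward passes.
```python
# A minimal sketch of Bayesian active learning (a generic illustration, not the
# paper's framework): predictive uncertainty is estimated from several stochastic
# forward passes of an approximate Bayesian neural network (e.g. Monte Carlo
# dropout), and pool points with the highest BALD mutual-information score are
# acquired for labelling.
import numpy as np

def bald_scores(mc_probs):
    """mc_probs: array of shape (T, N, C) from T stochastic forward passes."""
    eps = 1e-12
    mean_p = mc_probs.mean(axis=0)                                      # (N, C)
    predictive_entropy = -np.sum(mean_p * np.log(mean_p + eps), axis=1)
    expected_entropy = -np.mean(
        np.sum(mc_probs * np.log(mc_probs + eps), axis=2), axis=0)
    return predictive_entropy - expected_entropy   # mutual information per point

# Hypothetical usage with a dropout-enabled model and a `stochastic_predict`
# helper that keeps dropout active at inference time:
# mc = np.stack([stochastic_predict(model, X_pool) for _ in range(20)])
# query_idx = np.argsort(-bald_scores(mc))[:8]
```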
arXiv Detail & Related papers (2020-12-04T16:19:37Z)