Absolute convergence and error thresholds in non-active adaptive
sampling
- URL: http://arxiv.org/abs/2402.02522v1
- Date: Sun, 4 Feb 2024 15:10:34 GMT
- Title: Absolute convergence and error thresholds in non-active adaptive
sampling
- Authors: Manuel Vilares Ferro, Victor M. Darriba Bilbao, Jesús Vilares Ferro
- Abstract summary: Non-active adaptive sampling is a way of building machine learning models from a training data base.
A proposal for calculating absolute convergence and error thresholds is described.
Tests meet our expectations and illustrate the proposal in the domain of natural language processing.
- Score: 0.27624021966289597
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Non-active adaptive sampling is a way of building machine learning models
from a training data base, which are supposed to dynamically and automatically
derive a guaranteed sample size. In this context, and regardless of the strategy
used in both the scheduling and the generation of weak predictors, a proposal for
calculating absolute convergence and error thresholds is described. It not only
makes it possible to establish when the quality of the model no longer
increases, but also supplies a proximity condition to estimate in absolute
terms how close it is to achieving such a goal, thus supporting decision making
for fine-tuning learning parameters in model selection. The technique proves
its correctness and completeness with respect to our working hypotheses, in
addition to strengthening the robustness of the sampling scheme. Tests meet our
expectations and illustrate the proposal in the domain of natural language
processing, taking the generation of part-of-speech taggers as case study.
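The idea of an absolute convergence threshold with a proximity condition can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes, purely for illustration, that accuracy follows a power-law learning curve observed at geometrically growing sample sizes, in which case Aitken's delta-squared extrapolation recovers the asymptote. The names aitken_asymptote, first_converged_level, and tau, as well as the synthetic curve, are hypothetical.

```python
def aitken_asymptote(a0, a1, a2):
    """Estimate the limit of a geometrically converging sequence
    from three consecutive terms (Aitken's delta-squared)."""
    d1, d2 = a1 - a0, a2 - a1
    denom = d2 - d1
    if denom == 0:
        # Flat sequence: the limit is the value itself.
        # Constant non-zero increments: no finite limit can be estimated.
        return a2 if d2 == 0 else None
    return a2 - d2 * d2 / denom


def first_converged_level(accuracies, tau):
    """Return (index, estimated_limit) for the first observation whose
    estimated distance to the asymptote is at most tau, or None."""
    for k in range(2, len(accuracies)):
        limit = aitken_asymptote(accuracies[k - 2], accuracies[k - 1], accuracies[k])
        if limit is None:
            continue
        gap = limit - accuracies[k]  # proximity to the estimated asymptote
        if 0 <= gap <= tau:
            return k, limit
    return None


# Synthetic learning curve (assumption): acc(n) = 0.97 - 0.5 * n^(-0.5),
# sampled at sizes that double at every step.
sizes = [1000 * 2 ** k for k in range(8)]
curve = [0.97 - 0.5 * n ** -0.5 for n in sizes]

result = first_converged_level(curve, tau=0.005)
if result is not None:
    k, limit = result
    print(f"stop at n={sizes[k]}, estimated asymptote={limit:.4f}")
```

On real, noisy learning curves a three-point extrapolation like this can be unstable, so smoothing or a fitted functional model would likely be needed before applying any threshold.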
Related papers
- Adaptive scheduling for adaptive sampling in POS taggers construction [0.27624021966289597]
We introduce an adaptive scheduling for adaptive sampling as a novel way of machine learning in the construction of part-of-speech taggers.
We analyze the shape of the learning curve geometrically in conjunction with a functional model to increase or decrease it at any time.
We also improve the robustness of sampling by paying greater attention to those regions of the training data base subject to a temporary inflation in performance.
arXiv Detail & Related papers (2024-02-04T15:02:17Z)
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Gradient and Uncertainty Enhanced Sequential Sampling for Global Fit [0.0]
This paper proposes a new sampling strategy for global fit called Gradient and Uncertainty Enhanced Sequential Sampling (GUESS).
We show that GUESS achieved on average the highest sample efficiency compared to other surrogate-based strategies on the tested examples.
arXiv Detail & Related papers (2023-09-29T19:49:39Z)
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
- Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning [21.931580762349096]
We introduce an algorithm that computes an approximately-value-equivalent, lossy compression of the environment which an agent may feasibly target in lieu of the true model.
We prove an information-theoretic, Bayesian regret bound for our algorithm that holds for any finite-horizon, episodic sequential decision-making problem.
arXiv Detail & Related papers (2022-06-04T23:36:38Z)
- Self-Normalized Importance Sampling for Neural Language Modeling [97.96857871187052]
In this work, we propose self-normalized importance sampling. Compared to our previous work, the criteria considered in this work are self-normalized and there is no need to further conduct a correction step.
We show that our proposed self-normalized importance sampling is competitive in both research-oriented and production-oriented automatic speech recognition tasks.
arXiv Detail & Related papers (2021-11-11T16:57:53Z)
- Calibrating Over-Parametrized Simulation Models: A Framework via Eligibility Set [3.862247454265944]
We develop a framework for building calibration schemes that satisfy rigorous frequentist statistical guarantees.
We demonstrate our methodology on several numerical examples, including an application to calibration of a limit order book market simulator.
arXiv Detail & Related papers (2021-05-27T00:59:29Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Efficient Ensemble Model Generation for Uncertainty Estimation with Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.