Early stopping by correlating online indicators in neural networks
- URL: http://arxiv.org/abs/2402.02513v1
- Date: Sun, 4 Feb 2024 14:57:20 GMT
- Title: Early stopping by correlating online indicators in neural networks
- Authors: Manuel Vilares Ferro, Yerai Doval Mosquera, Francisco J. Ribadas Pena,
Victor M. Darriba Bilbao
- Abstract summary: We propose a novel technique to identify overfitting phenomena when training the learner.
Our proposal exploits the correlation over time in a collection of online indicators.
As opposed to previous approaches focused on a single criterion, we take advantage of subsidiarities between independent assessments.
- Score: 0.24578723416255746
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In order to minimize the generalization error in neural networks, we
formally introduce a novel technique for identifying overfitting during
training. This supports a reliable and trustworthy early stopping condition,
thereby improving the predictive power of this type of model. Our proposal
exploits the correlation over time in a collection of online indicators,
namely characteristic functions indicating whether a set of hypotheses is met,
each associated with an independent stopping condition built from a canary
judgment that evaluates the presence of overfitting. In this way, we provide a
formal basis for deciding when to interrupt the learning process.
As opposed to previous approaches focused on a single criterion, we take
advantage of subsidiarities between independent assessments, seeking both a
wider operating range and greater diagnostic reliability. To illustrate the
effectiveness of the proposed halting condition, we work in the sphere of
natural language processing, an operational continuum increasingly based on
machine learning. As a case study, we focus on parser generation, one of the
most demanding and complex tasks in the domain. The selection of
cross-validation as the canary function enables a direct comparison with the
most representative early stopping conditions based on overfitting
identification, pointing to a promising start toward optimal bias and
variance control.
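The abstract's core idea, correlating several independent overfitting indicators over time rather than relying on a single criterion, can be sketched as follows. This is an illustrative approximation, not the paper's exact algorithm: the specific indicator functions (`no_improvement`, `rising_trend`, `generalization_loss`) and the quorum/persistence parameters are assumptions chosen for the example, with only the generalization-loss criterion echoing standard early stopping practice.

```python
def no_improvement(history, patience=5):
    """Indicator: best validation loss has not improved in the last `patience` epochs."""
    if len(history) <= patience:
        return False
    return min(history[-patience:]) > min(history[:-patience])

def rising_trend(history, window=5):
    """Indicator: validation loss strictly increased over the last `window` epochs."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return all(b > a for a, b in zip(recent, recent[1:]))

def generalization_loss(history, threshold=0.10):
    """Indicator: relative loss above the best value seen exceeds `threshold`."""
    best = min(history)
    return history[-1] / best - 1.0 > threshold

class CorrelatedEarlyStopping:
    """Stop only when at least `quorum` indicators fire for `persistence` epochs in a row,
    i.e. when independent assessments agree over time."""
    def __init__(self, indicators, quorum=2, persistence=3):
        self.indicators = indicators
        self.quorum = quorum
        self.persistence = persistence
        self.history = []
        self.streak = 0

    def update(self, val_loss):
        self.history.append(val_loss)
        fired = sum(ind(self.history) for ind in self.indicators)
        self.streak = self.streak + 1 if fired >= self.quorum else 0
        return self.streak >= self.persistence
```

In a training loop, `update` would be called once per epoch with the validation (canary) loss, and training would break as soon as it returns `True`; requiring agreement among several indicators over consecutive epochs is what gives the wider operating range and diagnostic reliability the abstract claims.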
Related papers
- Adaptive Cascading Network for Continual Test-Time Adaptation [12.718826132518577]
We study the problem of continual test-time adaptation, where the goal is to adapt a source pre-trained model to a sequence of unlabelled target domains at test time.
Existing methods on test-time training suffer from several limitations.
arXiv Detail & Related papers (2024-07-17T01:12:57Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Large Class Separation is not what you need for Relational Reasoning-based OOD Detection [12.578844450586]
Out-Of-Distribution (OOD) detection methods provide a solution by identifying semantic novelty.
Most of these methods leverage a learning stage on the known data, which means training (or fine-tuning) a model to capture the concept of normality.
A viable alternative is that of evaluating similarities in the embedding space produced by large pre-trained models without any further learning effort.
arXiv Detail & Related papers (2023-07-12T14:10:15Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
- Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders [16.193776814471768]
We study robust policy evaluation and policy optimization in the presence of sequentially-exogenous unobserved confounders.
We provide sample complexity bounds, insights, and show effectiveness both in simulations and on real-world longitudinal healthcare data of treating sepsis.
arXiv Detail & Related papers (2023-02-01T18:40:53Z)
- Deep Learning Methods for Proximal Inference via Maximum Moment Restriction [0.0]
We introduce a flexible and scalable method based on a deep neural network to estimate causal effects in the presence of unmeasured confounding.
Our method achieves state-of-the-art performance on two well-established proximal inference benchmarks.
arXiv Detail & Related papers (2022-05-19T19:51:42Z)
- Masked prediction tasks: a parameter identifiability view [49.533046139235466]
We focus on the widely used self-supervised learning method of predicting masked tokens.
We show that there is a rich landscape of possibilities, out of which some prediction tasks yield identifiability, while others do not.
arXiv Detail & Related papers (2022-02-18T17:09:32Z)
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie on the same subspace with small dimension.
Empirical evidence shows that the proposed algorithm clearly surpasses state-of-the-art approaches on multiple benchmarks.
arXiv Detail & Related papers (2021-08-05T15:58:25Z)
- Causal Modeling with Stochastic Confounders [11.881081802491183]
This work extends causal inference with confounders.
We propose a new approach to variational estimation for causal inference based on a representer theorem with a random input space.
arXiv Detail & Related papers (2020-04-24T00:34:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.