Learning with the Nash-Sutcliffe loss
- URL: http://arxiv.org/abs/2603.00968v1
- Date: Sun, 01 Mar 2026 07:43:28 GMT
- Title: Learning with the Nash-Sutcliffe loss
- Authors: Hristos Tyralis, Georgia Papacharalampous
- Abstract summary: We introduce Nash-Sutcliffe linear regression, a multi-dimensional model estimated by minimizing the average $L_{\text{NS}}$. Our results establish a decision-theoretic foundation for $\text{NSE}$-based model estimation and forecast evaluation in large datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Nash-Sutcliffe efficiency ($\text{NSE}$) is a widely used, positively oriented relative measure for evaluating forecasts across multiple time series. However, it lacks a decision-theoretic foundation for this purpose. To address this, we examine its negatively oriented counterpart, which we refer to as Nash-Sutcliffe loss, defined as $L_{\text{NS}} = 1 - \text{NSE}$. We prove that $L_{\text{NS}}$ is strictly consistent for an elicitable and identifiable multi-dimensional functional, which we name the Nash-Sutcliffe functional. This functional is a data-weighted component-wise mean. The common practice of maximizing the average NSE across multiple series is the sample analog of minimizing the expected $L_{\text{NS}}$. Consequently, this operation implicitly assumes that all series originate from a single non-stationary, stochastic process. We introduce Nash-Sutcliffe linear regression, a multi-dimensional model estimated by minimizing the average $L_{\text{NS}}$, which reduces to a data-weighted least squares formulation. By reorienting the sample average loss function, we extend the previously proposed evaluation and estimation framework to forecasting multiple stationary dependent time series with differing stochastic properties. This constitutes a more natural empirical implementation of the $\text{NSE}$ than the earlier formulation. Our results establish a decision-theoretic foundation for $\text{NSE}$-based model estimation and forecast evaluation in large datasets, while further clarifying the benefits of global over local machine learning models.
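The reduction described in the abstract can be sketched numerically. The following is a minimal illustration (not the authors' code; the function names and the toy two-series data are hypothetical): the per-series loss $L_{\text{NS}} = \text{SSE}/\text{SST}$ is computed directly, and minimizing its average across series is solved as a data-weighted least squares problem with per-series weights $1/\text{SST}_i$.

```python
import numpy as np

def ns_loss(y, f):
    # Nash-Sutcliffe loss: L_NS = 1 - NSE, i.e. the squared prediction
    # error normalized by the series' total sum of squares about its mean.
    return np.sum((y - f) ** 2) / np.sum((y - y.mean()) ** 2)

def ns_linear_regression(Xs, ys):
    # Minimizing the average L_NS across series reduces to data-weighted
    # least squares with per-series weights w_i = 1 / SST_i; here solved
    # via the weighted normal equations.
    d = Xs[0].shape[1]
    A, b = np.zeros((d, d)), np.zeros(d)
    for X, y in zip(Xs, ys):
        w = 1.0 / np.sum((y - y.mean()) ** 2)
        A += w * X.T @ X
        b += w * X.T @ y
    return np.linalg.solve(A, b)

# Toy setup: two series sharing one linear signal but with very different
# noise scales, mimicking series with differing stochastic properties.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0])
Xs, ys = [], []
for noise in (1.0, 10.0):
    X = np.column_stack([np.ones(100), rng.normal(size=100)])
    Xs.append(X)
    ys.append(X @ beta_true + noise * rng.normal(size=100))

beta_hat = ns_linear_regression(Xs, ys)
avg_ns = np.mean([ns_loss(y, X @ beta_hat) for X, y in zip(Xs, ys)])
```

The $1/\text{SST}_i$ weighting keeps the noisier series from dominating the fit, which is the sense in which $L_{\text{NS}}$ acts as a relative measure across series.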
Related papers
- Statistical Inference and Learning for Shapley Additive Explanations (SHAP) [20.663970002208846]
The SHAP (short for Shapley additive explanation) framework has become an essential tool for attributing importance to variables in predictive tasks. Despite their ubiquity, there do not exist approaches for performing statistical inference on these quantities. We show that, by treating the SHAP curve as a nuisance function that must be estimated from data, one can reliably construct asymptotically normal estimates of the $p$th powers of SHAP.
arXiv Detail & Related papers (2026-02-11T05:01:47Z) - FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA [68.44043212834204]
Low-Rank Adaptation (LoRA) is widely used for efficient fine-tuning of language models in federated learning (FL).
arXiv Detail & Related papers (2025-05-19T07:32:56Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z) - Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data [4.971690889257356]
We introduce an adaptation of the alternating minimization-descent scheme proposed by Collins et al. and by Nayer and Vaswani.
We show that vanilla alternating minimization-descent fails catastrophically even for i.i.d., but mildly non-isotropic, data.
Our analysis unifies and generalizes prior work, and provides a flexible framework for a wider range of applications.
arXiv Detail & Related papers (2023-08-08T17:56:20Z) - Retire: Robust Expectile Regression in High Dimensions [3.9391041278203978]
Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data.
We propose and study (penalized) robust expectile regression (retire)
We show that the proposed procedure can be efficiently solved by a semismooth Newton coordinate descent algorithm.
arXiv Detail & Related papers (2022-12-11T18:03:12Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly nonconvex functions.
We generalize the proposed method so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bound in the smooth nonconvex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - Stochastic regularized majorization-minimization with weakly convex and multi-convex surrogates [0.0]
We show that the first-order optimality gap of the proposed algorithm decays at a rate governed by the expected loss, for various methods under a dependent data setting.
We obtain the first convergence guarantees for various methods under this dependent data setting.
arXiv Detail & Related papers (2022-01-05T15:17:35Z) - Sharp Analysis of Random Fourier Features in Classification [9.383533125404755]
We show for the first time that random Fourier features classification can achieve an $O(1/\sqrt{n})$ learning rate with only $\Omega(\sqrt{n}\log n)$ features.
arXiv Detail & Related papers (2021-09-22T09:49:27Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Estimation in Tensor Ising Models [5.161531917413708]
We consider the problem of estimating the natural parameter of the $p$-tensor Ising model given a single sample from the distribution on $N$ nodes.
In particular, we show the $\sqrt{N}$-consistency of the MPL estimate in the $p$-spin Sherrington-Kirkpatrick (SK) model.
We derive the precise fluctuations of the MPL estimate in the special case of the $p$-tensor Curie-Weiss model.
arXiv Detail & Related papers (2020-08-29T00:06:58Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.