Related papers: Approximating the universal thermal climate index using sparse regression with orthogonal polynomials

URL: http://arxiv.org/abs/2508.11307v2
Date: Tue, 28 Oct 2025 23:09:56 GMT
Title: Approximating the universal thermal climate index using sparse regression with orthogonal polynomials
Authors: Sabin Roman, Gregor Skok, Ljupco Todorovski, Saso Dzeroski
Abstract summary: This article explores novel data-driven modeling approaches for analyzing and approximating the Universal Thermal Climate Index (UTCI). We investigate symbolic and sparse regression techniques as tools for interpretable and efficient function approximation. We show that our models achieve significantly lower root-mean-squared losses than the widely used sixth-degree polynomial benchmark.
Score: 4.017851211672872
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This article explores novel data-driven modeling approaches for analyzing and approximating the Universal Thermal Climate Index (UTCI), a physiologically based metric integrating multiple atmospheric variables to assess thermal comfort. Given the nonlinear, multivariate structure of UTCI, we investigate symbolic and sparse regression techniques as tools for interpretable and efficient function approximation. In particular, we highlight the benefits of using orthogonal polynomial bases, such as Legendre polynomials, in sparse regression frameworks, demonstrating their advantages in stability, convergence, and hierarchical interpretability compared to standard polynomial expansions. We demonstrate that our models achieve significantly lower root-mean-squared losses than the widely used sixth-degree polynomial benchmark while using the same or fewer parameters. By leveraging Legendre polynomial bases, we construct models that efficiently populate a Pareto front of accuracy versus complexity and exhibit stable, hierarchical coefficient structures across varying model capacities. Training on just 20% of the data, our models generalize robustly to the remaining 80%, with consistent performance under bootstrapping. The decomposition effectively approximates the UTCI as a Fourier-like expansion in an orthogonal basis, yielding results near the theoretical optimum in the L2 (least squares) sense. We also connect these findings to the broader context of equation discovery in environmental modeling, referencing probabilistic grammar-based methods that enforce domain consistency and compactness in symbolic expressions. Taken together, these results illustrate how combining sparsity, orthogonality, and symbolic structure enables robust, interpretable modeling of complex environmental indices like UTCI, and significantly outperforms the state-of-the-art approximation in both accuracy and efficiency.
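
The idea of sparse regression in a Legendre basis can be illustrated with a small sketch. This is not the authors' code or method: the target is a hypothetical one-dimensional toy function rather than UTCI, and sparsity is imposed by a simple "keep the k largest contributions, then refit" rule, which is an assumption swapped in for whatever sparse regression procedure the paper actually uses. The names `fit_sparse_legendre`, `rmse`, and `results` are illustrative. The 20%/80% split mirrors the one described in the abstract.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_sparse_legendre(x, y, degree, n_terms):
    """Least-squares fit in a Legendre basis, keeping only n_terms terms."""
    # Columns of V are P_0(x), ..., P_degree(x).
    V = legendre.legvander(x, degree)
    coef, *_ = np.linalg.lstsq(V, y, rcond=None)
    # Rank terms by their L2 contribution on [-1, 1]:
    # the squared norm of P_n is 2 / (2n + 1).
    n = np.arange(degree + 1)
    contribution = coef**2 * 2.0 / (2.0 * n + 1.0)
    keep = np.sort(np.argsort(contribution)[-n_terms:])
    # Refit using only the retained basis functions.
    sub_coef, *_ = np.linalg.lstsq(V[:, keep], y, rcond=None)
    sparse = np.zeros_like(coef)
    sparse[keep] = sub_coef
    return sparse


def rmse(coef, x, y):
    """Root-mean-squared error of the Legendre series on (x, y)."""
    return float(np.sqrt(np.mean((legendre.legval(x, coef) - y) ** 2)))


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 2000)
y = np.exp(x) * np.sin(3.0 * x)  # toy target (assumption), not UTCI

# Train on 20% of the samples, evaluate on the remaining 80%.
n_train = len(x) // 5
x_tr, y_tr = x[:n_train], y[:n_train]
x_te, y_te = x[n_train:], y[n_train:]

# Test error as a function of the number of retained terms:
# a crude accuracy-versus-complexity curve.
results = {}
for k in (2, 4, 6, 8):
    c = fit_sparse_legendre(x_tr, y_tr, degree=10, n_terms=k)
    results[k] = rmse(c, x_te, y_te)
    print(f"terms={k}  test RMSE={results[k]:.2e}")
```

Because the basis functions are orthogonal on [-1, 1], each coefficient's contribution to the fit can be ranked almost independently of the others, which is what makes a selection rule this simple workable in the toy setting.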

Related papers

Loss-Complexity Landscape and Model Structure Functions [56.01537787608726]
We develop a framework for dualizing the Kolmogorov structure function $h_x(\alpha)$. We establish a mathematical analogy between information-theoretic constructs and statistical mechanics. We explicitly prove the Legendre-Fenchel duality between the structure function and free energy.
arXiv Detail & Related papers (2025-07-17T21:31:45Z)
Identifiable Convex-Concave Regression via Sub-gradient Regularised Least Squares [1.9580473532948397]
We propose a novel nonparametric regression method that models complex input relationships as the sum of convex and concave components. The method, ICCNLS, performs a shape-constrained additive decomposition.
arXiv Detail & Related papers (2025-06-22T15:53:12Z)
Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC). LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses. LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable. CMDPs serve as an important framework to model many real-world applications with time-varying environments. We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
TMPNN: High-Order Polynomial Regression Based on Taylor Map Factorization [0.0]
The paper presents a method for constructing a high-order polynomial regression based on the Taylor map factorization. By benchmarking on UCI open access datasets, we demonstrate that the proposed method performs comparably to the state-of-the-art regression methods.
arXiv Detail & Related papers (2023-07-30T01:52:00Z)
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
Factorized Fusion Shrinkage for Dynamic Relational Data [16.531262817315696]
We consider a factorized fusion shrinkage model in which all decomposed factors are dynamically shrunk towards group-wise fusion structures. The proposed priors enjoy many favorable properties in comparison and clustering of the estimated dynamic latent factors. We present a structured mean-field variational inference framework that balances optimal posterior inference with computational scalability.
arXiv Detail & Related papers (2022-09-30T21:03:40Z)
Latent Space Model for Higher-order Networks and Generalized Tensor Decomposition [18.07071669486882]
We introduce a unified framework, formulated as general latent space models, to study complex higher-order network interactions. We formulate the relationship between the latent positions and the observed data via a generalized multilinear kernel as the link function. We demonstrate the effectiveness of our method on synthetic data.
arXiv Detail & Related papers (2021-06-30T13:11:17Z)
Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals. We propose a general graph estimator based on a novel structured fusion regularization. We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores). For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training. We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
Analysis of Bayesian Inference Algorithms by the Dynamical Functional Approach [2.8021833233819486]
We analyze an algorithm for approximate inference with large Gaussian latent variable models in a student-teacher scenario. For the case of perfect data-model matching, the knowledge of static order parameters derived from the replica method allows us to obtain efficient algorithmic updates.
arXiv Detail & Related papers (2020-01-14T17:22:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
