Related papers: Sharpness-Aware Minimization for Evolutionary Feature Construction in Regression

Sharpness-Aware Minimization for Evolutionary Feature Construction in Regression

URL: http://arxiv.org/abs/2405.06869v1
Date: Sat, 11 May 2024 02:03:11 GMT
Title: Sharpness-Aware Minimization for Evolutionary Feature Construction in Regression
Authors: Hengzhe Zhang, Qi Chen, Bing Xue, Wolfgang Banzhaf, Mengjie Zhang,
Abstract summary: We propose using sharpness-aware minimization in function space to discover symbolic features that exhibit robust performance. Experimental results on 58 real-world regression datasets show that our approach outperforms standard evolutionary feature construction.
Score: 11.760077969729055
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years, genetic programming (GP)-based evolutionary feature construction has achieved significant success. However, a primary challenge with evolutionary feature construction is its tendency to overfit the training data, resulting in poor generalization on unseen data. In this research, we draw inspiration from PAC-Bayesian theory and propose using sharpness-aware minimization in function space to discover symbolic features that exhibit robust performance within a smooth loss landscape in the semantic space. By optimizing sharpness in conjunction with cross-validation loss, as well as designing a sharpness reduction layer, the proposed method effectively mitigates the overfitting problem of GP, especially when dealing with a limited number of instances or in the presence of label noise. Experimental results on 58 real-world regression datasets show that our approach outperforms standard GP as well as six state-of-the-art complexity measurement methods for GP in controlling overfitting. Furthermore, the ensemble version of GP with sharpness-aware minimization demonstrates superior performance compared to nine fine-tuned machine learning and symbolic regression algorithms, including XGBoost and LightGBM.

Related papers

PGU-SGP: A Pheno-Geno Unified Surrogate Genetic Programming For Real-life Container Terminal Truck Scheduling [7.678307721780809]
This paper proposes a pheno-geno unified surrogate GP algorithm, PGU-SGP, to enhance surrogate sample selection and fitness prediction. With the same training time, PGU-SGP significantly outperforms traditional GP and the state-of-the-art algorithm on most datasets.
arXiv Detail & Related papers (2025-04-15T15:19:42Z)
GAQAT: gradient-adaptive quantization-aware training for domain generalization [54.31450550793485]
We propose a novel Gradient-Adaptive Quantization-Aware Training (GAQAT) framework for DG. Our approach begins by identifying the scale-gradient conflict problem in low-precision quantization. Extensive experiments validate the effectiveness of the proposed GAQAT framework.
arXiv Detail & Related papers (2024-12-07T06:07:21Z)
Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification [53.727688136434345]
Graph Neural Networks (GNNs) have shown superior performance in node classification. We present Fast Graph Sharpness-Aware Minimization (FGSAM) that integrates the rapid training of Multi-Layer Perceptrons with the superior performance of GNNs. Our proposed algorithm outperforms the standard SAM with lower computational costs in FSNC tasks.
arXiv Detail & Related papers (2024-10-22T09:33:29Z)
Haste Makes Waste: A Simple Approach for Scaling Graph Neural Networks [37.41604955004456]
Graph neural networks (GNNs) have demonstrated remarkable success in graph representation learning. Various sampling approaches have been proposed to scale GNNs to applications with large-scale graphs.
arXiv Detail & Related papers (2024-10-07T18:29:02Z)
Regression under demographic parity constraints via unlabeled post-processing [5.762345156477737]
We present a general-purpose post-processing algorithm that generates predictions that meet the demographic parity. Unlike prior methods, our approach is fully theory-driven. We require precise control over the gradient norm of the convex function. Our algorithm is backed by finite-sample analysis and post-processing bounds, with experimental results validating our theoretical findings.
arXiv Detail & Related papers (2024-07-22T08:11:58Z)
Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve Generalization Performance of Deep Classification Models [0.0]
We introduce a distance called Reduced Jeffries-Matusita as a loss function for training deep classification models to reduce the over-fitting issue. The results show that the new distance measure stabilizes the training process significantly, enhances the generalization ability, and improves the performance of the models in the Accuracy and F1-score metrics.
arXiv Detail & Related papers (2024-03-13T10:51:38Z)
Random Linear Projections Loss for Hyperplane-Based Optimization in Neural Networks [22.348887008547653]
This work introduces Random Linear Projections (RLP) loss, a novel approach that enhances training efficiency by leveraging geometric relationships within the data. Our empirical evaluations, conducted across benchmark datasets and synthetic examples, demonstrate that neural networks trained with RLP loss outperform those trained with traditional loss functions.
arXiv Detail & Related papers (2023-11-21T05:22:39Z)
Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data. We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
Sharpness-Aware Training for Free [163.1248341911413]
SharpnessAware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error. Sharpness-Aware Training Free (SAF) mitigates the sharp landscape at almost zero computational cost over the base. SAF ensures the convergence to a flat minimum with improved capabilities.
arXiv Detail & Related papers (2022-05-27T16:32:43Z)
Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an it ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary. With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with it scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions. The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
A Flatter Loss for Bias Mitigation in Cross-dataset Facial Age Estimation [37.107335288543624]
We advocate a cross-dataset protocol for age estimation benchmarking. We propose a novel loss function that is more effective for neural network training.
arXiv Detail & Related papers (2020-10-20T15:22:29Z)
Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose. We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.