Related papers: Dividable Configuration Performance Learning

Dividable Configuration Performance Learning

URL: http://arxiv.org/abs/2409.07629v2
Date: Sun, 3 Nov 2024 17:30:19 GMT
Title: Dividable Configuration Performance Learning
Authors: Jingzhi Gong, Tao Chen, Rami Bahsoon,
Abstract summary: We propose a model-agnostic and sparsity-robust framework for predicting configuration performance, dubbed DaL. DaL is based on the new paradigm of dividable learning that builds a model via "divide-and-learn"
Score: 4.949726352498762
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine/deep learning models have been widely adopted for predicting the configuration performance of software systems. However, a crucial yet unaddressed challenge is how to cater for the sparsity inherited from the configuration landscape: the influence of configuration options (features) and the distribution of data samples are highly sparse. In this paper, we propose a model-agnostic and sparsity-robust framework for predicting configuration performance, dubbed DaL, based on the new paradigm of dividable learning that builds a model via "divide-and-learn". To handle sample sparsity, the samples from the configuration landscape are divided into distant divisions, for each of which we build a sparse local model, e.g., regularized Hierarchical Interaction Neural Network, to deal with the feature sparsity. A newly given configuration would then be assigned to the right model of division for the final prediction. Further, DaL adaptively determines the optimal number of divisions required for a system and sample size without any extra training or profiling. Experiment results from 12 real-world systems and five sets of training data reveal that, compared with the state-of-the-art approaches, DaL performs no worse than the best counterpart on 44 out of 60 cases with up to 1.61x improvement on accuracy; requires fewer samples to reach the same/better accuracy; and producing acceptable training overhead. In particular, the mechanism that adapted the parameter d can reach the optimal value for 76.43% of the individual runs. The result also confirms that the paradigm of dividable learning is more suitable than other similar paradigms such as ensemble learning for predicting configuration performance. Practically, DaL considerably improves different global models when using them as the underlying local models, which further strengthens its flexibility.

Related papers

SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL)<n>We propose textbfSPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z)
AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining [12.630306478872043]
We propose textbfAdaLRS, a plug-in-and-play adaptive learning rate search algorithm that conducts online optimal learning rate search.<n>Experiments show that AdaLRS adjusts suboptimal learning rates to the neighborhood of optimum with marked efficiency and effectiveness.
arXiv Detail & Related papers (2025-06-16T09:14:01Z)
Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection [65.96556073745197]
DiverSified File selection algorithm (DiSF) is proposed to select the most decorrelated text files in the feature space. DiSF saves 98.5% of 590M training files in SlimPajama, outperforming the full-data pre-training within a 50B training budget.
arXiv Detail & Related papers (2025-04-29T11:13:18Z)
Learnable Sparse Customization in Heterogeneous Edge Computing [27.201987866208484]
We propose Learnable Personalized Sparsification for heterogeneous Federated learning (FedLPS) FedLPS learns the importance of model units on local data representation and derives an importance-based sparse pattern with minimals to accurately extract personalized data features. Experiments show that FedLPS outperforms status quo approaches in accuracy and training costs.
arXiv Detail & Related papers (2024-12-10T06:14:31Z)
Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters. In practice, however, we only find solutions via our training procedure, including the gradient and regularizers, limiting flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios. We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples. Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
A Lightweight Measure of Classification Difficulty from Application Dataset Characteristics [4.220363193932374]
We propose an efficient cosine similarity-based classification difficulty measure S. It is calculated from the number of classes and intra- and inter-class similarity metrics of the dataset. We show how a practitioner can use this measure to help select an efficient model 6 to 29x faster than through repeated training and testing.
arXiv Detail & Related papers (2024-04-09T03:27:09Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Scaling Laws for Sparsely-Connected Foundation Models [70.41266138010657]
We explore the impact of parameter sparsity on the scaling behavior of Transformers trained on massive datasets. We identify the first scaling law describing the relationship between weight sparsity, number of non-zero parameters, and amount of training data.
arXiv Detail & Related papers (2023-09-15T16:29:27Z)
Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy. We consider a FL framework with partial model pruning and personalization to overcome these challenges. This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
Predicting Software Performance with Divide-and-Learn [3.635696352780227]
We propose an approach based on the concept of 'divide-and-learn', dubbed DaL. Experiment results from eight real-world systems and five sets of training data reveal that DaL performs no worse than the best counterpart on 33 out of 40 cases.
arXiv Detail & Related papers (2023-06-11T11:16:27Z)
CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but the further adaptation of CLIP on downstream tasks undesirably degrades OOD performances. We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data. Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
SCAI: A Spectral data Classification framework with Adaptive Inference for the IoT platform [0.0]
We propose a Spectral data Classification framework with Adaptive Inference. Specifically, to allocate different computations for different samples while better exploiting the collaboration among different devices. To the best of our knowledge, this paper is the first attempt to conduct optimization by adaptive inference for spectral detection under the IoT platform.
arXiv Detail & Related papers (2022-06-24T09:22:52Z)
Wavelet-Based Hybrid Machine Learning Model for Out-of-distribution Internet Traffic Prediction [3.689539481706835]
This paper investigates machine learning performances using eXtreme Gradient Boosting, Light Gradient Boosting Machine, Gradient Descent, Gradient Boosting Regressor, Cat Regressor. We propose a hybrid machine learning model integrating wavelet decomposition for improving out-of-distribution prediction.
arXiv Detail & Related papers (2022-05-09T14:34:42Z)
Auto-Ensemble: An Adaptive Learning Rate Scheduling based Deep Learning Model Ensembling [11.324407834445422]
This paper proposes Auto-Ensemble (AE) to collect checkpoints of deep learning model and ensemble them automatically. The advantage of this method is to make the model converge to various local optima by scheduling the learning rate in once training.
arXiv Detail & Related papers (2020-03-25T08:17:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.