Is Hyper-Parameter Optimization Different for Software Analytics?
- URL: http://arxiv.org/abs/2401.09622v3
- Date: Mon, 25 Nov 2024 18:55:38 GMT
- Title: Is Hyper-Parameter Optimization Different for Software Analytics?
- Authors: Rahul Yedida, Tim Menzies
- Abstract summary: SE data can have "smoother" boundaries between classes.
SMOOTHIE runs faster and predicts better on the SE data--but ties on non-SE data with the AI tool.
- Score: 11.85735565104864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Yes. SE data can have "smoother" boundaries between classes (compared to traditional AI data sets). To be more precise, the magnitude of the second derivative of the loss function found in SE data is typically much smaller. A new hyper-parameter optimizer, called SMOOTHIE, can exploit this idiosyncrasy of SE data. We compare SMOOTHIE and a state-of-the-art AI hyper-parameter optimizer on three tasks: (a) GitHub issue lifetime prediction; (b) detecting false alarms in static code warnings; (c) defect prediction. For completeness, we also show experiments on some standard AI datasets. SMOOTHIE runs faster and predicts better on the SE data, but ties with the AI tool on non-SE data. Hence we conclude that SE data can be different from other kinds of data, and those differences mean that we should use different kinds of algorithms on our data. To support open science and other researchers working in this area, all our scripts and datasets are available online at https://github.com/yrahul3910/smoothness-hpo/.
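For intuition, here is a minimal sketch of how that smoothness quantity might be probed numerically: a finite-difference estimate of the second derivative of a loss along random directions. The logistic loss and all function names are illustrative assumptions, not code from the SMOOTHIE repository.

```python
import numpy as np

def loss(w, X, y):
    """Logistic loss for a linear model; stands in for any smooth loss."""
    z = X @ w
    return np.mean(np.log1p(np.exp(-y * z)))

def estimate_second_derivative(w, X, y, eps=1e-3, n_dirs=20, seed=0):
    """Average |d^2 L / dt^2| along random unit directions.

    Larger values indicate a 'bumpier' loss surface; the paper's claim is
    that SE datasets tend to yield smaller magnitudes than classic AI data.
    """
    rng = np.random.default_rng(seed)
    curvatures = []
    for _ in range(n_dirs):
        d = rng.normal(size=w.shape)
        d /= np.linalg.norm(d)
        # Central second difference: (L(w+eps*d) - 2L(w) + L(w-eps*d)) / eps^2
        c = (loss(w + eps * d, X, y) - 2 * loss(w, X, y)
             + loss(w - eps * d, X, y)) / eps**2
        curvatures.append(abs(c))
    return float(np.mean(curvatures))

# Toy usage on synthetic data (labels in {-1, +1}).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))
print(estimate_second_derivative(np.zeros(5), X, y))
```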
Related papers
- Be aware of overfitting by hyperparameter optimization! [0.0]
We show that hyperparameter optimization did not always result in better models, possibly due to overfitting when the same statistical measures are used both for tuning and for evaluation (a nested cross-validation guard is sketched after this entry).
We also extended the previous analysis by adding Transformer CNN, a representation learning method based on Natural Language Processing of SMILES strings.
We show that across all analyzed data sets, using exactly the same protocol, Transformer CNN provided better results than graph-based methods for 26 out of 28 pairwise comparisons.
arXiv Detail & Related papers (2024-07-30T12:45:05Z)
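One standard guard against the tuning-versus-evaluation overfitting described above is nested cross-validation: an inner loop searches hyper-parameters while an outer loop estimates performance on folds the search never saw. A minimal sketch with scikit-learn; the dataset and parameter grid are illustrative, not the paper's protocol.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: hyper-parameter search; outer loop: unbiased performance estimate.
# Evaluating on the same folds used for tuning tends to inflate scores.
inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=3)
outer_scores = cross_val_score(inner, X, y, cv=5)
print(outer_scores.mean(), outer_scores.std())
```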
- Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic [99.3682210827572]
Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets.
Data curation strategies are typically developed agnostic of the available compute for training.
We introduce neural scaling laws that account for the non-homogeneous nature of web data.
arXiv Detail & Related papers (2024-04-10T17:27:54Z)
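As a hedged illustration of the general scaling-law machinery (not the paper's compute-aware laws), the sketch below fits the classic power-law form error = a * n^(-b) + c to synthetic (sample count, error) observations.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    """Classic scaling-law form: error falls as a power of sample count."""
    return a * n ** (-b) + c

# Synthetic (n, error) observations standing in for measured training runs.
n = np.array([1e3, 3e3, 1e4, 3e4, 1e5])
err = 2.0 * n ** -0.3 + 0.05 + np.random.default_rng(0).normal(0, 0.002, 5)

params, _ = curve_fit(power_law, n, err, p0=[1.0, 0.5, 0.0])
print("a=%.3f b=%.3f c=%.3f" % tuple(params))
```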
- Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health [13.19204187502255]
This paper only explores the application of niSNEAK to project health. That said, we see nothing in principle that prevents the application of this technique to a wider range of problems.
arXiv Detail & Related papers (2023-01-16T19:27:16Z)
- Semantic Preserving Adversarial Attack Generation with Autoencoder and Genetic Algorithm [29.613411948228563]
Small perturbations can fool state-of-the-art models into making incorrect predictions.
We propose a black-box attack, which modifies latent features of data extracted by an autoencoder.
We trained autoencoders on MNIST and CIFAR-10 datasets and found optimal adversarial perturbations using a genetic algorithm.
arXiv Detail & Related papers (2022-08-25T17:27:26Z)
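A heavily simplified sketch of the search loop described above: a genetic algorithm evolves a latent-space perturbation until a toy classifier's prediction flips. The random linear "classifier" stands in for a trained model and decoder; nothing here reproduces the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=8)               # toy linear "classifier" over latent codes

def predict(z):
    return int(z @ W > 0)

def fitness(delta, z):
    """Reward flipping the prediction while keeping the perturbation small."""
    flipped = predict(z + delta) != predict(z)
    return (1.0 if flipped else 0.0) - 0.1 * np.linalg.norm(delta)

def ga_attack(z, pop_size=40, gens=50, sigma=0.3):
    pop = rng.normal(0, sigma, size=(pop_size, z.size))
    for _ in range(gens):
        scores = np.array([fitness(d, z) for d in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]           # selection
        children = parents + rng.normal(0, sigma / 2, parents.shape) # mutation
        pop = np.vstack([parents, children])
    return max(pop, key=lambda d: fitness(d, z))

z = rng.normal(size=8)               # latent code of one input
delta = ga_attack(z)
print(predict(z), "->", predict(z + delta))
```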
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning (a toy version is sketched after this entry).
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
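A minimal sketch of the pairwise online idea from the entry above: each streaming point is paired with buffered points of the opposite class, and a pairwise hinge loss is minimized by online gradient descent. The buffering scheme and loss are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] > 0).astype(int)         # binary labels
w = np.zeros(4)
lr = 0.05

# Online pass: each new point is paired with buffered points of the
# opposite class; the hinge loss pushes positives above negatives.
buffer = []
for x, label in zip(X, y):
    for xb, lb in buffer:
        if lb != label:
            pos, neg = (x, xb) if label == 1 else (xb, x)
            if w @ (pos - neg) < 1:               # pairwise hinge subgradient
                w += lr * (pos - neg)
    buffer.append((x, label))
    buffer = buffer[-20:]                         # bounded memory

scores = X @ w
print("fraction of correctly ranked pairs:",
      np.mean(scores[y == 1][:, None] > scores[y == 0][None, :]))
```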
- Self-Supervised Neural Architecture Search for Imbalanced Datasets [129.3987858787811]
Neural Architecture Search (NAS) provides state-of-the-art results when trained on well-curated datasets with annotated labels.
We propose a NAS-based framework whose contributions include: (a) we focus on the self-supervised scenario, where no labels are required to determine the architecture, and (b) we assume the datasets are imbalanced.
arXiv Detail & Related papers (2021-09-17T14:56:36Z)
- Discriminative-Generative Dual Memory Video Anomaly Detection [81.09977516403411]
Recent work uses a few anomalies for video anomaly detection (VAD), rather than only normal data, during training.
We propose a DiscRiminative-gEnerative duAl Memory (DREAM) anomaly detection model to take advantage of a few anomalies and solve data imbalance.
arXiv Detail & Related papers (2021-04-29T15:49:01Z)
- MementoML: Performance of selected machine learning algorithm configurations on OpenML100 datasets [5.802346990263708]
We present a protocol for generating benchmark data that describes the performance of different ML algorithm configurations (a minimal version is sketched after this entry).
Data collected in this way can be used to study the factors influencing algorithm performance.
arXiv Detail & Related papers (2020-08-30T13:13:52Z)
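A minimal sketch of such a benchmarking protocol: run a small grid of algorithm configurations over several datasets and record one performance row per (dataset, configuration) pair. The datasets and configurations are illustrative, not the OpenML100 setup.

```python
from sklearn.datasets import load_iris, load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

datasets = {"iris": load_iris(return_X_y=True), "wine": load_wine(return_X_y=True)}
configs = [{"n_estimators": n, "max_depth": d}
           for n in (10, 100) for d in (3, None)]

# Record one row per (dataset, configuration): the raw material for
# studying which factors drive performance.
rows = []
for name, (X, y) in datasets.items():
    for cfg in configs:
        score = cross_val_score(RandomForestClassifier(**cfg, random_state=0),
                                X, y, cv=5).mean()
        rows.append({"dataset": name, **cfg, "accuracy": round(score, 3)})

for row in rows:
    print(row)
```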
- How to tune the RBF SVM hyperparameters?: An empirical evaluation of 18 search algorithms [4.394728504061753]
We evaluate 18 proposed search algorithms on 115 real-life binary data sets (a toy grid-versus-random comparison is sketched after this entry).
We find that trees of Parzen estimators search better, with only a slight increase in time with respect to grid search.
We also find no significant differences among the procedures for selecting the best hyperparameter set when more than one is found by the search algorithms.
arXiv Detail & Related papers (2020-08-26T16:28:48Z)
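A minimal sketch of the kind of comparison the entry describes: grid search versus random search over the RBF SVM's C and gamma on one toy dataset. This covers only two search strategies, not the paper's 115-data-set evaluation of 18 algorithms.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Two search strategies over the same two RBF hyper-parameters.
grid = GridSearchCV(
    SVC(kernel="rbf"),
    {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}, cv=3)
rand = RandomizedSearchCV(
    SVC(kernel="rbf"),
    {"C": loguniform(1e-1, 1e2), "gamma": loguniform(1e-3, 1e-1)},
    n_iter=12, cv=3, random_state=0)

for name, search in [("grid", grid), ("random", rand)]:
    search.fit(X, y)
    print(name, search.best_params_, round(search.best_score_, 3))
```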
- New Oracle-Efficient Algorithms for Private Synthetic Data Release [52.33506193761153]
We present three new algorithms for constructing differentially private synthetic data.
The algorithms satisfy differential privacy even in the worst case.
Compared to the state-of-the-art method, the High-Dimensional Matrix Mechanism (McKenna et al., 2018), our algorithms provide better accuracy on large workloads (a toy Laplace-mechanism illustration follows this entry).
arXiv Detail & Related papers (2020-07-10T15:46:05Z)
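As a much simpler illustration of private synthetic data (not the paper's oracle-efficient algorithms), the sketch below applies the Laplace mechanism to a histogram and samples synthetic records from the noised counts.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 5, size=1000)        # private records over 5 categories

def dp_synthetic(data, n_bins=5, epsilon=1.0, n_samples=1000):
    """Laplace mechanism on a histogram, then sample synthetic records.

    Adding or removing one record changes one count by 1 (sensitivity 1),
    so Laplace(1/epsilon) noise per count gives epsilon-differential privacy.
    """
    counts = np.bincount(data, minlength=n_bins).astype(float)
    noisy = counts + rng.laplace(0, 1.0 / epsilon, n_bins)
    probs = np.clip(noisy, 0, None)
    probs /= probs.sum()
    return rng.choice(n_bins, size=n_samples, p=probs)

synth = dp_synthetic(data)
print(np.bincount(data, minlength=5), np.bincount(synth, minlength=5))
```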
- Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data points are dependent and are sampled from a Markov chain.
We establish sharp information-theoretic minimax lower bounds for this problem in terms of the mixing time $\tau_{\mathsf{mix}}$.
We propose an algorithm based on experience replay--a popular reinforcement learning technique--that achieves a significantly better error rate.
arXiv Detail & Related papers (2020-06-16T04:26:50Z)
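A toy sketch of the experience-replay idea from the entry above: covariates follow an AR(1) Markov chain, and SGD updates drawn from a replay buffer are compared with updates on the freshest (correlated) sample. The chain, step size, and buffer size are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])

# Markovian covariates: an AR(1) chain, so consecutive samples are correlated.
def markov_stream(n, rho=0.95):
    x = rng.normal(size=2)
    for _ in range(n):
        x = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=2)
        yield x, x @ w_true + 0.1 * rng.normal()

def sgd(n=5000, lr=0.01, replay=False, buf_size=500):
    w, buf = np.zeros(2), []
    for x, y in markov_stream(n):
        buf.append((x, y))
        buf = buf[-buf_size:]
        # Replay samples a past point, breaking the chain's short-range
        # correlations; plain SGD always uses the freshest (correlated) point.
        xs, ys = buf[rng.integers(len(buf))] if replay else (x, y)
        w -= lr * (w @ xs - ys) * xs
    return w

for flag in (False, True):
    w = sgd(replay=flag)
    print("replay =", flag, "error =", round(np.linalg.norm(w - w_true), 4))
```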
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.