Related papers: Revisiting Process versus Product Metrics: a Large Scale Analysis

URL: http://arxiv.org/abs/2008.09569v3
Date: Tue, 26 Oct 2021 13:50:46 GMT
Title: Revisiting Process versus Product Metrics: a Large Scale Analysis
Authors: Suvodeep Majumder, Pranav Mody, Tim Menzies
Abstract summary: We recheck prior small-scale results using 722,471 commits from 700 Github projects. We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large. We warn that it is unwise to trust metric importance results from analytics in-the-small studies.
Score: 32.37197747513998
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Numerous methods can build predictive models from software data. However, what methods and conclusions should we endorse as we move from analytics in-the-small (dealing with a handful of projects) to analytics in-the-large (dealing with hundreds of projects)? To answer this question, we recheck prior small-scale results (about process versus product metrics for defect prediction and the granularity of metrics) using 722,471 commits from 700 Github projects. We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large. For example, like prior work, we see that process metrics are better predictors for defects than product metrics (best process/product-based learners respectively achieve recalls of 98%/44% and AUCs of 95%/54%, median values). That said, we warn that it is unwise to trust metric importance results from analytics in-the-small studies since those change dramatically when moving to analytics in-the-large. Also, when reasoning in-the-large about hundreds of projects, it is better to use predictions from multiple models (since single model predictions can become confused and exhibit a high variance).
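
Illustration (not from the paper): the abstract reports recall and recommends combining predictions from multiple models. The minimal Python sketch below shows one simple way to do that, a majority vote over binary defect predictions, and how recall is computed. The toy labels, the three "models", and the helper names recall and majority_vote are all hypothetical; the authors' actual learners and aggregation scheme are not described in this listing.

```python
def recall(actual, predicted):
    """Fraction of truly defective items (label 1) that were predicted defective."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    positives = sum(actual)
    return true_pos / positives if positives else 0.0


def majority_vote(predictions_per_model):
    """Combine 0/1 predictions from several models; a tie counts as 0."""
    n_models = len(predictions_per_model)
    return [
        1 if sum(votes) * 2 > n_models else 0
        for votes in zip(*predictions_per_model)
    ]


# Hypothetical toy data: 1 = defective, 0 = clean (assumption, not paper data).
actual = [1, 0, 1, 1, 0, 0]
model_predictions = [
    [1, 0, 0, 1, 0, 1],  # hypothetical model A
    [1, 0, 1, 0, 0, 0],  # hypothetical model B
    [0, 0, 1, 1, 1, 0],  # hypothetical model C
]

# Recall of each model on its own, then of the combined vote.
single_recalls = [recall(actual, preds) for preds in model_predictions]
combined = majority_vote(model_predictions)
combined_recall = recall(actual, combined)

print("per-model recall:", single_recalls)
print("majority-vote recall:", combined_recall)
```

In this toy case each model misses a different defect, so the vote recovers all of them. Real data will not always behave this way; the sketch only shows the mechanics.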

Related papers

Bug Destiny Prediction in Large Open-Source Software Repositories through Sentiment Analysis and BERT Topic Modeling [3.481985817302898]
We leverage features available before a bug is resolved to enhance predictive accuracy. Our methodology incorporates sentiment analysis to derive both an emotionality score and a sentiment classification. Results demonstrate that sentiment analysis serves as a valuable predictor of a bug's eventual outcome.
arXiv Detail & Related papers (2025-04-22T15:18:14Z)
Critical Example Mining for Vehicle Trajectory Prediction using Flow-based Generative Models [10.40439055916036]
This paper proposes a data-driven approach to estimate the rareness of the trajectories. By combining the rareness estimation of observations with whole trajectories, the proposed method effectively identifies a subset of data that is relatively hard to predict.
arXiv Detail & Related papers (2024-10-21T15:02:30Z)
Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines. Academic research is often restrained to public datasets on the order of ten thousand samples. We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z)
Variational Factorization Machines for Preference Elicitation in Large-Scale Recommender Systems [17.050774091903552]
We propose a variational formulation of factorization machines (FMs) that can be easily optimized using standard mini-batch gradient descent. Our algorithm learns an approximate posterior distribution over the user and item parameters, which leads to confidence intervals over the predictions. We show, using several datasets, that it has comparable or better performance than existing methods in terms of prediction accuracy.
arXiv Detail & Related papers (2022-12-20T00:06:28Z)
Azimuth: Systematic Error Analysis for Text Classification [3.1679600401346706]
Azimuth is an open-source tool to perform error analysis for text classification. We propose an approach comprising dataset analysis and model quality assessment.
arXiv Detail & Related papers (2022-12-16T01:10:41Z)
Why we should respect analysis results as data [0.0]
It is commonly overlooked that analyzing clinical study data also produces data in the form of results. Although integrating and putting findings into context is a cornerstone of scientific work, analysis results are often neglected as a data source. We propose a solution to "calculate once, use many times" by combining analysis results standards with a common data model.
arXiv Detail & Related papers (2022-04-21T08:34:07Z)
Similarities and Differences between Machine Learning and Traditional Advanced Statistical Modeling in Healthcare Analytics [0.6999740786886537]
Machine learning and statistical modeling are complementary, based on similar mathematical principles. Good analysts and data scientists should be well versed in both techniques and their proper application.
arXiv Detail & Related papers (2022-01-07T14:36:46Z)
Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions. We investigate methods for aggregating any number of conditional quantile models. All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize factor models to non-Gaussian responses. Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets. We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridge-less least squares. We consider the common case of estimators defined by scale dependent kernels.
arXiv Detail & Related papers (2020-06-17T16:43:37Z)
Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties. We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE). When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.