Evaluating Time-Dependent Methods and Seasonal Effects in Code Technical Debt Prediction
- URL: http://arxiv.org/abs/2408.08095v1
- Date: Thu, 15 Aug 2024 11:39:58 GMT
- Title: Evaluating Time-Dependent Methods and Seasonal Effects in Code Technical Debt Prediction
- Authors: Mikel Robredo, Nyyti Saarimaki, Davide Taibi, Rafael Penaloza, Valentina Lenarduzzi,
- Abstract summary: This study aims to evaluate the impact of considering time-dependent techniques as well as seasonal effects in temporal data.
We trained 11 prediction models using the commit history of 31 open-source projects developed with Java.
- Score: 6.616501747443831
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Code Technical Debt prediction has become a popular research niche in recent software engineering literature. Technical Debt is an important metric in software projects as it measures professionals' effort to clean the code. Therefore, predicting its future behavior becomes a crucial task. However, no well-defined and consistent approach can completely capture the features that impact the evolution of Code Technical Debt. The goal of this study is to evaluate the impact of considering time-dependent techniques as well as seasonal effects in temporal data in the prediction performance within the context of Code Technical Debt. The study adopts existing, yet not extensively adopted, time-dependent prediction techniques and compares their prediction performance to commonly used Machine Learning models. Further, the study strengthens the evaluation of time-dependent methods by extending the analysis to capture the impact of seasonality in Code Technical Debt data. We trained 11 prediction models using the commit history of 31 open-source projects developed with Java. We predicted the future observations of the SQALE index to evaluate their predictive performance. Our study confirms the positive impact of considering time-dependent techniques. The adopted multivariate time series analysis model ARIMAX overcame the rest of the adopted models. Incorporating seasonal effects led to an enhancement in the predictive performance of the adopted time-dependent techniques. However, the impact of this effect was found to be relatively modest. The findings of this study corroborate our position in favor of implementing techniques that capture the existing time dependence within historical data of software metrics, specifically in the context of this study, namely, Code Technical Debt. This necessitates the utilization of techniques that can effectively address this evidence.
Related papers
- Benchmarking Graph Conformal Prediction: Empirical Analysis, Scalability, and Theoretical Insights [6.801587574420671]
Conformal prediction has become increasingly popular for quantifying the uncertainty associated with machine learning models.
Recent work in graph uncertainty quantification has built upon this approach for conformal graph prediction.
We analyze the design choices made in the literature and discuss the tradeoffs associated with existing methods.
arXiv Detail & Related papers (2024-09-26T23:13:51Z) - Systematic literature review on forecasting and prediction of technical debt evolution [0.0]
Technical debt (TD) refers to the additional costs incurred due to compromises in software quality.
This study aims to explore existing knowledge in software engineering to gain insights into approaches proposed in research and industry.
arXiv Detail & Related papers (2024-06-17T18:50:37Z) - Deep learning for precipitation nowcasting: A survey from the perspective of time series forecasting [4.5424061912112474]
This paper reviews recent progress in time series precipitation forecasting models using deep learning.
We categorize forecasting models into textitrecursive and textitmultiple strategies based on their approaches to predict future frames.
We evaluate current deep learning-based models for precipitation forecasting on a public benchmark, discuss their limitations and challenges, and present some promising research directions.
arXiv Detail & Related papers (2024-06-07T12:07:09Z) - Selective Temporal Knowledge Graph Reasoning [70.11788354442218]
Temporal Knowledge Graph (TKG) aims to predict future facts based on given historical ones.
Existing TKG reasoning models are unable to abstain from predictions they are uncertain.
We propose an abstention mechanism for TKG reasoning, which helps the existing models make selective, instead of indiscriminate, predictions.
arXiv Detail & Related papers (2024-04-02T06:56:21Z) - Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise for within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z) - Performative Time-Series Forecasting [71.18553214204978]
We formalize performative time-series forecasting (PeTS) from a machine-learning perspective.
We propose a novel approach, Feature Performative-Shifting (FPS), which leverages the concept of delayed response to anticipate distribution shifts.
We conduct comprehensive experiments using multiple time-series models on COVID-19 and traffic forecasting tasks.
arXiv Detail & Related papers (2023-10-09T18:34:29Z) - DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [53.37679435230207]
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility.
Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data.
arXiv Detail & Related papers (2022-09-23T16:13:47Z) - Logic Constraints to Feature Importances [17.234442722611803]
"Black box" nature of AI models is often a limit for a reliable application in high-stakes fields like diagnostic techniques, autonomous guide, etc.
Recent works have shown that an adequate level of interpretability could enforce the more general concept of model trustworthiness.
The basic idea of this paper is to exploit the human prior knowledge of the features' importance for a specific task, in order to coherently aid the phase of the model's fitting.
arXiv Detail & Related papers (2021-10-13T09:28:38Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
'Backfill' phenomenon and its effect on model performance has been barely studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Learning Prediction Intervals for Model Performance [1.433758865948252]
We propose a method to compute prediction intervals for model performance.
We evaluate our approach across a wide range of drift conditions and show substantial improvement over competitive baselines.
arXiv Detail & Related papers (2020-12-15T21:32:03Z) - Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.