"Estimating software project effort using analogies": Reflections after 28 years
- URL: http://arxiv.org/abs/2501.14582v2
- Date: Thu, 30 Jan 2025 16:44:38 GMT
- Title: "Estimating software project effort using analogies": Reflections after 28 years
- Authors: Martin Shepperd
- Abstract summary: The paper examines (i) what was achieved, (ii) what has endured and (iii) what could have been done differently with the benefit of retrospection. The original study emphasised empirical validation with benchmarks, out-of-sample testing and data/tool sharing.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Background: This invited paper is the result of an invitation to write a retrospective article on a "TSE most influential paper" as part of the journal's 50th anniversary. Objective: To reflect on the progress of software engineering prediction research using the lens of a selected, highly cited research paper and 28 years of hindsight. Methods: The paper examines (i) what was achieved, (ii) what has endured and (iii) what could have been done differently with the benefit of retrospection. Conclusions: While many specifics of software project effort prediction have evolved, key methodological issues remain relevant. The original study emphasised empirical validation with benchmarks, out-of-sample testing and data/tool sharing. Four areas for improvement are identified: (i) stronger commitment to Open Science principles, (ii) focus on effect sizes and confidence intervals, (iii) reporting variability alongside typical results and (iv) more rigorous examination of threats to validity.
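To ground the methodological points, here is a minimal sketch of analogy-based effort estimation (k-nearest neighbours over project features) with leave-one-out, out-of-sample testing in the spirit the abstract describes. The function names, feature scaling, and choice of k are illustrative assumptions, not the original study's ANGEL tool or datasets.

```python
# Illustrative sketch only: predict a new project's effort from its
# most similar completed projects, evaluated strictly out-of-sample.
import numpy as np

def estimate_by_analogy(train_X, train_effort, query, k=3):
    """Predict effort as the mean effort of the k most similar projects."""
    lo, hi = train_X.min(axis=0), train_X.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)        # guard constant features
    dists = np.linalg.norm((train_X - query) / scale, axis=1)  # scaled Euclidean
    nearest = np.argsort(dists)[:k]
    return train_effort[nearest].mean()

def leave_one_out_mae(X, effort, k=3):
    """Mean absolute error with each project held out in turn."""
    errs = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        pred = estimate_by_analogy(X[mask], effort[mask], X[i], k=k)
        errs.append(abs(pred - effort[i]))
    return float(np.mean(errs))
```

Reporting the full distribution of the per-project errors in `errs`, rather than only their mean, is the kind of variability reporting the abstract argues for.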
Related papers
- (Mis)Fitting: A Survey of Scaling Laws [52.598843243928584]
We discuss discrepancies in the conclusions that several prior works reach on questions such as the optimal token-to-parameter ratio.
We survey over 50 papers that study scaling trends.
We propose a checklist for authors to consider while contributing to scaling law research.
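As background, the core computation in most scaling-law papers is a power-law fit such as L(N) = a * N^(-b); the sketch below recovers a and b by least squares in log-log space. The data points are synthetic placeholders, not results from any surveyed paper.

```python
# Hedged illustration: fit L(N) = a * N^(-b) via linear regression in
# log-log space. All numbers below are synthetic.
import numpy as np

N = np.array([1e6, 1e7, 1e8, 1e9])   # model sizes (synthetic)
L = np.array([4.1, 3.2, 2.6, 2.1])   # observed losses (synthetic)

# log L = log a - b * log N, so one linear fit yields both constants.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
a, b = np.exp(intercept), -slope
print(f"fitted law: L(N) = {a:.2f} * N^(-{b:.3f})")
```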
arXiv Detail & Related papers (2025-02-26T09:27:54Z) - Qualitative Research Methods in Software Engineering: Past, Present, and Future [15.223983256335426]
The paper "Qualitative Methods in Empirical Studies of Software Engineering" was published in TSE in 1999.
It has been chosen as one of the most influential papers from the third decade of TSE's 50-year history.
arXiv Detail & Related papers (2025-02-11T03:25:58Z) - A Call for Critically Rethinking and Reforming Data Analysis in Empirical Software Engineering [5.687882380471718]
Concerns about the correct application of empirical methodologies have existed since the 2006 Dagstuhl seminar on Empirical Software Engineering.
We conducted a literature survey of 27,000 empirical studies, using LLMs to classify statistical methodologies as adequate or inadequate.
We selected 30 primary studies and held a workshop with 33 ESE experts to assess their ability to identify and resolve statistical issues.
arXiv Detail & Related papers (2025-01-22T09:05:01Z) - RelevAI-Reviewer: A Benchmark on AI Reviewers for Survey Paper Relevance [0.8089605035945486]
We propose RelevAI-Reviewer, an automatic system that conceptualizes the task of survey paper review as a classification problem.
We introduce a novel dataset comprising 25,164 instances. Each instance contains one prompt and four candidate papers, each varying in relevance to the prompt.
We develop a machine learning (ML) model capable of determining the relevance of each paper and identifying the most pertinent one.
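A minimal baseline for this formulation might score each candidate against the prompt with TF-IDF cosine similarity, as sketched below; this is an assumed toy model, not the RelevAI-Reviewer system itself.

```python
# Toy relevance baseline (not the authors' model): given one prompt and
# four candidate papers, return the index of the most similar candidate.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def most_relevant(prompt: str, candidates: list[str]) -> int:
    vec = TfidfVectorizer().fit([prompt] + candidates)
    scores = cosine_similarity(vec.transform([prompt]),
                               vec.transform(candidates))[0]
    return int(scores.argmax())
```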
arXiv Detail & Related papers (2024-06-13T06:42:32Z) - Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data [89.2410799619405]
We introduce the Quantitative Reasoning with Data benchmark to evaluate Large Language Models' capability in statistical and causal reasoning with real-world data.
The benchmark comprises a dataset of 411 questions accompanied by data sheets from textbooks, online learning materials, and academic papers.
To compare models' quantitative reasoning abilities on data and text, we enrich the benchmark with an auxiliary set of 290 text-only questions, namely QRText.
arXiv Detail & Related papers (2024-02-27T16:15:03Z) - Chain-of-Factors Paper-Reviewer Matching [32.86512592730291]
We propose a unified model for paper-reviewer matching that jointly considers semantic, topic, and citation factors.
We demonstrate the effectiveness of our proposed Chain-of-Factors model in comparison with state-of-the-art paper-reviewer matching methods and scientific pre-trained language models.
arXiv Detail & Related papers (2023-10-23T01:29:18Z) - Deep Learning for Agile Effort Estimation: Have We Solved the Problem Yet? [7.808390209137859]
We perform a close replication and extension of a seminal work proposing the use of Deep Learning for agile effort estimation.
We benchmark Deep-SE against three baseline techniques and a previously proposed method to estimate agile software project development effort.
Using more data allows us to strengthen our confidence in the results and further mitigate the threat to the external validity of the study.
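Benchmarking here typically reduces to comparing estimator errors on held-out items; the sketch below contrasts a model's mean absolute error (MAE) with a naive median-effort baseline. All values are placeholders, not data from the replication.

```python
# Placeholder benchmarking sketch: model MAE vs. a median-effort baseline.
import numpy as np

def mae(actual, predicted):
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))

actual = np.array([12.0, 30.0, 7.5, 21.0])          # true effort (made up)
model_pred = np.array([10.0, 27.0, 9.0, 24.0])      # a Deep-SE-style estimate
baseline = np.full_like(actual, np.median(actual))  # naive baseline

print("model MAE:   ", mae(actual, model_pred))
print("baseline MAE:", mae(actual, baseline))
```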
arXiv Detail & Related papers (2022-01-14T11:38:51Z) - Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast final decision making in peer review as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
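For readers new to preference learning, the sketch below fits a basic Bradley-Terry model to pairwise "paper A ranked above paper B" judgments using the standard MM updates; the paper's actual framework is richer and also exploits review texts, so treat this as an assumed minimal example.

```python
# Minimal Bradley-Terry fit over pairwise preferences (MM algorithm).
import numpy as np

def bradley_terry(n_items, comparisons, iters=200):
    """comparisons: iterable of (winner_idx, loser_idx) pairs."""
    wins = np.zeros((n_items, n_items))
    for w, l in comparisons:
        wins[w, l] += 1
    p = np.ones(n_items)                              # latent strengths
    for _ in range(iters):
        for i in range(n_items):
            games = wins[i] + wins[:, i]              # matches involving i
            denom = (games / (p[i] + p + 1e-12)).sum()  # j == i term is zero
            if denom > 0:
                p[i] = wins[i].sum() / denom
        p /= p.sum()                                  # normalise each sweep
    return p

# Example: paper 0 beats 1 twice, 1 beats 2 once, 0 beats 2 once.
print(bradley_terry(3, [(0, 1), (0, 1), (1, 2), (0, 2)]))
```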
arXiv Detail & Related papers (2021-09-02T19:41:47Z) - A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z) - Sentiment Analysis Based on Deep Learning: A Comparative Study [69.09570726777817]
The study of public opinion can provide us with valuable information.
The efficiency and accuracy of sentiment analysis are hindered by the challenges encountered in natural language processing.
This paper reviews the latest studies that have employed deep learning to solve sentiment analysis problems.
arXiv Detail & Related papers (2020-06-05T16:28:10Z) - Recognizing Families In the Wild: White Paper for the 4th Edition Data Challenge [91.55319616114943]
This paper summarizes the supported tasks (i.e., kinship verification, tri-subject verification, and search & retrieval of missing children) in the Recognizing Families In the Wild (RFIW) evaluation.
The purpose of this paper is to describe the 2020 RFIW challenge, end-to-end, along with forecasts in promising future directions.
arXiv Detail & Related papers (2020-02-15T02:22:42Z)