Related papers: Large-scale information retrieval in software engineering -- an experience report from industrial application

URL: http://arxiv.org/abs/2308.11750v1
Date: Tue, 22 Aug 2023 19:30:56 GMT
Title: Large-scale information retrieval in software engineering -- an experience report from industrial application
Authors: Michael Unterkalmsteiner, Tony Gorschek, Robert Feldt, Niklas Lavesson
Abstract summary: We describe an engineering task, test case selection, and illustrate our problem analysis and solution discovery process. We analyze, in the context of the studied company, how test case selection is performed and design a series of experiments evaluating the performance of different IR techniques.
Score: 9.62054859086279
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Software Engineering activities are information intensive. Research proposes Information Retrieval (IR) techniques to support engineers in their daily tasks, such as establishing and maintaining traceability links, fault identification, and software maintenance. We describe an engineering task, test case selection, and illustrate our problem analysis and solution discovery process. The objective of the study is to gain an understanding of to what extent IR techniques (one potential solution) can be applied to test case selection and provide decision support in a large-scale, industrial setting. We analyze, in the context of the studied company, how test case selection is performed and design a series of experiments evaluating the performance of different IR techniques. Each experiment provides lessons learned from implementation, execution, and results, feeding to its successor. The three experiments led to the following observations: 1) there is a lack of research on scalable parameter optimization of IR techniques for software engineering problems; 2) scaling IR techniques to industry data is challenging, in particular for latent semantic analysis; 3) the IR context poses constraints on the empirical evaluation of IR techniques, requiring more research on developing valid statistical approaches. We believe that our experiences in conducting a series of IR experiments with industry grade data are valuable for peer researchers so that they can avoid the pitfalls that we have encountered. Furthermore, we identified challenges that need to be addressed in order to bridge the gap between laboratory IR experiments and real applications of IR in the industry.

Related papers

Let the Barbarians In: How AI Can Accelerate Systems Performance Research [80.43506848683633]
We term this iterative cycle of generation, evaluation, and refinement AI-Driven Research for Systems. We demonstrate that ADRS-generated solutions can match or even outperform human state-of-the-art designs.
arXiv Detail & Related papers (2025-12-16T18:51:23Z)
Peer Code Review in Research Software Development: The Research Software Engineer Perspective [0.6385006149689549]
While peer code review can improve software quality, its adoption by research software engineers (RSEs) remains unexplored. This study explores RSE perspectives on peer code review, focusing on their practices, challenges, and potential improvements.
arXiv Detail & Related papers (2025-11-13T20:07:10Z)
Triage in Software Engineering: A Systematic Review of Research and Practice [18.03124877437556]
Triage aims to efficiently prioritize, assign, and assess issues to ensure the reliability of complex environments. The vast amount of heterogeneous data generated by software systems has made effective triage indispensable. This survey provides a comprehensive review of 234 papers from 2004 to the present, offering an in-depth examination of the fundamental concepts, system architecture, and problem statement.
arXiv Detail & Related papers (2025-11-05T02:42:26Z)
A Systematic Literature Review on Detecting Software Vulnerabilities with Large Language Models [2.518519330408713]
Large Language Models (LLMs) in software engineering have sparked interest in their use for software vulnerability detection. The rapid development of this field has resulted in a fragmented research landscape. This fragmentation makes it difficult to obtain a clear overview of the state-of-the-art or compare and categorize studies meaningfully.
arXiv Detail & Related papers (2025-07-30T13:17:16Z)
Does the Tool Matter? Exploring Some Causes of Threats to Validity in Mining Software Repositories [9.539825294372786]
We use two tools to extract and analyse ten large software projects. Despite similar trends, even simple metrics such as the numbers of commits and developers may differ by up to 500%. We find that such substantial differences are often caused by minor technical details.
arXiv Detail & Related papers (2025-01-25T07:42:56Z)
A Call for Critically Rethinking and Reforming Data Analysis in Empirical Software Engineering [5.687882380471718]
Concerns about the correct application of empirical methodologies have existed since the 2006 Dagstuhl seminar on Empirical Software Engineering. We conducted a literature survey of 27,000 empirical studies, using LLMs to classify statistical methodologies as adequate or inadequate. We selected 30 primary studies and held a workshop with 33 ESE experts to assess their ability to identify and resolve statistical issues.
arXiv Detail & Related papers (2025-01-22T09:05:01Z)
On the Impact of 3D Visualization of Repository Metrics in Software Engineering Education [8.599324408542905]
This study aims to explore how VR-based repository metrics visualization can support the teaching of process comprehension. By immersing students in an intuitive environment, this research hypothesizes that VR can foster essential analytical skills.
arXiv Detail & Related papers (2024-12-20T17:06:15Z)
Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning [10.686245134005047]
Causal learning aims to go far beyond conventional machine learning, yet several major challenges remain. We introduce CausalBench, a transparent, fair, and easy-to-use evaluation platform.
arXiv Detail & Related papers (2024-09-12T22:45:10Z)
How Mature is Requirements Engineering for AI-based Systems? A Systematic Mapping Study on Practices, Challenges, and Future Research Directions [5.6818729232602205]
It is unclear if existing RE methods are sufficient or if new ones are needed to address these challenges. Existing RE4AI research focuses mainly on requirements analysis and elicitation, with most practices applied in these areas. We identified requirements specification, explainability, and the gap between machine learning engineers and end-users as the most prevalent challenges.
arXiv Detail & Related papers (2024-09-11T11:28:16Z)
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents [49.74065769505137]
We introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's ability to perform complete cycles of novel scientific discovery. It includes 120 different challenge tasks spanning eight topics each with three levels of difficulty and several parametric variations. We find that strong baseline agents, that perform well in prior published environments, struggle on most DISCOVERYWORLD tasks.
arXiv Detail & Related papers (2024-06-10T20:08:44Z)
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
The Causal Chambers: Real Physical Systems as a Testbed for AI Methodology [10.81691411087626]
In some fields of AI, machine learning and statistics, the validation of new methods and algorithms is often hindered by the scarcity of suitable real-world datasets. We have constructed two devices that allow us to quickly and inexpensively produce large datasets from non-trivial but well-understood physical systems.
arXiv Detail & Related papers (2024-04-17T13:00:52Z)
A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research [23.966640472958105]
This paper presents a systematic literature review of approaches that aim to improve the explainability of AI models within the context of Software Engineering. We aim to (1) summarize the SE tasks where XAI techniques have shown success to date; (2) classify and analyze different XAI techniques; and (3) investigate existing evaluation approaches.
arXiv Detail & Related papers (2024-01-26T03:20:40Z)
Information Retrieval Meets Large Language Models: A Strategic Report from Chinese IR Community [180.28262433004113]
Large Language Models (LLMs) have demonstrated exceptional capabilities in text understanding, generation, and knowledge inference. LLMs and humans form a new technical paradigm that is more powerful for information seeking. To thoroughly discuss the transformative impact of LLMs on IR research, the Chinese IR community conducted a strategic workshop in April 2023.
arXiv Detail & Related papers (2023-07-19T05:23:43Z)
Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice [0.7614628596146599]
We conduct a mapping study, examining 194 experiments with techniques that rely on deep neural networks appearing in 55 papers published in premier software engineering venues. Our study reveals that most of the experiments, including those that have received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings.
arXiv Detail & Related papers (2023-05-19T09:55:48Z)
Wizard of Search Engine: Access to Information Through Conversations with Search Engines [58.53420685514819]
We make efforts to facilitate research on CIS from three aspects. We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS) and response generation (RG). We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS.
arXiv Detail & Related papers (2021-05-18T06:35:36Z)
Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper [50.25428141435537]
Artificial Intelligence for IT Operations (AIOps) is an emerging interdisciplinary field arising in the intersection between machine learning, big data, streaming analytics, and the management of IT operations. The main aim of the AIOPS workshop is to bring together researchers from both academia and industry to present their experiences, results, and work in progress in this field.
arXiv Detail & Related papers (2021-01-15T10:43:10Z)
AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning [72.99415402575886]
Outlier detection is an important data mining task with numerous practical applications. We propose AutoOD, an automated outlier detection framework, which aims to search for an optimal neural network model. Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance.
arXiv Detail & Related papers (2020-06-19T18:57:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.