Large-scale information retrieval in software engineering -- an
experience report from industrial application
- URL: http://arxiv.org/abs/2308.11750v1
- Date: Tue, 22 Aug 2023 19:30:56 GMT
- Authors: Michael Unterkalmsteiner, Tony Gorschek, Robert Feldt, Niklas Lavesson
- Abstract summary: We describe an engineering task, test case selection, and illustrate our problem analysis and solution discovery process.
We analyze, in the context of the studied company, how test case selection is performed and design a series of experiments evaluating the performance of different IR techniques.
- Score: 9.62054859086279
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Software Engineering activities are information intensive. Research proposes
Information Retrieval (IR) techniques to support engineers in their daily
tasks, such as establishing and maintaining traceability links, fault
identification, and software maintenance. We describe an engineering task, test
case selection, and illustrate our problem analysis and solution discovery
process. The objective of the study is to gain an understanding of to what
extent IR techniques (one potential solution) can be applied to test case
selection and provide decision support in a large-scale, industrial setting. We
analyze, in the context of the studied company, how test case selection is
performed and design a series of experiments evaluating the performance of
different IR techniques. Each experiment provides lessons learned from
implementation, execution, and results, feeding to its successor. The three
experiments led to the following observations: 1) there is a lack of research
on scalable parameter optimization of IR techniques for software engineering
problems; 2) scaling IR techniques to industry data is challenging, in
particular for latent semantic analysis; 3) the IR context poses constraints on
the empirical evaluation of IR techniques, requiring more research on
developing valid statistical approaches. We believe that our experiences in
conducting a series of IR experiments with industry grade data are valuable for
peer researchers so that they can avoid the pitfalls that we have encountered.
Furthermore, we identified challenges that need to be addressed in order to
bridge the gap between laboratory IR experiments and real applications of IR in
the industry.
Related papers
- Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning [10.686245134005047]
Causal learning aims to go far beyond conventional machine learning, yet several major challenges remain.
We introduce CausalBench, a transparent, fair, and easy-to-use evaluation platform.
arXiv Detail & Related papers (2024-09-12T22:45:10Z)
- How Mature is Requirements Engineering for AI-based Systems? A Systematic Mapping Study on Practices, Challenges, and Future Research Directions [5.6818729232602205]
It is unclear if existing RE methods are sufficient or if new ones are needed to address these challenges.
Existing RE4AI research focuses mainly on requirements analysis and elicitation, with most practices applied in these areas.
We identified requirements specification, explainability, and the gap between machine learning engineers and end-users as the most prevalent challenges.
arXiv Detail & Related papers (2024-09-11T11:28:16Z)
- DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents [49.74065769505137]
We introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's ability to perform complete cycles of novel scientific discovery.
It includes 120 different challenge tasks spanning eight topics each with three levels of difficulty and several parametric variations.
We find that strong baseline agents that perform well in previously published environments struggle on most DISCOVERYWORLD tasks.
arXiv Detail & Related papers (2024-06-10T20:08:44Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- The Causal Chambers: Real Physical Systems as a Testbed for AI Methodology [10.81691411087626]
In some fields of AI, machine learning and statistics, the validation of new methods and algorithms is often hindered by the scarcity of suitable real-world datasets.
We have constructed two devices that allow us to quickly and inexpensively produce large datasets from non-trivial but well-understood physical systems.
arXiv Detail & Related papers (2024-04-17T13:00:52Z)
- A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research [23.966640472958105]
This paper presents a systematic literature review of approaches that aim to improve the explainability of AI models within the context of Software Engineering.
We aim to (1) summarize the SE tasks where XAI techniques have shown success to date; (2) classify and analyze different XAI techniques; and (3) investigate existing evaluation approaches.
arXiv Detail & Related papers (2024-01-26T03:20:40Z)
- Information Retrieval Meets Large Language Models: A Strategic Report from Chinese IR Community [180.28262433004113]
Large Language Models (LLMs) have demonstrated exceptional capabilities in text understanding, generation, and knowledge inference.
LLMs and humans form a new technical paradigm that is more powerful for information seeking.
To thoroughly discuss the transformative impact of LLMs on IR research, the Chinese IR community conducted a strategic workshop in April 2023.
arXiv Detail & Related papers (2023-07-19T05:23:43Z)
- Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice [0.7614628596146599]
We conduct a mapping study, examining 194 experiments with techniques that rely on deep neural networks appearing in 55 papers published in premier software engineering venues.
Our study reveals that most of the experiments, including those that have received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings.
arXiv Detail & Related papers (2023-05-19T09:55:48Z)
- Wizard of Search Engine: Access to Information Through Conversations with Search Engines [58.53420685514819]
We make efforts to facilitate research on conversational information seeking (CIS) from three aspects.
We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS), and response generation (RG).
We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS.
arXiv Detail & Related papers (2021-05-18T06:35:36Z)
- Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper [50.25428141435537]
Artificial Intelligence for IT Operations (AIOps) is an emerging interdisciplinary field arising in the intersection between machine learning, big data, streaming analytics, and the management of IT operations.
The main aim of the AIOPS workshop is to bring together researchers from both academia and industry to present their experiences, results, and work in progress in this field.
arXiv Detail & Related papers (2021-01-15T10:43:10Z)
- AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning [72.99415402575886]
Outlier detection is an important data mining task with numerous practical applications.
We propose AutoOD, an automated outlier detection framework, which aims to search for an optimal neural network model.
Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance.
arXiv Detail & Related papers (2020-06-19T18:57:51Z)