Related papers: Apples, Oranges, and Software Engineering: Study Selection Challenges for Secondary Research on Latent Variables

URL: http://arxiv.org/abs/2402.08706v1
Date: Tue, 13 Feb 2024 17:32:17 GMT
Title: Apples, Oranges, and Software Engineering: Study Selection Challenges for Secondary Research on Latent Variables
Authors: Marvin Wyrich and Marvin Muñoz Barón and Justus Bogner
Abstract summary: The inability to measure abstract concepts directly poses a challenge for secondary studies in software engineering. Standardized measurement instruments are rarely available, and even if they are, many researchers do not use them or do not even provide a definition for the studied concept. SE researchers conducting secondary studies therefore have to decide a) which primary studies intended to measure the same construct, and b) how to compare and aggregate vastly different measurements for the same construct.
Score: 8.612556181934291
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Software engineering (SE) is full of abstract concepts that are crucial for both researchers and practitioners, such as programming experience, team productivity, code comprehension, and system security. Secondary studies aimed at summarizing research on the influences and consequences of such concepts would therefore be of great value. However, the inability to measure abstract concepts directly poses a challenge for secondary studies: primary studies in SE can operationalize such concepts in many ways. Standardized measurement instruments are rarely available, and even if they are, many researchers do not use them or do not even provide a definition for the studied concept. SE researchers conducting secondary studies therefore have to decide a) which primary studies intended to measure the same construct, and b) how to compare and aggregate vastly different measurements for the same construct. In this experience report, we discuss the challenge of study selection in SE secondary research on latent variables. We report on two instances where we found it particularly challenging to decide which primary studies should be included for comparison and synthesis, so as not to end up comparing apples with oranges. Our report aims to spark a conversation about developing strategies to address this issue systematically and pave the way for more efficient and rigorous secondary studies in software engineering.

Related papers

Are Information Retrieval Approaches Good at Harmonising Longitudinal Survey Questions in Social Science? [2.769064123193329]
We present a new information retrieval task to identify concept equivalence across question and response options. This paper investigates multiple unsupervised approaches on a survey dataset spanning 1946-2020. We show that IR-specialised neural models achieve the highest overall performance with other approaches performing comparably.
arXiv Detail & Related papers (2025-04-29T12:00:33Z)
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation [58.064940977804596]
A plethora of new AI models and tools has been proposed, promising to empower researchers and academics worldwide to conduct their research more effectively and efficiently. Ethical concerns regarding shortcomings of these tools and potential for misuse take a particularly prominent place in our discussion.
arXiv Detail & Related papers (2025-02-07T18:26:45Z)
Enriching Social Science Research via Survey Item Linking [11.902701975866595]
We model a task called Survey Item Linking (SIL) in two stages: mention detection and entity disambiguation. To this end, we create a high-quality and richly annotated dataset consisting of 20,454 English and German sentences. We demonstrate that the task is feasible, but observe that errors propagate from the first stage, leading to a lower overall task performance.
arXiv Detail & Related papers (2024-12-20T12:14:33Z)
Software analytics for software engineering: A tertiary review [2.7386485828693576]
We identify five secondary studies on the use of software analytics (SA) for software engineering (SE). Despite the overlapping objectives and search time frames of these secondary studies, there is negligible overlap of primary studies between them. We conclude that an overview of the literature identified by these secondary studies would be useful in providing a more comprehensive overview of the topic.
arXiv Detail & Related papers (2024-10-08T08:28:03Z)
Teaching Software Metrology: The Science of Measurement for Software Engineering [10.23712090082156]
This chapter reviews key concepts in the science of measurement and applies them to software engineering research. A series of exercises for applying important measurement concepts to the reader's research are included.
arXiv Detail & Related papers (2024-06-20T16:57:23Z)
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents [49.74065769505137]
We introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's ability to perform complete cycles of novel scientific discovery. It includes 120 different challenge tasks spanning eight topics, each with three levels of difficulty and several parametric variations. We find that strong baseline agents that perform well in prior published environments struggle on most DISCOVERYWORLD tasks.
arXiv Detail & Related papers (2024-06-10T20:08:44Z)
Making Software Development More Diverse and Inclusive: Key Themes, Challenges, and Future Directions [50.545824691484796]
We identify six themes around the challenges and opportunities to improve Software Developer Diversity and Inclusion (SDDI). We identify benefits, harms, and future research directions for the four main themes. We discuss the remaining two themes, Artificial Intelligence & SDDI and AI & Computer Science education, which have a cross-cutting effect on the other themes.
arXiv Detail & Related papers (2024-04-10T16:18:11Z)
Twin Papers: A Simple Framework of Causal Inference for Citations via Coupling [40.60905158071766]
The main difficulty in investigating the effects is that we need to know counterfactual results, which are not available in reality. The proposed framework regards a pair of papers that cite each other as twins. We investigate twin papers that adopted different decisions, observe the progress of the research impact brought by these studies, and estimate the effect of decisions by the difference.
arXiv Detail & Related papers (2022-08-21T10:42:33Z)
Active Exploration via Experiment Design in Markov Chains [86.41407938210193]
A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest. We propose an algorithm that efficiently selects policies whose measurement allocation converges to the optimal one. In addition to our theoretical analysis, we showcase our framework on applications in ecological surveillance and pharmacology.
arXiv Detail & Related papers (2022-06-29T00:04:40Z)
Improving Students' Academic Performance with AI and Semantic Technologies [0.0]
The aim of this study is to predict students' performance using marks from the previous semester, to model a course representation in a semantic way, and to identify the prerequisite between two similar courses. The outcomes of this study can be summarized as: (i) a result that improves on Manrique's work by 2.5% in terms of accuracy in dropout prediction; (ii) uncovering the similarity between courses based on course descriptions; (iii) identifying the prerequisites among three compulsory courses of the School of Computing at ANU.
arXiv Detail & Related papers (2022-05-02T06:11:24Z)
Wizard of Search Engine: Access to Information Through Conversations with Search Engines [58.53420685514819]
We make efforts to facilitate research on CIS from three aspects. We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS) and response generation (RG). We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS.
arXiv Detail & Related papers (2021-05-18T06:35:36Z)
AR-LSAT: Investigating Analytical Reasoning of Text [57.1542673852013]
We study the challenge of analytical reasoning of text and introduce a new dataset consisting of questions from the Law School Admission Test from 1991 to 2016. We analyze what knowledge understanding and reasoning abilities are required to do well on this task.
arXiv Detail & Related papers (2021-04-14T02:53:32Z)
Phase Transition Behavior in Knowledge Compilation [52.68422776053012]
We study the size and compile-time behaviour for random k-CNF formulas in the context of knowledge compilation. Our work is similar in spirit to the early work in the CSP community on phase transition behavior in SAT/CSP.
arXiv Detail & Related papers (2020-07-20T18:36:27Z)
Secondary Studies in the Academic Context: A Systematic Mapping and Survey [4.122293798697967]
The main goal of this study is to provide an overview of the use of secondary studies in an academic context. First, we conducted a systematic mapping (SM) to identify the available and relevant studies on the use of secondary studies as a research methodology for conducting SE research projects. Second, a survey was performed with 64 SE researchers to identify their perception of the value of performing secondary studies to support their research projects.
arXiv Detail & Related papers (2020-07-10T20:01:26Z)
A Survey on Causal Inference [64.45536158710014]
Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics. Various causal effect estimation methods for observational data have sprung up.
arXiv Detail & Related papers (2020-02-05T21:35:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.