Can AI Serve as a Substitute for Human Subjects in Software Engineering Research?
- URL: http://arxiv.org/abs/2311.11081v1
- Date: Sat, 18 Nov 2023 14:05:52 GMT
- Title: Can AI Serve as a Substitute for Human Subjects in Software Engineering Research?
- Authors: Marco A. Gerosa, Bianca Trinkenreich, Igor Steinmacher, Anita Sarma
- Abstract summary: This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI).
We explore the potential of AI-generated synthetic text as an alternative source of qualitative data.
We discuss the prospective development of new foundation models aimed at emulating human behavior in observational studies and user evaluations.
- Score: 24.39463126056733
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Research within sociotechnical domains, such as Software Engineering,
fundamentally requires a thorough consideration of the human perspective.
However, traditional qualitative data collection methods suffer from challenges
related to scale, labor intensity, and the increasing difficulty of participant
recruitment. This vision paper proposes a novel approach to qualitative data
collection in software engineering research by harnessing the capabilities of
artificial intelligence (AI), especially large language models (LLMs) like
ChatGPT. We explore the potential of AI-generated synthetic text as an
alternative source of qualitative data, by discussing how LLMs can replicate
human responses and behaviors in research settings. We examine the application
of AI in automating data collection across various methodologies, including
persona-based prompting for interviews, multi-persona dialogue for focus
groups, and mega-persona responses for surveys. Additionally, we discuss the
prospective development of new foundation models aimed at emulating human
behavior in observational studies and user evaluations. By simulating human
interaction and feedback, these AI models could offer scalable and efficient
means of data generation, while providing insights into human attitudes,
experiences, and performance. We discuss several open problems and research
opportunities to implement this vision and conclude that while AI could augment
aspects of data gathering in software engineering research, it cannot replace
the nuanced, empathetic understanding inherent in human subjects in some cases,
and an integrated approach where both AI and human-generated data coexist will
likely yield the most effective outcomes.
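The persona-based prompting approach described in the abstract can be sketched in code. The following is a minimal, hypothetical illustration of how a synthetic participant persona might be composed into an interview prompt for an LLM; the persona fields, prompt wording, and helper names are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of persona-based prompting for synthetic interviews:
# an LLM is given a participant persona and asked to answer an interview
# question in character. All names and wording here are illustrative.
from dataclasses import dataclass

@dataclass
class Persona:
    role: str        # e.g. "senior backend developer"
    experience: int  # years of professional experience
    context: str     # project or organizational background

def build_interview_prompt(persona: Persona, question: str) -> str:
    """Compose a single prompt asking the model to answer as the persona."""
    return (
        f"You are a {persona.role} with {persona.experience} years of "
        f"experience. Context: {persona.context}\n"
        "Answer the interview question below in the first person, "
        "as this participant would.\n"
        f"Question: {question}"
    )

persona = Persona(
    role="senior backend developer",
    experience=12,
    context="maintains a large open-source project with frequent newcomer contributions",
)
prompt = build_interview_prompt(
    persona, "What barriers do newcomers face when making their first contribution?"
)
print(prompt)
```

The resulting prompt string would then be sent to an LLM API; multi-persona dialogue for focus groups could, in principle, iterate this pattern over several personas responding to one another.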
Related papers
- Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
- AI and Generative AI for Research Discovery and Summarization [3.8601741392210434]
AI and generative AI tools have burst onto the scene this year, creating incredible opportunities to increase work productivity and improve our lives.
One area that these tools can make a substantial impact is in research discovery and summarization.
We review the developments in AI and generative AI for research discovery and summarization, and propose directions where these types of tools are likely to head in the future.
arXiv Detail & Related papers (2024-01-08T18:42:55Z)
- Generative AI in Writing Research Papers: A New Type of Algorithmic Bias and Uncertainty in Scholarly Work [0.38850145898707145]
Large language models (LLMs) and generative AI tools present challenges in identifying and addressing biases.
generative AI tools are susceptible to goal misgeneralization, hallucinations, and adversarial attacks such as red teaming prompts.
We find that incorporating generative AI in the process of writing research manuscripts introduces a new type of context-induced algorithmic bias.
arXiv Detail & Related papers (2023-12-04T04:05:04Z)
- Where Are We So Far? Understanding Data Storytelling Tools from the Perspective of Human-AI Collaboration [39.96202614397779]
Recent research has explored the potential for artificial intelligence to support and augment humans in data storytelling.
However, a systematic review of data storytelling tools from the perspective of human-AI collaboration is still lacking.
This paper investigated existing tools with a framework from two perspectives: the stages in the storytelling workflow where a tool serves, including analysis, planning, implementation, and communication, and the roles of humans and AI.
arXiv Detail & Related papers (2023-09-27T15:30:50Z)
- The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole.
AI has contributed less to fundamental science, in part because large, high-quality data sets for scientific practice and model discovery are more difficult to access.
Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
- The ethical ambiguity of AI data enrichment: Measuring gaps in research ethics norms and practices [2.28438857884398]
This study explores how, and to what extent, comparable research ethics requirements and norms have developed for AI research and data enrichment.
Leading AI venues have begun to establish protocols for human data collection, but these are inconsistently followed by authors.
arXiv Detail & Related papers (2023-06-01T16:12:55Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Methodological reflections for AI alignment research using human feedback [0.0]
AI alignment aims to investigate whether AI technologies align with human interests and values and function in a safe and ethical manner.
LLMs have the potential to exhibit unintended behavior due to their ability to learn and adapt in ways that are difficult to predict.
arXiv Detail & Related papers (2022-12-22T14:27:33Z)
- DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
- Human-Robot Collaboration and Machine Learning: A Systematic Review of Recent Research [69.48907856390834]
Human-robot collaboration (HRC) studies the interaction between humans and robots.
This paper proposes a thorough literature review of the use of machine learning techniques in the context of HRC.
arXiv Detail & Related papers (2021-10-14T15:14:33Z)
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.