Uncovering the Causes of Emotions in Software Developer Communication
Using Zero-shot LLMs
- URL: http://arxiv.org/abs/2312.09731v1
- Date: Fri, 15 Dec 2023 12:16:16 GMT
- Title: Uncovering the Causes of Emotions in Software Developer Communication
Using Zero-shot LLMs
- Authors: Mia Mohammad Imran, Preetha Chatterjee, Kostadin Damevski
- Abstract summary: Automatically identifying the causes of developers' emotions requires large-scale, software-engineering-specific datasets to train accurate machine learning models.
This paper explores zero-shot LLMs that are pre-trained on massive datasets but without being fine-tuned specifically for the task of detecting emotion causes in software engineering.
- Score: 9.298552727430485
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding and identifying the causes behind developers' emotions (e.g.,
Frustration caused by 'delays in merging pull requests') can be crucial towards
finding solutions to problems and fostering collaboration in open-source
communities. Effectively identifying such information in the high volume of
communications across the different project channels, such as chats, emails,
and issue comments, requires automated recognition of emotions and their
causes. To enable this automation, large-scale software engineering-specific
datasets that can be used to train accurate machine learning models are
required. However, such datasets are expensive to create with the variety and
informal nature of software projects' communication channels.
In this paper, we explore zero-shot LLMs that are pre-trained on massive
datasets but without being fine-tuned specifically for the task of detecting
emotion causes in software engineering: ChatGPT, GPT-4, and flan-alpaca. Our
evaluation indicates that these recently available models can identify emotion
categories when given detailed emotions, although they perform worse than the
top-rated models. For emotion cause identification, our results indicate that
zero-shot LLMs are effective at recognizing the correct emotion cause with a
BLEU-2 score of 0.598. To highlight the potential use of these techniques, we
conduct a case study of the causes of Frustration in the last year of
development of a popular open-source project, revealing several interesting
insights.
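The BLEU-2 figure reported above measures unigram and bigram overlap between a model's predicted cause phrase and the annotated gold cause. The following is a minimal, self-contained sketch of sentence-level BLEU-2; the function and the example phrases (drawn from the abstract's frustration example) are illustrative, not the paper's actual implementation:

```python
from collections import Counter
from math import exp, log

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu2(candidate, reference):
    """Sentence-level BLEU-2: geometric mean of modified 1-gram and
    2-gram precision, multiplied by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in (1, 2):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(len(cand) - n + 1, 0)
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else exp(1 - len(ref) / max(len(cand), 1))
    return bp * exp(sum(log(p) for p in precisions) / 2)

# Comparing a hypothetical model-predicted cause against a gold annotation:
gold = "delays in merging pull requests"
pred = "delays in merging the pull request"
score = bleu2(pred, gold)
```

A score of 1.0 means an exact match; partial phrase overlap, as in the example above, yields an intermediate score.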
Related papers
- Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification [56.974545305472304]
Most datasets for sentiment analysis lack the context in which an opinion was expressed, which is often crucial for emotion understanding, and are mainly limited to a few emotion categories.
We design an LLM-based data synthesis pipeline and leverage a large model, Mistral-7b, for the generation of training examples for more accessible, lightweight BERT-type encoder models.
We show that Emo Pillars models are highly adaptive to new domains when tuned to specific tasks such as GoEmotions, ISEAR, IEMOCAP, and EmoContext, reaching SOTA performance on the first three.
arXiv Detail & Related papers (2025-04-23T16:23:17Z) - Emotional Strain and Frustration in LLM Interactions in Software Engineering [0.0]
Large Language Models (LLMs) are increasingly integrated into various daily tasks in Software Engineering.
Frustration can negatively impact engineers' productivity and well-being if it escalates into stress and burnout.
arXiv Detail & Related papers (2025-04-14T09:55:47Z) - Emotion Detection in Reddit: Comparative Study of Machine Learning and Deep Learning Techniques [0.0]
This study concentrates on text-based emotion detection by leveraging the GoEmotions dataset.
We employed a range of models for this task, including six machine learning models, three ensemble models, and a Long Short-Term Memory (LSTM) model.
Results indicate that the Stacking classifier outperforms other models in accuracy and performance.
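A stacking ensemble of the kind this paper finds strongest can be sketched with scikit-learn's StackingClassifier, in which a meta-learner combines the predictions of several base classifiers. The toy corpus, base learners, and hyperparameters below are placeholders for illustration, not the study's actual setup:

```python
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny corpus standing in for GoEmotions-style (text, emotion) pairs.
texts = ["i am so happy today", "this makes me furious",
         "what a wonderful surprise", "i hate waiting in line",
         "feeling great about this", "so angry right now"]
labels = ["joy", "anger", "joy", "anger", "joy", "anger"]

# TF-IDF features feed three base learners; a logistic-regression
# meta-learner combines their cross-validated predictions (the
# "stacking" step).
stack = make_pipeline(
    TfidfVectorizer(),
    StackingClassifier(
        estimators=[("nb", MultinomialNB()),
                    ("rf", RandomForestClassifier(n_estimators=50,
                                                  random_state=0)),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=2,
    ),
)
stack.fit(texts, labels)
pred = stack.predict(["i am thrilled"])[0]
```

Stacking tends to outperform any single base model because the meta-learner weighs each base classifier by how reliable its predictions are on held-out folds.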
arXiv Detail & Related papers (2024-11-15T16:28:25Z) - AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models [95.09157454599605]
Large Language Models (LLMs) are becoming increasingly powerful, but they still exhibit significant but subtle weaknesses.
Traditional benchmarking approaches cannot thoroughly pinpoint specific model deficiencies.
We introduce a unified framework, AutoDetect, to automatically expose weaknesses in LLMs across various tasks.
arXiv Detail & Related papers (2024-06-24T15:16:45Z) - Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning [55.127202990679976]
We introduce the MERR dataset, containing 28,618 coarse-grained and 4,487 fine-grained annotated samples across diverse emotional categories.
This dataset enables models to learn from varied scenarios and generalize to real-world applications.
We propose Emotion-LLaMA, a model that seamlessly integrates audio, visual, and textual inputs through emotion-specific encoders.
arXiv Detail & Related papers (2024-06-17T03:01:22Z) - Modeling User Preferences via Brain-Computer Interfacing [54.3727087164445]
We use Brain-Computer Interfacing technology to infer users' preferences, their attentional correlates towards visual content, and their associations with affective experience.
We link these to relevant applications, such as information retrieval, personalized steering of generative models, and crowdsourcing population estimates of affective experiences.
arXiv Detail & Related papers (2024-05-15T20:41:46Z) - GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing [74.68232970965595]
Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos.
This paper assesses the application of MLLMs on 5 crucial abilities for affective computing, spanning visual affective tasks and reasoning tasks.
arXiv Detail & Related papers (2024-03-09T13:56:25Z) - EmoBench: Evaluating the Emotional Intelligence of Large Language Models [73.60839120040887]
EmoBench is a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine Emotional Intelligence (EI).
EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding.
Our findings reveal a considerable gap between the EI of existing Large Language Models and the average human, highlighting a promising direction for future research.
arXiv Detail & Related papers (2024-02-19T11:48:09Z) - Towards Understanding Emotions in Informal Developer Interactions: A Gitter Chat Study [10.372820248341746]
We present a dataset of developer chat messages manually annotated with a wide range of emotion labels (and sub-labels).
We investigate the unique signals of emotions specific to chats and distinguish them from other forms of software communication.
Our findings suggest that chats have fewer expressions of Approval and Fear but more expressions of Curiosity compared to GitHub comments.
arXiv Detail & Related papers (2023-11-08T15:29:33Z) - Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models [11.388023221294686]
This study investigates bigger large language models (bLLMs) in addressing the labeled data shortage that hampers fine-tuned smaller large language models (sLLMs) in software engineering tasks.
We conduct a comprehensive empirical study using five established datasets to assess three open-source bLLMs in zero-shot and few-shot scenarios.
Our experimental findings demonstrate that bLLMs exhibit state-of-the-art performance on datasets marked by limited training data and imbalanced distributions.
arXiv Detail & Related papers (2023-10-17T09:53:03Z) - Implicit Design Choices and Their Impact on Emotion Recognition Model Development and Evaluation [5.534160116442057]
The subjectivity of emotions poses significant challenges in developing accurate and robust computational models.
This thesis examines critical facets of emotion recognition, beginning with the collection of diverse datasets.
To handle the challenge of non-representative training data, this work collects the Multimodal Stressed Emotion dataset.
arXiv Detail & Related papers (2023-09-06T02:45:42Z) - Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench [83.41621219298489]
We evaluate Large Language Models' (LLMs) anthropomorphic capabilities using the emotion appraisal theory from psychology.
We collect a dataset containing over 400 situations that have proven effective in eliciting the eight emotions central to our study.
We conduct a human evaluation involving more than 1,200 subjects worldwide.
arXiv Detail & Related papers (2023-08-07T15:18:30Z) - LDEB -- Label Digitization with Emotion Binarization and Machine Learning for Emotion Recognition in Conversational Dialogues [0.0]
Emotion recognition in conversations (ERC) is vital to the advancements of conversational AI and its applications.
Conversational dialogues present a unique problem: each dialogue depicts nested emotions that entangle the association between the emotional feature descriptors and the emotion type (or label).
We proposed a novel approach called Label Digitization with Emotion Binarization (LDEB) that disentangles the twists by utilizing the text normalization and 7-bit digital encoding techniques.
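The LDEB paper's exact encoding is not reproduced here, but the general idea of packing binarized emotion labels into a 7-bit code can be sketched as follows; the emotion inventory and bit ordering below are assumptions for illustration only:

```python
# Hypothetical 7-emotion inventory; the paper's actual label set and
# bit order may differ.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise",
            "neutral"]

def encode(active):
    """Pack a set of binarized emotion labels into a 7-bit integer,
    one bit per emotion in EMOTIONS order."""
    return sum(1 << i for i, e in enumerate(EMOTIONS) if e in active)

def decode(code):
    """Recover the set of active emotion labels from a 7-bit code."""
    return {e for i, e in enumerate(EMOTIONS) if code >> i & 1}

# A dialogue annotated with nested "joy" and "surprise" collapses to a
# single digitized label, which a classifier can then predict directly.
code = encode({"joy", "surprise"})
roundtrip = decode(code)
```

Digitizing label combinations this way turns an entangled multi-label problem into a single-label one over at most 2^7 = 128 classes.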
arXiv Detail & Related papers (2023-06-03T20:37:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.