Related papers: Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models

Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models

URL: http://arxiv.org/abs/2310.11113v2
Date: Thu, 19 Oct 2023 13:16:38 GMT
Title: Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models
Authors: Ting Zhang and Ivana Clairine Irsan and Ferdian Thung and David Lo
Abstract summary: We study the performance of three open-source bLLMs in both zero-shot and few-shot scenarios. Our experimental findings demonstrate that bLLMs exhibit state-of-the-art performance on datasets marked by limited training data and imbalanced distributions.
Score: 12.440597259254286
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Software development is an inherently collaborative process, where various stakeholders frequently express their opinions and emotions across diverse platforms. Recognizing the sentiments conveyed in these interactions is crucial for the effective development and ongoing maintenance of software systems. Over the years, many tools have been proposed to aid in sentiment analysis, but accurately identifying the sentiments expressed in software engineering datasets remains challenging. Although fine-tuned smaller large language models (sLLMs) have shown potential in handling software engineering tasks, they struggle with the shortage of labeled data. With the emergence of bigger large language models (bLLMs), it is pertinent to investigate whether they can handle this challenge in the context of sentiment analysis for software engineering. In this work, we undertake a comprehensive empirical study using five established datasets. We assess the performance of three open-source bLLMs in both zero-shot and few-shot scenarios. Additionally, we compare them with fine-tuned sLLMs. Our experimental findings demonstrate that bLLMs exhibit state-of-the-art performance on datasets marked by limited training data and imbalanced distributions. bLLMs can also achieve excellent performance under a zero-shot setting. However, when ample training data is available or the dataset exhibits a more balanced distribution, fine-tuned sLLMs can still achieve superior results.

Related papers

Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning [64.5243480989869]
Instruction Fine-Tuning (IFT) significantly enhances the zero-shot capabilities of pretrained Large Language Models (LLMs) This paper investigates how coding data impact LLMs' reasoning capacities during the IFT stage.
arXiv Detail & Related papers (2024-05-30T23:20:25Z)
COLT: Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
We propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT. COLT captures semantic similarities between user queries and tool descriptions. It also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
AI Competitions and Benchmarks: Dataset Development [42.164845505628506]
This chapter provides a comprehensive overview of established methodological tools, enriched by our practical experience. We develop the tasks involved in dataset development and offer insights into their effective management. Then, we provide more details about the implementation process which includes data collection, transformation, and quality evaluation.
arXiv Detail & Related papers (2024-04-15T12:01:42Z)
DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries [0.0]
We evaluate OpenAI's GPT-3.5 as a "Language Data Scientist" (LDS) The model was tested on a diverse set of benchmark datasets to evaluate its performance across multiple standards.
arXiv Detail & Related papers (2024-03-29T22:59:34Z)
Data Interpreter: An LLM Agent For Data Science [43.99482533437711]
The Data Interpreter is a solution designed to solve with code. It emphasizes three pivotal techniques to augment problem-solving in data science. It showed a 26% increase in the MATH dataset and a remarkable 112% improvement in open-ended tasks.
arXiv Detail & Related papers (2024-02-28T19:49:55Z)
LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection. We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks. Our method goes beyond surface form cues to identify data that the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z)
Modelling Concurrency Bugs Using Machine Learning [0.0]
This project aims to compare both common and recent machine learning approaches. We define a synthetic dataset that we generate with the scope of simulating real-life (concurrent) programs. We formulate hypotheses about fundamental limits of various machine learning model types.
arXiv Detail & Related papers (2023-05-08T17:30:24Z)
TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets. We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
Latent Variable Method Demonstrator -- Software for Understanding Multivariate Data Analytics Algorithms [0.0]
This article describes interactive software - the Latent Variable Demonstrator (LAVADE) - for teaching, learning, and understanding latent variable methods. Users can interactively compare latent variable methods such as Partial Least Squares (PLS), and Principal Component Regression (PCR) with other regression methods. The software contains a data generation method and three chemical process datasets, allowing for comparing results of datasets with different levels of complexity.
arXiv Detail & Related papers (2022-05-17T07:02:41Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious. We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.