Related papers: Demystifying Issues, Causes and Solutions in LLM Open-Source Projects

Demystifying Issues, Causes and Solutions in LLM Open-Source Projects

URL: http://arxiv.org/abs/2409.16559v1
Date: Wed, 25 Sep 2024 02:16:45 GMT
Title: Demystifying Issues, Causes and Solutions in LLM Open-Source Projects
Authors: Yangxiao Cai, Peng Liang, Yifei Wang, Zengyang Li, Mojtaba Shahin,
Abstract summary: We conducted an empirical study to understand the issues that practitioners encounter when developing and using LLM open-source software. We collected all closed issues from 15 LLM open-source projects and labelled issues that met our requirements. Our study results show that Model Issue is the most common issue faced by practitioners.
Score: 15.881912703104376
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the advancements of Large Language Models (LLMs), an increasing number of open-source software projects are using LLMs as their core functional component. Although research and practice on LLMs are capturing considerable interest, no dedicated studies explored the challenges faced by practitioners of LLM open-source projects, the causes of these challenges, and potential solutions. To fill this research gap, we conducted an empirical study to understand the issues that practitioners encounter when developing and using LLM open-source software, the possible causes of these issues, and potential solutions.We collected all closed issues from 15 LLM open-source projects and labelled issues that met our requirements. We then randomly selected 994 issues from the labelled issues as the sample for data extraction and analysis to understand the prevalent issues, their underlying causes, and potential solutions. Our study results show that (1) Model Issue is the most common issue faced by practitioners, (2) Model Problem, Configuration and Connection Problem, and Feature and Method Problem are identified as the most frequent causes of the issues, and (3) Optimize Model is the predominant solution to the issues. Based on the study results, we provide implications for practitioners and researchers of LLM open-source projects.

Related papers

ReliableMath: Benchmark of Reliable Mathematical Reasoning on Large Language Models [70.33764118171463]
Large Language Models (LLMs) tend to fabricate unreliable responses when confronted with problems that are unsolvable or beyond their capability.<n>We develop a ReliableMath dataset which incorporates open-source solvable problems and high-quality unsolvable problems.<n>LLMs fail to directly identify unsolvable problems and always generate fabricated responses.
arXiv Detail & Related papers (2025-07-03T19:19:44Z)
An Empirical Study on Challenges for LLM Application Developers [28.69628251749012]
We crawl and analyze 29,057 relevant questions from a popular OpenAI developer forum. After manually analyzing 2,364 sampled questions, we construct a taxonomy of challenges faced by LLM developers.
arXiv Detail & Related papers (2024-08-06T05:46:28Z)
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data [20.31528845718877]
Large language models (LLMs) have significantly advanced natural language understanding and demonstrated strong problem-solving abilities. This paper investigates the mathematical problem-solving capabilities of LLMs using the newly developed "MathOdyssey" dataset.
arXiv Detail & Related papers (2024-06-26T13:02:35Z)
Can LLMs Solve longer Math Word Problems Better? [47.227621867242]
Math Word Problems (MWPs) are crucial for evaluating the capability of Large Language Models (LLMs) This study pioneers the exploration of Context Length Generalizability (CoLeG) Two novel metrics are proposed to assess the efficacy and resilience of LLMs in solving these problems.
arXiv Detail & Related papers (2024-05-23T17:13:50Z)
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model. We employ a proxy model which has far fewer parameters, and take its answers as answers. Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z)
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions [47.83142414018448]
We focus on two popular reasoning tasks: arithmetic reasoning and code generation. We introduce (i) a general ontology of perturbations for math and coding questions, (ii) a semi-automatic method to apply these perturbations, and (iii) two datasets. We show a significant performance drop across all the models against perturbed questions.
arXiv Detail & Related papers (2024-01-17T18:13:07Z)
Breaking the Silence: the Threats of Using LLMs in Software Engineering [12.368546216271382]
Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community. This paper initiates an open discussion on potential threats to the validity of LLM-based research.
arXiv Detail & Related papers (2023-12-13T11:02:19Z)
Competition-Level Problems are Effective LLM Evaluators [121.15880285283116]
This paper aims to evaluate the reasoning capacities of large language models (LLMs) in solving recent programming problems in Codeforces. We first provide a comprehensive evaluation of GPT-4's peiceived zero-shot performance on this task, considering various aspects such as problems' release time, difficulties, and types of errors encountered. Surprisingly, theThoughtived performance of GPT-4 has experienced a cliff like decline in problems after September 2021 consistently across all the difficulties and types of problems.
arXiv Detail & Related papers (2023-12-04T18:58:57Z)
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs) As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models [70.5763210869525]
We introduce an expansive benchmark suite SciBench for Large Language Model (LLM) SciBench contains a dataset featuring a range of collegiate-level scientific problems from mathematics, chemistry, and physics domains. The results reveal that the current LLMs fall short of delivering satisfactory performance, with the best overall score of merely 43.22%.
arXiv Detail & Related papers (2023-07-20T07:01:57Z)
On the Risk of Misinformation Pollution with Large Language Models [127.1107824751703]
We investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation. Our study reveals that LLMs can act as effective misinformation generators, leading to a significant degradation in the performance of Open-Domain Question Answering (ODQA) systems.
arXiv Detail & Related papers (2023-05-23T04:10:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.