Insights from the Usage of the Ansible Lightspeed Code Completion Service
- URL: http://arxiv.org/abs/2402.17442v4
- Date: Tue, 22 Oct 2024 10:30:19 GMT
- Title: Insights from the Usage of the Ansible Lightspeed Code Completion Service
- Authors: Priyam Sahoo, Saurabh Pujar, Ganesh Nalawade, Richard Gebhardt, Louis Mandel, Luca Buratti,
- Abstract summary: Ansible Lightspeed is an LLM-based service that generates code for Ansible, a YAML-based, IT automation-specific language.
The code for the Lightspeed service and the analysis framework is made available for others to use.
It is the first code completion tool to present N-Day user retention figures.
- Score: 2.6401871006820534
- Abstract: The availability of Large Language Models (LLMs) that can generate code has made it possible to create tools that improve developer productivity. Integrated development environments (IDEs), which developers use to write software, often serve as an interface for interacting with LLMs. Although many such tools have been released, almost all of them focus on general-purpose programming languages. Domain-specific languages, such as those crucial for Information Technology (IT) automation, have not received much attention. Ansible is one such YAML-based, IT automation-specific language. Ansible Lightspeed is an LLM-based service designed explicitly to generate Ansible YAML given a natural language prompt. In this paper, we present the design and implementation of the Ansible Lightspeed service. We then evaluate its utility to developers using diverse indicators, including extended utilization, analysis of user-edited suggestions, and user sentiment analysis. The evaluation is based on data collected for 10,696 real users, including 3,910 returning users. The code for the Ansible Lightspeed service and the analysis framework is made available for others to use. To our knowledge, our study is the first to involve thousands of users of code assistants for domain-specific languages. We are also the first code completion tool to present N-Day user retention figures, which stand at 13.66% on Day 30. We propose an improved version of the user acceptance rate, called the Strong Acceptance rate, in which a suggestion is considered accepted only if less than 50% of it is edited and these edits do not change critical parts of the suggestion. By focusing on Ansible, Lightspeed is able to achieve a strong acceptance rate of 49.08% for multi-line Ansible task suggestions. With our findings, we provide insights into the effectiveness of small, dedicated models in a domain-specific context.
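As a concrete illustration of the paper's two headline metrics, below is a minimal sketch of how N-Day retention and the Strong Acceptance rate could be computed. The record fields and the character-level edit proxy are assumptions for illustration, not the paper's actual telemetry schema.

```python
import difflib
from datetime import date

def edit_fraction(text: str, edited_text: str) -> float:
    # Rough character-level proxy for "how much of the suggestion was edited".
    return 1.0 - difflib.SequenceMatcher(None, text, edited_text).ratio()

def strong_acceptance_rate(suggestions: list[dict]) -> float:
    # Per the abstract: a suggestion counts as strongly accepted only if
    # less than 50% of it is edited and the edits leave critical parts intact.
    # "critical_edited" is an assumed flag produced by a domain-aware diff.
    strong = [
        s for s in suggestions
        if edit_fraction(s["text"], s["edited_text"]) < 0.5
        and not s["critical_edited"]
    ]
    return len(strong) / len(suggestions)

def n_day_retention(first_seen: dict[str, date],
                    active_days: dict[str, set[date]], n: int) -> float:
    # Fraction of users who are active again exactly N days after first use.
    retained = sum(
        1 for user, start in first_seen.items()
        if any((d - start).days == n for d in active_days.get(user, ()))
    )
    return retained / len(first_seen)
```

On the paper's figures, the Day-30 retention call would return roughly 0.1366 for the Lightspeed user base, and strong_acceptance_rate about 0.4908 for multi-line task suggestions.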
Related papers
- Learning to Ask: When LLM Agents Meet Unclear Instruction [55.65312637965779]
Large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone.
We evaluate the tool-use performance of LLMs under imperfect instructions, analyze the error patterns, and build a challenging tool-use benchmark called Noisy ToolBench.
We propose a novel framework, Ask-when-Needed (AwN), which prompts LLMs to ask questions to users whenever they encounter obstacles due to unclear instructions.
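A minimal sketch of the general Ask-when-Needed pattern as described in the abstract, assuming a generic llm(prompt) -> str callable and a simple QUESTION/ANSWER text protocol (both are assumptions, not the authors' implementation):

```python
def ask_when_needed(llm, ask_user, instruction: str, max_rounds: int = 3) -> str:
    # Let the model raise clarifying questions before acting on an
    # ambiguous instruction, instead of silently guessing.
    context = f"Instruction: {instruction}"
    for _ in range(max_rounds):
        reply = llm(
            context + "\nIf anything is ambiguous, reply 'QUESTION: <question>'; "
                      "otherwise reply 'ANSWER: <result>'."
        )
        if reply.startswith("QUESTION:"):
            clarification = ask_user(reply.removeprefix("QUESTION:").strip())
            context += f"\nUser clarification: {clarification}"
        else:
            return reply.removeprefix("ANSWER:").strip()
    return llm(context + "\nAnswer as best you can.")  # stop asking after the cap
```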
arXiv Detail & Related papers (2024-08-31T23:06:12Z)
- GTA: A Benchmark for General Tool Agents [32.443456248222695]
We design 229 real-world tasks and executable tool chains to evaluate mainstream large language models (LLMs).
Our findings show that real-world user queries are challenging for existing LLMs, with GPT-4 completing less than 50% of the tasks and most LLMs achieving below 25%.
This evaluation reveals the bottlenecks in the tool-use capabilities of current LLMs in real-world scenarios, which provides future direction for advancing general-purpose tool agents.
arXiv Detail & Related papers (2024-07-11T17:50:09Z)
- Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables large language models (LLMs) to act as multi-tool users.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
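The abstract gives no implementation details, but the general shape of an automatic tool chain can be sketched as follows; the JSON step protocol and the tool-registry layout are assumptions for illustration:

```python
import json

def run_tool_chain(llm, tools: dict, task: str, max_steps: int = 8):
    # Let the model plan and execute a chain of tool calls on its own.
    # `tools` maps a tool name to a (description, callable) pair.
    docs = "\n".join(f"- {name}: {desc}" for name, (desc, _) in tools.items())
    trace = f"Task: {task}\nTools:\n{docs}\n"
    for _ in range(max_steps):
        # Assumed protocol: the model emits {"tool": ..., "args": {...}}
        # per step, or {"final": ...} once the task is solved.
        step = json.loads(llm(trace + "Next step (JSON):"))
        if "final" in step:
            return step["final"]
        _, call = tools[step["tool"]]
        result = call(**step["args"])
        trace += f"{step['tool']}({step['args']}) -> {result}\n"
    raise RuntimeError("tool chain did not terminate within the step budget")
```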
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- Using Large Language Models for Commit Message Generation: A Preliminary Study [5.5784148764236114]
Large language models (LLMs) can be used to generate commit messages automatically and effectively.
In 78% of the 366 samples, the commit messages generated by LLMs were evaluated by humans as the best.
arXiv Detail & Related papers (2024-01-11T14:06:39Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
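The retrieval component can be as simple as similarity search over tool descriptions. The sketch below is illustrative only; the embed(text) -> list[float] function and the toolset record format are assumptions:

```python
def retrieve_tools(embed, toolset: list[dict], task: str, k: int = 3) -> list[dict]:
    # Return the k tools whose descriptions are most similar to the task,
    # e.g. to be inserted into the LLM prompt alongside the task itself.
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = lambda x: sum(a * a for a in x) ** 0.5
        return dot / (norm(u) * norm(v))

    query = embed(task)
    ranked = sorted(
        toolset,
        key=lambda tool: cosine(embed(tool["description"]), query),
        reverse=True,
    )
    return ranked[:k]
```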
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
- LLM and Infrastructure as a Code use case [0.0]
This paper presents an inquiry into a solution for generating and managing Ansible YAML roles and playbooks.
Our efforts are focused on identifying plausible directions and outlining the potential applications.
For the purpose of this experiment, we have opted against the use of Lightspeed.
arXiv Detail & Related papers (2023-09-04T09:05:17Z)
- Using an LLM to Help With Code Understanding [13.53616539787915]
Large language models (LLMs) are revolutionizing the process of writing code.
Our plugin queries OpenAI's GPT-3.5-turbo model with four high-level requests without the user having to write explicit prompts.
We evaluate this system in a user study with 32 participants, which confirms that using our plugin can aid task completion more than web search.
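The plugin's pattern of preset high-level requests, so the user never writes a prompt, can be sketched with the OpenAI chat API. The request templates below are hypothetical placeholders, not the paper's four actual requests:

```python
from openai import OpenAI

# Hypothetical templates; the abstract does not enumerate the four requests.
REQUESTS = {
    "explain": "Explain what the following code does:\n{code}",
    "example": "Show a short usage example for the following code:\n{code}",
}

def ask_about_code(kind: str, code: str) -> str:
    # One preset request per click, no free-form prompt writing required.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": REQUESTS[kind].format(code=code)}],
    )
    return resp.choices[0].message.content
```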
arXiv Detail & Related papers (2023-07-17T00:49:06Z)
- Automated Code generation for Information Technology Tasks in YAML through Large Language Models [56.25231445614503]
We present Wisdom, a natural-language-to-YAML code generation tool aimed at improving IT automation productivity.
We develop two novel performance metrics for YAML to capture the specific characteristics of this domain.
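The abstract does not define the two metrics, but a domain-aware comparison, here matching on the Ansible module a task invokes instead of the raw string, illustrates the kind of YAML-specific measurement involved. This sketch is an assumption, not the paper's metric:

```python
import yaml  # PyYAML

# Keys an Ansible task may carry besides the module name; illustrative, not exhaustive.
TASK_KEYWORDS = {"name", "when", "loop", "register", "become", "vars", "tags"}

def module_match(reference: str, prediction: str) -> bool:
    # Compare two Ansible task snippets by the module they invoke, so that
    # harmless formatting differences are not punished as exact-match misses.
    def module_of(task_yaml: str):
        task = yaml.safe_load(task_yaml)
        if isinstance(task, list):  # a play is a list of tasks; take the first
            task = task[0]
        if not isinstance(task, dict):
            return None
        modules = set(task) - TASK_KEYWORDS  # the non-keyword key is the module
        return modules.pop() if modules else None

    try:
        ref, pred = module_of(reference), module_of(prediction)
    except yaml.YAMLError:
        return False  # unparseable YAML counts as a miss
    return ref is not None and ref == pred
```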
arXiv Detail & Related papers (2023-05-02T21:01:01Z)
- Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models [68.37431984231338]
Large language models (LLMs) have shown impressive results across a variety of tasks while requiring little or no direct supervision.
We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in this setting.
arXiv Detail & Related papers (2022-12-15T18:45:29Z)
- Interactive Code Generation via Test-Driven User-Intent Formalization [60.90035204567797]
Large language models (LLMs) produce code from informal natural language (NL) intent.
It is hard to define a notion of correctness since natural language can be ambiguous and lacks a formal semantics.
We describe a language-agnostic abstract algorithm and a concrete implementation, TiCoder.
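The described workflow, formalizing user intent through tests and then filtering code candidates, reduces to a short loop; the interaction protocol below is a simplification, not TiCoder's exact algorithm:

```python
def test_driven_selection(code_candidates, test_candidates, user_approves, run):
    # Ask the user to approve generated tests (cheap to judge), then keep
    # only the code candidates that pass every approved test.
    # `run(code, test) -> bool` executes one test against one candidate.
    approved = [t for t in test_candidates if user_approves(t)]
    surviving = [
        code for code in code_candidates
        if all(run(code, t) for t in approved)
    ]
    return surviving, approved  # approved tests double as a regression suite
```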
arXiv Detail & Related papers (2022-08-11T17:41:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.