KOALA: a Configurable Tool for Collecting IDE Data When Solving Programming Tasks
- URL: http://arxiv.org/abs/2506.21266v1
- Date: Thu, 26 Jun 2025 13:48:38 GMT
- Title: KOALA: a Configurable Tool for Collecting IDE Data When Solving Programming Tasks
- Authors: Daniil Karol, Elizaveta Artser, Ilya Vlasov, Yaroslav Golubev, Hieke Keuning, Anastasiia Birillo
- Abstract summary: KOALA is a tool for collecting data of students solving programming tasks in JetBrains IDEs. It provides the students with the necessary tasks, enables or disables certain IDE features like code completion, and runs surveys. During problem solving, the plugin collects code snapshots at the configured granularity. The collected data is sent to the server that comes with the tool, where it is stored and can be converted to the standardized ProgSnap2 format.
- Score: 1.9626657740463982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collecting data of students solving programming tasks is incredibly valuable for researchers and educators. It allows verifying that the students correctly apply the features and concepts they are taught, or finding students' misconceptions. However, existing data collection tools have limitations, e.g., no control over the granularity of the collected code, not collecting the specific events of the programming environment used, and overall being hard to configure. To overcome these limitations, we propose KOALA, a convenient and highly configurable tool for collecting code snapshots and feature usage from students solving programming tasks in JetBrains IDEs. The plugin can be installed in IDEs and configured to provide the students with the necessary tasks, enable or disable certain IDE features like code completion, and run surveys. During problem solving, the plugin collects code snapshots at the configured granularity, all IDE actions like running and debugging, as well as some data not collected in prior works, like employed hotkeys and switching focus between files. The collected data is sent to the server that comes with the tool, where it is stored and can be converted to the standardized ProgSnap2 format. To showcase the tool, we collected data from 28 students solving tasks in two courses within the IDE, highlighting some insights from this data.
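As an illustration of what "collecting code snapshots at the configured granularity" could mean in practice, here is a minimal Python sketch. It is not KOALA's implementation: the class `SnapshotCollector`, the setting `min_interval_s`, and the method `on_edit` are hypothetical names, and the sketch assumes that granularity is expressed as a minimum time interval between stored snapshots.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SnapshotCollector:
    """Stores a code snapshot only if enough time has passed since the last stored one.

    Hypothetical illustration; names and behavior are assumptions, not KOALA's API.
    """

    min_interval_s: float  # assumed granularity setting, in seconds
    snapshots: list = field(default_factory=list)
    _last_time: Optional[float] = None

    def on_edit(self, timestamp: float, file: str, code: str) -> bool:
        """Handle one edit event; return True if a snapshot was stored."""
        too_soon = (
            self._last_time is not None
            and timestamp - self._last_time < self.min_interval_s
        )
        if too_soon:
            return False
        self.snapshots.append({"time": timestamp, "file": file, "code": code})
        self._last_time = timestamp
        return True


# Usage: with a 5-second granularity, the edit at t=2.0 is skipped.
collector = SnapshotCollector(min_interval_s=5.0)
collector.on_edit(0.0, "Main.kt", "fun main() {}")
collector.on_edit(2.0, "Main.kt", "fun main() { }")
collector.on_edit(5.0, "Main.kt", "fun main() { println(1) }")
```

Setting `min_interval_s` to 0 stores every edit, which corresponds to the finest granularity under this assumption.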
Related papers
- debug-gym: A Text-Based Environment for Interactive Debugging [55.11603087371956]
Large Language Models (LLMs) are increasingly relied upon for coding tasks. We posit that LLMs can benefit from the ability to interactively explore a codebase to gather the information relevant to their task. We present a textual environment, namely debug-gym, for developing LLM-based agents in an interactive coding setting.
arXiv Detail & Related papers (2025-03-27T14:43:28Z)
- ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [81.12673534903979]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools. We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
arXiv Detail & Related papers (2025-02-17T03:42:28Z)
- In-IDE Programming Courses: Learning Software Development in a Real-World Setting [5.330251011543498]
JetBrains recently released the JetBrains Academy plugin, which customizes the IDE for learners. We carried out eight one-hour interviews with students and developers who completed at least one course using the plugin.
arXiv Detail & Related papers (2025-01-29T16:34:22Z)
- Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
- Code Compass: A Study on the Challenges of Navigating Unfamiliar Codebases [2.808331566391181]
We propose a novel tool, Code Compass, to address these issues.
Our study highlights a significant gap in current tools and methodologies.
Our formative study demonstrates how effectively the tool reduces the time developers spend navigating documentation.
arXiv Detail & Related papers (2024-05-10T06:58:31Z)
- Tool-Augmented LLMs as a Universal Interface for IDEs [0.768721532845575]
Large Language Models (LLMs) capable of both natural language dialogue and code generation lead to a discourse on the obsolescence of the concept of Integrated Development Environments (IDEs).
We envision a model that is able to perform complex actions involving multiple IDE features upon user command, stripping the user experience of the tedious work involved in searching through options and actions.
arXiv Detail & Related papers (2024-02-18T16:32:28Z)
- JetTrain: IDE-Native Machine Learning Experiments [4.23507375452691]
JetTrain is an integrated development environment (IDE) tool for launching machine learning (ML) experiments.
A user can write and debug code locally and then seamlessly run it remotely using on-demand hardware.
We argue that this approach can lower the entry barrier for ML training problems and increase experiment throughput.
arXiv Detail & Related papers (2024-02-16T17:53:08Z)
- ControlLLM: Augment Language Models with Tools by Searching on Graphs [97.62758830255002]
We present ControlLLM, a novel framework that enables large language models (LLMs) to utilize multi-modal tools for solving real-world tasks.
Our framework comprises three key components: (1) a task decomposer that breaks down a complex task into clear subtasks with well-defined inputs and outputs; (2) a Thoughts-on-Graph (ToG) paradigm that searches the optimal solution path on a pre-built tool graph; and (3) an execution engine with a rich toolbox that interprets the solution path and runs the
arXiv Detail & Related papers (2023-10-26T21:57:21Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
- All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs [55.606644084003094]
We propose an approach for collecting completion usage logs from the users in an IDE.
We use them to train a machine learning based model for ranking completion candidates.
Our evaluation shows that using a simple ranking model trained on the past user behavior logs significantly improved code completion experience.
arXiv Detail & Related papers (2022-05-21T23:21:26Z)