Untangling Knots: Leveraging LLM for Error Resolution in Computational Notebooks
- URL: http://arxiv.org/abs/2405.01559v1
- Date: Tue, 26 Mar 2024 18:53:17 GMT
- Title: Untangling Knots: Leveraging LLM for Error Resolution in Computational Notebooks
- Authors: Konstantin Grotov, Sergey Titov, Yaroslav Zharov, Timofey Bryksin,
- Abstract summary: We propose a potential solution for resolving errors in computational notebooks via an iterative LLM-based agent.
We discuss the questions raised by this approach and share a novel dataset of computational notebooks containing bugs.
- Score: 4.318590074766604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational notebooks became indispensable tools for research-related development, offering unprecedented interactivity and flexibility in the development process. However, these benefits come at the cost of reproducibility and an increased potential for bugs. There are many tools for bug fixing; however, they are generally targeted at the classical linear code. With the rise of code-fluent Large Language Models, a new stream of smart bug-fixing tools has emerged. However, the applicability of those tools is still problematic for non-linear computational notebooks. In this paper, we propose a potential solution for resolving errors in computational notebooks via an iterative LLM-based agent. We discuss the questions raised by this approach and share a novel dataset of computational notebooks containing bugs to facilitate the research of the proposed approach.
Related papers
- Debug Smarter, Not Harder: AI Agents for Error Resolution in Computational Notebooks [4.025358960630117]
We present an AI agent designed specifically for error resolution in a computational notebook.
We have developed an agentic system capable of exploring a notebook environment by interacting with it.
We evaluate our approach against the pre-existing single-action solution by comparing costs and conducting a user study.
arXiv Detail & Related papers (2024-10-18T11:55:34Z) - An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation [1.335664823620186]
Large Language Models (LLMs) have recently advanced many applications on software engineering tasks.
CoT-SelfEvolve iteratively and automatically refines code through a self-correcting process.
arXiv Detail & Related papers (2024-08-28T09:19:09Z) - SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z) - Neuron Patching: Semantic-based Neuron-level Language Model Repair for Code Generation [32.178931149612644]
Large Language Models (LLMs) have already gained widespread adoption in software engineering, particularly in code generation tasks.
We propose textscMENT, a novel and effective model editing approach to repair LLMs in coding tasks.
textscMENT is effective, efficient, and reliable, capable of correcting a neural model by patching just one or two neurons.
arXiv Detail & Related papers (2023-12-08T20:28:08Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Natural Language to Code Generation in Interactive Data Science
Notebooks [35.621936471322385]
We build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks.
We develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs.
arXiv Detail & Related papers (2022-12-19T05:06:00Z) - Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z) - Learning by Fixing: Solving Math Word Problems with Weak Supervision [70.62896781438694]
Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions.
We introduce a textitweakly-supervised paradigm for learning MWPs.
Our method only requires the annotations of the final answers and can generate various solutions for a single problem.
arXiv Detail & Related papers (2020-12-19T03:10:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.