Neurosymbolic Repair for Low-Code Formula Languages
- URL: http://arxiv.org/abs/2207.11765v1
- Date: Sun, 24 Jul 2022 15:56:03 GMT
- Title: Neurosymbolic Repair for Low-Code Formula Languages
- Authors: Rohan Bavishi, Harshit Joshi, José Pablo Cambronero Sánchez, Anna
Fariha, Sumit Gulwani, Vu Le, Ivan Radicek, Ashish Tiwari
- Abstract summary: Most users of low-code platforms, such as Excel and PowerApps, write programs in domain-specific formula languages.
We develop LaMirage, a LAst-MIle RepAir-engine GEnerator that combines symbolic and neural techniques to perform last-mile repair.
We compare LaMirage to state-of-the-art neural and symbolic approaches on 400 real Excel and PowerFx formulas.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most users of low-code platforms, such as Excel and PowerApps, write programs
in domain-specific formula languages to carry out nontrivial tasks. Often users
can write most of the program they want, but introduce small mistakes that
yield broken formulas. These mistakes, which can be both syntactic and
semantic, are hard for low-code users to identify and fix, even though they can
be resolved with just a few edits. We formalize the problem of producing such
edits as the last-mile repair problem. To address this problem, we developed
LaMirage, a LAst-MIle RepAir-engine GEnerator that combines symbolic and neural
techniques to perform last-mile repair in low-code formula languages. LaMirage
takes a grammar and a set of domain-specific constraints/rules, which jointly
approximate the target language, and uses these to generate a repair engine
that can fix formulas in that language. To tackle the challenges of localizing
the errors and ranking the candidate repairs, LaMirage leverages neural
techniques, whereas it relies on symbolic methods to generate candidate
repairs. This combination allows LaMirage to find repairs that satisfy the
provided grammar and constraints, and then pick the most natural repair. We
compare LaMirage to state-of-the-art neural and symbolic approaches on 400 real
Excel and PowerFx formulas, where LaMirage outperforms all baselines. We
release these benchmarks to encourage subsequent work in low-code domains.
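The generate-and-rank loop the abstract describes can be sketched in a few lines. Everything below (the toy parenthesis check standing in for a grammar, the single-edit candidate enumeration, and the end-of-formula ranking heuristic standing in for the neural ranker) is an illustrative assumption, not LaMirage's actual implementation.

```python
def balanced(formula: str) -> bool:
    """Toy 'symbolic' check: parentheses must balance (stand-in for a grammar)."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def symbolic_candidates(formula: str):
    """Symbolically enumerate single-character insertions that satisfy the check."""
    for i in range(len(formula) + 1):
        for fix in ("(", ")"):
            cand = formula[:i] + fix + formula[i:]
            if balanced(cand):
                yield i, cand

def rank(scored):
    """Stand-in for the neural ranker: prefer edits near the end of the
    formula, a crude proxy for 'last-mile' naturalness."""
    return [cand for _, cand in sorted(scored, key=lambda p: -p[0])]

def last_mile_repair(formula: str) -> str:
    if balanced(formula):
        return formula
    ranked = rank(symbolic_candidates(formula))
    return ranked[0] if ranked else formula

print(last_mile_repair("SUM(A1:A10"))  # -> SUM(A1:A10)
```

The division of labor mirrors the abstract: symbolic enumeration guarantees every candidate satisfies the grammar, while the ranker only has to choose the most natural one.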
Related papers
- DistiLRR: Transferring Code Repair for Low-Resource Programming Languages [57.62712191540067]
Distilling Low-Resource Repairs (DistiLRR) is an approach that transfers the reasoning and code generation ability from a teacher model to a student model.
Our results show that DistiLRR consistently outperforms baselines on low-resource languages, but has similar performance on high-resource languages.
arXiv Detail & Related papers (2024-06-21T05:05:39Z)
- A Deep Dive into Large Language Models for Automated Bug Localization and Repair [12.756202755547024]
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR)
In this study, we take a deep dive into automated bug fixing utilizing LLMs.
This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information.
Toggle achieves the new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark.
arXiv Detail & Related papers (2024-04-17T17:48:18Z)
- A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
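The round-trip idea can be illustrated with a toy pair of "translation" functions: any forward/backward pair whose round trip regenerates code in a canonical, well-formed shape acts as a repair. The two functions below are hypothetical stand-ins, not the paper's translation models.

```python
def to_pseudocode(code: str) -> str:
    """Hypothetical forward translation: code -> natural-language-ish form."""
    return code.replace("=", " gets ").replace(";", "")

def to_code(pseudo: str) -> str:
    """Hypothetical backward translation: regenerate well-formed code,
    restoring canonical spacing and the trailing semicolon."""
    return pseudo.replace(" gets ", " = ").strip() + ";"

def rtt_repair(buggy: str) -> str:
    """Round-trip the program; defects dropped in translation stay fixed."""
    return to_code(to_pseudocode(buggy))

print(rtt_repair("x=1"))  # -> x = 1; (missing semicolon restored)
```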
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning [90.13978453378768]
We introduce a comprehensive typology of factual errors in generated chart captions.
A large-scale human annotation effort provides insight into the error patterns and frequencies in captions crafted by various chart captioning models.
Our analysis reveals that even state-of-the-art models, including GPT-4V, frequently produce captions laced with factual inaccuracies.
arXiv Detail & Related papers (2023-12-15T19:16:21Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic? [5.714553194279462]
We investigate the various input parameters of two language models, and conduct a study to understand if variations of these input parameters can have a significant impact on the quality of the generated programs.
Our results showed that varying the input parameters can significantly improve the performance of language models.
arXiv Detail & Related papers (2022-10-26T13:28:14Z)
- Repair Is Nearly Generation: Multilingual Program Repair with LLMs [9.610685299268825]
We introduce RING, a multilingual repair engine powered by a large language model trained on code (LLMC) such as Codex.
Taking inspiration from the way programmers manually fix bugs, we show that a prompt-based strategy that conceptualizes repair as localization, transformation, and candidate ranking, can successfully repair programs in multiple domains with minimal effort.
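The localization-transformation-ranking framing can be sketched as a few-shot prompt builder plus a symbolic ranking pass over model outputs. The template, the few-shot pairs, and the use of Python's built-in `compile` as a validity check are all hypothetical illustration, not RING's actual prompts or pipeline.

```python
# Hypothetical few-shot (buggy, fixed) pairs for the prompt.
FEW_SHOT = [
    ("pritn('hi')", "print('hi')"),
    ("for i in range(3) print(i)", "for i in range(3): print(i)"),
]

def build_repair_prompt(buggy: str) -> str:
    """Transformation step: frame repair as completion after 'Fixed:'."""
    parts = ["Fix the buggy program.\n"]
    for bad, good in FEW_SHOT:
        parts.append(f"Buggy: {bad}\nFixed: {good}\n")
    parts.append(f"Buggy: {buggy}\nFixed:")
    return "\n".join(parts)

def compiles(src: str) -> bool:
    """Symbolic validity check on a candidate repair."""
    try:
        compile(src, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

def rank_candidates(cands):
    """Ranking step: syntactically valid repairs first, shorter ones next."""
    return sorted(cands, key=lambda c: (not compiles(c), len(c)))
```

A model sampled with `build_repair_prompt` would produce several completions, and `rank_candidates` picks the one to surface; in this sketch the "neural" step is left to whatever LLM consumes the prompt.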
arXiv Detail & Related papers (2022-08-24T16:25:58Z)
- Correcting Robot Plans with Natural Language Feedback [88.92824527743105]
We explore natural language as an expressive and flexible tool for robot correction.
We show that these transformations enable users to correct goals, update robot motions, and recover from planning errors.
Our method makes it possible to compose multiple constraints and generalizes to unseen scenes, objects, and sentences in simulated environments and real-world environments.
arXiv Detail & Related papers (2022-04-11T15:22:43Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.