Serenity: Library Based Python Code Analysis for Code Completion and
Automated Machine Learning
- URL: http://arxiv.org/abs/2301.05108v1
- Date: Thu, 5 Jan 2023 02:09:08 GMT
- Title: Serenity: Library Based Python Code Analysis for Code Completion and
Automated Machine Learning
- Authors: Wenting Zhao, Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas,
Mossad Helali, Essam Mansour
- Abstract summary: We present Serenity, a framework for static analysis of Python that turns out to be sufficient for some tasks.
Serenity exploits two basic mechanisms: (a) reliance on dynamic dispatch at the core of language translation, and (b) extreme abstraction of libraries.
We demonstrate the efficiency and usefulness of Serenity's analysis in two applications: code completion and automated machine learning.
- Score: 8.362734311902278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamically typed languages such as Python have become very popular. Among
other strengths, Python's dynamic nature and its straightforward linking to
native code have made it the de-facto language for many research areas such as
Artificial Intelligence. This flexibility, however, makes static analysis very
hard. While creating a sound, or a soundy, analysis for Python remains an open
problem, we present in this work Serenity, a framework for static analysis of
Python that turns out to be sufficient for some tasks. The Serenity framework
exploits two basic mechanisms: (a) reliance on dynamic dispatch at the core of
language translation, and (b) extreme abstraction of libraries, to generate an
abstraction of the code. We demonstrate the efficiency and usefulness of
Serenity's analysis in two applications: code completion and automated machine
learning. In these two applications, we demonstrate that such analysis has a
strong signal, and can be leveraged to establish state-of-the-art performance,
comparable to neural models and dynamic analysis respectively.
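To make the "extreme abstraction of libraries" idea concrete, the following is a minimal, hypothetical Python sketch, not Serenity's actual implementation: every library call is reduced to an opaque operation named by its access path, and the only thing retained is how values flow between those operations.

```python
# Hypothetical sketch of "extreme library abstraction": library calls become
# opaque nodes in a dataflow graph; library semantics are never modeled.
import ast

class DataflowSketch(ast.NodeVisitor):
    """Nodes are abstracted library calls; edges follow variables from the
    call that produced a value to the calls that later consume it."""

    def __init__(self):
        self.defs = {}    # variable name -> id of the call node that produced it
        self.nodes = []   # node id -> dotted access path of the abstracted call
        self.edges = []   # (producer node id, consumer node id)

    def _path(self, expr):
        # Recover a dotted access path such as "pd.read_csv" from the call target.
        if isinstance(expr, ast.Name):
            return expr.id
        if isinstance(expr, ast.Attribute):
            base = self._path(expr.value)
            return f"{base}.{expr.attr}" if base else expr.attr
        return None

    def visit_Assign(self, node):
        if isinstance(node.value, ast.Call):
            callee = self._path(node.value.func) or "<unknown>"
            node_id = len(self.nodes)
            self.nodes.append(callee)
            # Wire in every variable that was produced by an earlier call.
            for sub in ast.walk(node.value):
                if isinstance(sub, ast.Name) and sub.id in self.defs:
                    self.edges.append((self.defs[sub.id], node_id))
            for target in node.targets:
                if isinstance(target, ast.Name):
                    self.defs[target.id] = node_id
        self.generic_visit(node)

example = """
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("train.csv")
X = df.drop("label", axis=1)
model = LogisticRegression().fit(X, df["label"])
"""
sketch = DataflowSketch()
sketch.visit(ast.parse(example))
print(sketch.nodes)   # ['pd.read_csv', 'df.drop', 'fit']
print(sketch.edges)   # [(0, 1), (1, 2), (0, 2)]: read_csv feeds drop, both feed fit
```

Serenity's real analysis is static and handles dynamic dispatch at the core of language translation; the sketch only conveys the intuition that library semantics can be ignored entirely while the resulting dataflow graph of API calls still carries a strong signal for code completion and automated machine learning.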
Related papers
- A Comprehensive Guide to Combining R and Python code for Data Science, Machine Learning and Reinforcement Learning [42.350737545269105]
We show how to use Python's scikit-learn, PyTorch, and OpenAI Gym libraries to build Machine Learning, Deep Learning, and Reinforcement Learning projects with ease.
arXiv Detail & Related papers (2024-07-19T23:01:48Z)
- Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback [14.938401898546553]
We propose to use a semi-structured form to represent reasoning steps of large language models.
Specifically, we use relation tuples, which are not only human-readable but also machine-friendly and easier to verify than natural language.
arXiv Detail & Related papers (2024-06-25T18:21:00Z)
- DyPyBench: A Benchmark of Executable Python Software [18.129031749321058]
We present DyPyBench, the first benchmark of Python projects that is large scale, diverse, ready to run and ready to analyze.
The benchmark encompasses 50 popular open-source projects from various application domains, with a total of 681k lines of Python code and 30k test cases.
We envision DyPyBench to provide a basis for other dynamic analyses and for studying the runtime behavior of Python code.
arXiv Detail & Related papers (2024-03-01T13:53:15Z)
- Context-Sensitive Abstract Interpretation of Dynamic Languages [0.0]
There is a vast gap in the quality of IDE tooling between static languages like Java and dynamic languages like Python or JavaScript.
Modern frameworks and libraries in these languages heavily use their dynamic capabilities to achieve the best ergonomics and readability.
We propose an algorithm that can bridge this gap by statically analyzing dynamic metaprogramming and runtime reflection in programs.
arXiv Detail & Related papers (2024-01-31T17:45:05Z)
- LILO: Learning Interpretable Libraries by Compressing and Documenting Code [71.55208585024198]
We introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code.
LILO combines LLM-guided program synthesis with recent algorithmic advances in automated refactoring from Stitch.
We find that AutoDoc, LILO's auto-documentation procedure, boosts performance by helping LILO's synthesizer to interpret and deploy learned abstractions.
arXiv Detail & Related papers (2023-10-30T17:55:02Z)
- A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
Static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions by leveraging Abstract Syntax Trees (a minimal sketch of the idea appears after this list).
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
- Leveraging Language to Learn Program Abstractions and Search Heuristics [66.28391181268645]
We introduce LAPS (Language for Abstraction and Program Search), a technique for using natural language annotations to guide joint learning of libraries and neurally-guided search models for synthesis.
When integrated into a state-of-the-art library learning system (DreamCoder), LAPS produces higher-quality libraries and improves search efficiency and generalization.
arXiv Detail & Related papers (2021-06-18T15:08:47Z)
- Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
- OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a more friendly environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)
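As the pointer in the static-evaluation entry above suggests, the following is a minimal, hypothetical sketch of AST-based checking of a generated Python completion, not that paper's actual framework: parse the snippet and report names that are read but never bound, one kind of static error such an evaluation can count without running the code.

```python
# Hypothetical sketch: flag undefined names in a completion using only the AST.
import ast
import builtins

def undefined_names(source: str):
    """Return names that are read but never bound anywhere in the snippet."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error: {err.msg} (line {err.lineno})"]
    bound = set(dir(builtins))
    loaded = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)
            else:
                loaded.append(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)  # function parameters bind names too
    return sorted({name for name in loaded if name not in bound})

completion = """
def accuracy(preds, labels):
    return (preds == labels).mean()

score = accuracy(predictions, y_test)
"""
print(undefined_names(completion))   # ['predictions', 'y_test']
```

A real evaluation pipeline would delegate to a mature linter such as Pyflakes and resolve scopes precisely; the sketch only illustrates why parsing alone already surfaces common completion errors without executing any code.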