PyGen: A Collaborative Human-AI Approach to Python Package Creation
- URL: http://arxiv.org/abs/2411.08932v1
- Date: Wed, 13 Nov 2024 03:16:18 GMT
- Title: PyGen: A Collaborative Human-AI Approach to Python Package Creation
- Authors: Saikat Barua, Mostafizur Rahman, Md Jafor Sadek, Rafiul Islam, Shehnaz Khaled, Md. Shohrab Hossain,
- Abstract summary: Pygen is an automation platform designed to empower researchers, technologists, and hobbyists to bring abstract ideas to life as core, usable software tools written in Python.
By combining state-of-the-art language models with open-source code generation technologies, Pygen has significantly reduced the manual overhead of tool development.
- Score: 1.3348326328808557
- License:
- Abstract: The principles of automation and innovation serve as foundational elements for advancement in contemporary science and technology. Here, we introduce Pygen, an automation platform designed to empower researchers, technologists, and hobbyists to bring abstract ideas to life as core, usable software tools written in Python. Pygen leverages the immense power of autoregressive large language models to augment human creativity during the ideation, iteration, and innovation process. By combining state-of-the-art language models with open-source code generation technologies, Pygen has significantly reduced the manual overhead of tool development. From a user prompt, Pygen automatically generates Python packages for a complete workflow from concept to package generation and documentation. The findings of our work show that Pygen considerably enhances the researcher's productivity by enabling the creation of resilient, modular, and well-documented packages for various specialized purposes. We employ a prompt enhancement approach to distill the user's package description into increasingly specific and actionable. While being inherently an open-ended task, we have evaluated the generated packages and the documentation using Human Evaluation, LLM-based evaluation, and CodeBLEU, with detailed results in the results section. Furthermore, we documented our results, analyzed the limitations, and suggested strategies to alleviate them. Pygen is our vision of ethical automation, a framework that promotes inclusivity, accessibility, and collaborative development. This project marks the beginning of a large-scale effort towards creating tools where intelligent agents collaborate with humans to improve scientific and technological development substantially. Our code and generated examples are open-sourced at [https://github.com/GitsSaikat/Pygen]
Related papers
- Darkit: A User-Friendly Software Toolkit for Spiking Large Language Model [50.37090759139591]
Large language models (LLMs) have been widely applied in various practical applications, typically comprising billions of parameters.
The human brain, employing bio-plausible spiking mechanisms, can accomplish the same tasks while significantly reducing energy consumption.
We are releasing a software toolkit named DarwinKit (Darkit) to accelerate the adoption of brain-inspired large language models.
arXiv Detail & Related papers (2024-12-20T07:50:08Z) - Large Action Models: From Inception to Implementation [51.81485642442344]
Large Action Models (LAMs) are designed for action generation and execution within dynamic environments.
LAMs hold the potential to transform AI from passive language understanding to active task completion.
We present a comprehensive framework for developing LAMs, offering a systematic approach to their creation, from inception to deployment.
arXiv Detail & Related papers (2024-12-13T11:19:56Z) - The Future of Scientific Publishing: Automated Article Generation [0.0]
This study introduces a novel software tool leveraging large language model (LLM) prompts, designed to automate the generation of academic articles from Python code.
Python served as a foundational proof of concept; however, the underlying methodology and framework exhibit adaptability across various GitHub repo's.
The development was achieved without reliance on advanced language model agents, ensuring high fidelity in the automated generation of coherent and comprehensive academic content.
arXiv Detail & Related papers (2024-04-11T16:47:02Z) - Octopus: Embodied Vision-Language Programmer from Environmental Feedback [58.04529328728999]
Embodied vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
To bridge this gap, we introduce Octopus, an embodied vision-language programmer that uses executable code generation as a medium to connect planning and manipulation.
Octopus is designed to 1) proficiently comprehend an agent's visual and textual task objectives, 2) formulate intricate action sequences, and 3) generate executable code.
arXiv Detail & Related papers (2023-10-12T17:59:58Z) - SoTaNa: The Open-Source Software Development Assistant [81.86136560157266]
SoTaNa is an open-source software development assistant.
It generates high-quality instruction-based data for the domain of software engineering.
It employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA.
arXiv Detail & Related papers (2023-08-25T14:56:21Z) - pymdp: A Python library for active inference in discrete state spaces [52.85819390191516]
pymdp is an open-source package for simulating active inference in Python.
We provide the first open-source package for simulating active inference with POMDPs.
arXiv Detail & Related papers (2022-01-11T12:18:44Z) - PyHealth: A Python Library for Health Predictive Models [53.848478115284195]
PyHealth is an open-source Python toolbox for developing various predictive models on healthcare data.
The data preprocessing module enables the transformation of complex healthcare datasets into machine learning friendly formats.
The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches.
arXiv Detail & Related papers (2021-01-11T22:02:08Z) - Landscape of R packages for eXplainable Artificial Intelligence [4.91155110560629]
The article is primarily devoted to the tools available in R, but since it is easy to integrate the Python code, we will also show examples for the most popular libraries from Python.
arXiv Detail & Related papers (2020-09-24T16:54:57Z) - PHOTONAI -- A Python API for Rapid Machine Learning Model Development [2.414341608751139]
PHOTONAI is a high-level Python API designed to simplify and accelerate machine learning model development.
It functions as a unifying framework allowing the user to easily access and combine algorithms from different toolboxes into custom algorithm sequences.
arXiv Detail & Related papers (2020-02-13T10:33:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.