Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework
- URL: http://arxiv.org/abs/2410.15285v2
- Date: Sat, 05 Apr 2025 08:48:48 GMT
- Title: Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework
- Authors: Yuchen Wang, Shangxin Guo, Chee Wei Tan
- Abstract summary: This paper presents CAMP, a multi-model AI-assisted programming framework built around a local model that employs Retrieval-Augmented Generation (RAG). RAG retrieves contextual information from the codebase to facilitate context-aware prompt construction for the cloud model. The methodology is actualized in Copilot for Xcode, an AI-assisted programming tool crafted for the Apple software ecosystem.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advancements in cloud-based Large Language Models (LLMs) have revolutionized AI-assisted programming. However, their integration into certain local development environments, such as those within the Apple software ecosystem (e.g., iOS apps, macOS), remains challenging due to computational demands and sandboxing constraints. This paper presents CAMP, a multi-model AI-assisted programming framework consisting of a local model that employs Retrieval-Augmented Generation (RAG) to retrieve contextual information from the codebase and facilitate context-aware prompt construction, thus optimizing the performance of the cloud model and empowering LLMs' capabilities in local Integrated Development Environments (IDEs). The methodology is actualized in Copilot for Xcode, an AI-assisted programming tool crafted for Xcode that employs the RAG module to address software constraints and enable diverse generative programming tasks, including automatic code completion, documentation, error detection, and intelligent user-agent interaction. Results from objective experiments on generated code quality and subjective experiments on user adoption collectively demonstrate the pilot success of the proposed system and mark its significant contributions to the realm of AI-assisted programming.
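The hybrid flow the abstract describes can be sketched in a few lines: a lightweight local retriever scores codebase chunks against the current editing context and packs the best matches into the prompt sent to the cloud model. The sketch below is illustrative only (the paper publishes no code); the bag-of-words embedding stands in for whatever local embedding model CAMP actually uses, and all names are hypothetical.

```python
# Minimal sketch of context-aware prompt construction, CAMP-style.
# Illustrative only: bag-of-words similarity stands in for a real
# local embedding model, and all names are hypothetical.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(query: str, chunks: list[str], k: int = 3) -> str:
    # Retrieve the k codebase chunks most similar to the editing context
    # and pack them into the prompt forwarded to the cloud LLM.
    ranked = sorted(chunks, key=lambda c: cosine(embed(query), embed(c)), reverse=True)
    return "Relevant project context:\n" + "\n---\n".join(ranked[:k]) + f"\n\nTask:\n{query}"
```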
Related papers
- Apple Intelligence Foundation Language Models: Tech Report 2025
We introduce two foundation language models that power Apple Intelligence features across Apple devices and services. Both models are trained on large-scale multilingual and multimodal datasets sourced via responsible web crawling. A new Swift-centric Foundation Models framework exposes guided generation, constrained tool calling, and LoRA adapter fine-tuning.
arXiv Detail & Related papers (2025-07-17T23:37:19Z)
- SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs). Unlike traditional static benchmarks, SwingArena models the collaborative process of software iteration by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
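A rough sketch of one arena round under this pairing (hypothetical interfaces throughout; `submitter` and `reviewer` stand in for the competing LLMs, and pytest stands in for a real CI pipeline):

```python
# Hedged sketch of a SwingArena-style round: one model submits a
# patch, the other writes tests, and a CI step decides the outcome.
import subprocess
import tempfile
from pathlib import Path

def run_ci(patch: str, tests: str) -> bool:
    # Stand-in CI: write patch and tests to a temp dir and run pytest.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "solution.py").write_text(patch)
        Path(tmp, "test_solution.py").write_text(tests)
        return subprocess.run(["python", "-m", "pytest", tmp], capture_output=True).returncode == 0

def arena_round(issue: str, submitter, reviewer) -> bool:
    patch = submitter(issue)      # one LLM generates a patch
    tests = reviewer(issue)       # the other LLM creates test cases
    return run_ci(patch, tests)   # CI verifies the patch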
arXiv Detail & Related papers (2025-05-29T18:28:02Z)
- MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
MLE-Dojo is a Gym-style framework for systematically training (via reinforcement learning), evaluating, and improving autonomous large language model (LLM) agents. It covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios. Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
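The Gym-style contract this implies is the familiar reset/step loop; the toy environment below is only a sketch of that interface (MLE-Dojo's real API and task set differ):

```python
# Toy Gym-style environment loop; MLE-Dojo's actual API differs.
class ToyMLEEnv:
    def reset(self) -> str:
        return "task: train a classifier on the provided dataset"

    def step(self, action: str) -> tuple[str, float, bool]:
        # A real env would execute the agent's code and score the artifact.
        return "validation accuracy: 0.82", 0.82, True

env = ToyMLEEnv()
obs, done = env.reset(), False
while not done:
    action = "fit baseline model"        # an LLM agent would decide this
    obs, reward, done = env.step(action)
```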
arXiv Detail & Related papers (2025-05-12T17:35:43Z)
- Automating Automotive Software Development: A Synergy of Generative AI and Formal Methods
We propose to combine GenAI with model-driven engineering to automate automotive software development. Our approach uses LLMs to convert free-text requirements into event chain descriptions and to generate platform-independent software components. As a proof of concept, we used GPT-4o to implement our method and tested it in the CARLA simulation environment with ROS2.
arXiv Detail & Related papers (2025-05-05T09:29:13Z)
- Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs
We propose an offline simulation framework to curate a software-specific skillset, a collection of verified scripts.
Our framework comprises two components, the first being task creation, which uses top-down functionality and bottom-up API synergy exploration to generate helpful tasks.
Experiments with Adobe Illustrator demonstrate that our framework significantly improves automation success rates, reduces response time, and saves runtime token costs.
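The curation loop implied above reduces to: synthesize tasks, have an LLM draft a script per task, execute each script offline, and keep only the verified ones. A schematic sketch (function arguments are placeholders, not the paper's code):

```python
# Schematic offline skill curation: keep only scripts that pass
# offline execution checks. All names are illustrative.
def curate_skillset(tasks, draft_script, simulate) -> dict[str, str]:
    skillset = {}
    for task in tasks:
        script = draft_script(task)    # LLM proposes a script for the task
        ok, _log = simulate(script)    # offline simulation verifies it
        if ok:
            skillset[task] = script    # only verified scripts enter the skillset
    return skillset
```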
arXiv Detail & Related papers (2025-04-29T04:03:37Z)
- LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages
This paper evaluates the capabilities of the Llama 2-70B model in automating the development of scientific applications written in multiple programming languages.
We assess the model's capacity to generate code, documentation, and unit tests, as well as its ability to translate existing code between programming languages.
Our results indicate that while Llama 2-70B frequently generates syntactically correct and functional code for simpler numerical tasks, it encounters substantial difficulties with more complex, parallelized, or distributed computations.
arXiv Detail & Related papers (2025-03-24T23:46:14Z)
- VisionCoder: Empowering Multi-Agent Auto-Programming for Image Processing with Hybrid LLMs
This paper presents a multi-agent framework that collaboratively completes auto-programming tasks.
Each agent plays a distinct role in the software development cycle, collectively forming a virtual organisation.
By establishing a tree-structured thought distribution and development mechanism across project, module, and function levels, this framework offers a cost-effective and efficient solution.
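The tree-structured mechanism can be pictured as a recursion over the three levels: project tasks decompose into module tasks, module tasks into function tasks, and only the leaves emit code. A schematic sketch (agent callbacks are placeholders):

```python
# Schematic tree-structured development across project/module/function
# levels; `decompose` and `implement` stand in for LLM agent calls.
LEVELS = {"project": "module", "module": "function"}

def develop(task: str, level: str, decompose, implement) -> str:
    if level == "function":
        return implement(task)                 # leaf: generate actual code
    subtasks = decompose(task, LEVELS[level])  # split task for the next level
    return "\n\n".join(develop(t, LEVELS[level], decompose, implement) for t in subtasks)
```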
arXiv Detail & Related papers (2024-10-25T01:52:15Z)
- Self-Evolving Multi-Agent Collaboration Networks for Software Development
We introduce EvoMAC, a novel self-evolving paradigm for multi-agent collaboration (MAC) networks.
Inspired by traditional neural network training, EvoMAC obtains text-based environmental feedback.
We propose rSDE-Bench, a requirement-oriented software development benchmark.
arXiv Detail & Related papers (2024-10-22T12:20:23Z)
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer.
We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
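"Safe interaction with sandboxed environments" boils down to never executing agent code in the host session. The minimal stand-in below isolates execution in a subprocess with a timeout; it is only illustrative (a subprocess is not a real sandbox, and OpenHands itself uses containerized runtimes):

```python
# Minimal stand-in for sandboxed code execution: a subprocess with a
# timeout. Illustrative only; this is not true isolation.
import subprocess

def run_sandboxed(code: str, timeout_s: int = 10) -> tuple[bool, str]:
    try:
        result = subprocess.run(
            ["python", "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.returncode == 0, result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        return False, "timed out"
```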
arXiv Detail & Related papers (2024-07-23T17:50:43Z)
- Agentless: Demystifying LLM-based Software Engineering Agents
We build Agentless -- an agentless approach to automatically solve software development problems.
Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic three-phase process of localization, repair, and patch validation.
Our results on the popular SWE-bench Lite benchmark show that, surprisingly, the simplistic Agentless is able to achieve both the highest performance and low cost.
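The three phases chain together naturally; the schematic below mirrors that structure with placeholder LLM and test-runner callbacks (not the project's actual code):

```python
# Schematic of the Agentless localize -> repair -> validate pipeline;
# `llm` and `run_tests` are placeholder callbacks.
def localize(issue: str, repo_files: list[str], llm) -> list[str]:
    # Phase 1: ask the model which files are relevant to the issue.
    relevant = llm(f"List files relevant to: {issue}")
    return [f for f in repo_files if f in relevant]

def repair(issue: str, files: list[str], llm) -> list[str]:
    # Phase 2: sample candidate patches for the localized files.
    return [llm(f"Generate a patch for {f} fixing: {issue}") for f in files]

def validate(patches: list[str], run_tests):
    # Phase 3: return the first patch that passes the test suite.
    return next((p for p in patches if run_tests(p)), None)
```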
arXiv Detail & Related papers (2024-07-01T17:24:45Z)
- Agent-Driven Automatic Software Improvement
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, making them better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
- DevBench: A Comprehensive Benchmark for Software Development
DevBench is a benchmark that evaluates large language models (LLMs) across various stages of the software development lifecycle.
Empirical studies show that current LLMs, including GPT-4-Turbo, fail to solve the challenges presented within DevBench.
Our findings offer actionable insights for the future development of LLMs toward real-world programming applications.
arXiv Detail & Related papers (2024-03-13T15:13:44Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- Octopus: Embodied Vision-Language Programmer from Environmental Feedback
Embodied vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
We introduce Octopus, an embodied vision-language programmer that uses executable code generation as a medium to connect planning and manipulation.
Octopus is designed to 1) proficiently comprehend an agent's visual and textual task objectives, 2) formulate intricate action sequences, and 3) generate executable code.
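Using executable code as the medium means the model's plan is a program: it reads the visual observation and the goal, emits code against the environment's action API, and that code is executed to act. A deliberately tiny sketch (the real action API is environment-specific, and `exec` here is only for illustration):

```python
# Sketch of code-as-plan: the VLM emits Python against a provided
# action API, and executing it performs the manipulation.
def act(vlm, observation, goal: str, env_api: dict) -> None:
    plan_code = vlm(observation, f"Write Python that calls env_api to: {goal}")
    exec(plan_code, {"env_api": env_api})  # illustrative; unsafe outside a sandbox
```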
arXiv Detail & Related papers (2023-10-12T17:59:58Z)
- Copilot for Xcode: Exploring AI-Assisted Programming by Prompting Cloud-based Large Language Models
Copilot for Xcode is an AI-assisted programming tool for program composition and design to support human software developers.
By seamlessly integrating cloud-based Large Language Models (LLMs) with Apple's local development environment, Xcode, this tool enhances productivity and unleashes creativity for software development in the Apple software ecosystem.
arXiv Detail & Related papers (2023-07-08T09:11:19Z)
- Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
This paper focuses on transformer-based large language models (LLMs) trained using Big Code.
LLMs have played a crucial role in facilitating AI-assisted programming applications, including code generation, code completion, code translation, code refinement, code summarization, defect detection, and clone detection.
It explores the challenges and opportunities associated with incorporating NLP techniques with software naturalness in these applications.
arXiv Detail & Related papers (2023-07-04T21:26:51Z)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
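At inference time, a critic-guided procedure of this kind can be reduced to its skeleton: sample several candidate programs, score each with the critic, and keep the best. A schematic sketch (the paper's actual procedure also regenerates programs based on unit-test feedback):

```python
# Skeleton of critic-guided sampling; `generate` and `critic` are
# placeholder callbacks, not CodeRL's actual components.
def critic_sample(prompt: str, generate, critic, n: int = 8) -> str:
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=critic)  # keep the highest-scoring program
```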
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
- Technology Readiness Levels for Machine Learning Systems
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.