Prompt Migration: Stabilizing GenAI Applications with Evolving Large Language Models
- URL: http://arxiv.org/abs/2507.05573v1
- Date: Tue, 08 Jul 2025 01:20:12 GMT
- Title: Prompt Migration: Stabilizing GenAI Applications with Evolving Large Language Models
- Authors: Shivani Tripathi, Pushpanjali Nema, Aditya Halder, Shi Qiao, Alekh Jindal,
- Abstract summary: Generative AI is transforming business applications by enabling natural language interfaces and intelligent automation.<n>The underlying large language models (LLMs) are evolving rapidly and so prompting them consistently is a challenge.<n>This paper introduces the concept of prompt migration as a systematic approach to stabilizing GenAI applications amid changing LLMs.
- Score: 1.9471700863629544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative AI is transforming business applications by enabling natural language interfaces and intelligent automation. However, the underlying large language models (LLMs) are evolving rapidly and so prompting them consistently is a challenge. This leads to inconsistent and unpredictable application behavior, undermining the reliability that businesses require for mission-critical workflows. In this paper, we introduce the concept of prompt migration as a systematic approach to stabilizing GenAI applications amid changing LLMs. Using the Tursio enterprise search application as a case study, we analyze the impact of successive GPT model upgrades, detail our migration framework including prompt redesign and a migration testbed, and demonstrate how these techniques restore application consistency. Our results show that structured prompt migration can fully recover the application reliability that was lost due to model drift. We conclude with practical lessons learned, emphasizing the need for prompt lifecycle management and robust testing to ensure dependable GenAI-powered business applications.
Related papers
- Analysis of AI Techniques for Orchestrating Edge-Cloud Application Migration [0.196629787330046]
We identify, analyze and compare selected state-of-the-art Artificial Intelligence (AI) planning and Reinforcement Learning (RL) approaches for solving the class of edge-cloud application migration problems.<n>The aim is to understand available techniques capable of orchestrating such application migration in emerging computing environments.
arXiv Detail & Related papers (2025-07-14T10:03:23Z) - LLMShot: Reducing snapshot testing maintenance via LLMs [0.5218155982819203]
Snapshot testing has emerged as a critical technique for UI validation in modern software development.<n>This paper introduces LLMShot, a novel framework that leverages Vision-Language Models (VLMs) to automatically analyze snapshot test failures.
arXiv Detail & Related papers (2025-07-14T08:47:19Z) - SV-LLM: An Agentic Approach for SoC Security Verification using Large Language Models [8.912091484067508]
We introduce SV-LLM, a novel multi-agent assistant system designed to automate and enhance system-on-chip (SoC) security verification.<n>By integrating specialized agents for tasks like verification question answering, security asset identification, threat modeling, test plan and property generation, vulnerability detection, and simulation-based bug validation, SV-LLM streamlines the workflow.<n>The system aims to reduce manual intervention, improve accuracy, and accelerate security analysis, supporting proactive identification and mitigation of risks early in the design cycle.
arXiv Detail & Related papers (2025-06-25T13:31:13Z) - AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning [82.42421823672954]
AgentCPM-GUI is built for robust and efficient on-device GUI interaction.<n>Our training pipeline includes grounding-aware pre-training to enhance perception.<n>AgentCPM-GUI achieves state-of-the-art performance on five public benchmarks.
arXiv Detail & Related papers (2025-06-02T07:30:29Z) - MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering [57.156093929365255]
Gym-style framework for systematically reinforcement learning, evaluating, and improving autonomous large language model (LLM) agents.<n>MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios.<n>Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-05-12T17:35:43Z) - Invisible Traces: Using Hybrid Fingerprinting to identify underlying LLMs in GenAI Apps [0.0]
Fingerprinting of Large Language Models (LLMs) has become essential for ensuring the security and transparency of AI-integrated applications.<n>We introduce a novel fingerprinting framework designed to address these challenges by integrating static and dynamic fingerprinting techniques.<n>Our approach identifies architectural features and behavioral traits, enabling accurate and robust fingerprinting of LLMs in dynamic environments.
arXiv Detail & Related papers (2025-01-30T19:15:41Z) - GUI Agents with Foundation Models: A Comprehensive Survey [91.97447457550703]
This survey consolidates recent research on (M)LLM-based GUI agents.<n>We identify key challenges and propose future research directions.<n>We hope this survey will inspire further advancements in the field of (M)LLM-based GUI agents.
arXiv Detail & Related papers (2024-11-07T17:28:10Z) - RETAIN: Interactive Tool for Regression Testing Guided LLM Migration [8.378294455013284]
RETAIN (REgression Testing guided LLM migrAtIoN) is a tool designed explicitly for regression testing in LLM Migrations.
Our automatic evaluation and empirical user studies demonstrate that RETAIN, when compared to manual evaluation, enabled participants to identify twice as many errors, facilitated experimentation with 75% more prompts, and achieves 12% higher metric scores in a given time frame.
arXiv Detail & Related papers (2024-09-05T22:22:57Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-ization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the textttLlama2-7b-chat model on nine different languages from the MUST-C dataset.
The results show that LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization [103.70896967077294]
This paper introduces a principled framework for reinforcing large language agents by learning a retrospective model.
Our proposed agent architecture learns from rewards across multiple environments and tasks, for fine-tuning a pre-trained language model.
Experimental results on various tasks demonstrate that the language agents improve over time.
arXiv Detail & Related papers (2023-08-04T06:14:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.