Building Specialized Software-Assistant ChatBot with Graph-Based Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2511.05297v1
- Date: Fri, 07 Nov 2025 14:56:45 GMT
- Title: Building Specialized Software-Assistant ChatBot with Graph-Based Retrieval-Augmented Generation
- Authors: Mohammed Hilel, Yannis Karmim, Jean De Bodinat, Reda Sarehane, Antoine Gillon,
- Abstract summary: We introduce a Graph-based Retrieval-Augmented Generation framework that automatically converts enterprise web applications into state-action knowledge graphs. The framework was co-developed with the AI enterprise RAKAM, in collaboration with Lemon Learning.
- Score: 0.815557531820863
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Digital Adoption Platforms (DAPs) have become essential tools for helping employees navigate complex enterprise software such as CRM, ERP, or HRMS systems. Companies like Lemon Learning have shown how digital guidance can reduce training costs and accelerate onboarding. However, building and maintaining these interactive guides still requires extensive manual effort. Leveraging Large Language Models as virtual assistants is an appealing alternative, yet without a structured understanding of the target software, LLMs often hallucinate and produce unreliable answers. Moreover, most production-grade LLMs are black-box APIs, making fine-tuning impractical due to the lack of access to model weights. In this work, we introduce a Graph-based Retrieval-Augmented Generation framework that automatically converts enterprise web applications into state-action knowledge graphs, enabling LLMs to generate grounded and context-aware assistance. The framework was co-developed with the AI enterprise RAKAM, in collaboration with Lemon Learning. We detail the engineering pipeline that extracts and structures software interfaces, the design of the graph-based retrieval process, and the integration of our approach into production DAP workflows. Finally, we discuss scalability, robustness, and deployment lessons learned from industrial use cases.
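The abstract's central idea, a state-action knowledge graph queried at retrieval time, can be illustrated with a minimal sketch. This is not the paper's implementation: the graph contents, the function names `find_action_path` and `build_prompt`, and the use of breadth-first search are all assumptions made for illustration. Nodes stand for UI states (screens) and edges for user actions; the retrieved action path is then placed in the prompt so the LLM answers from verified steps rather than guessing.

```python
from collections import deque

# Hypothetical state-action graph: keys are UI states, values are
# (action, next_state) pairs. All names are illustrative, not from the paper.
GRAPH = {
    "Home": [("click 'Contacts'", "ContactList")],
    "ContactList": [("click 'New'", "ContactForm"), ("click 'Home'", "Home")],
    "ContactForm": [("fill fields and click 'Save'", "ContactSaved")],
    "ContactSaved": [],
}


def find_action_path(graph, start, goal):
    """Breadth-first search: shortest list of actions from start to goal, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in graph.get(state, []):
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, actions + [action]))
    return None  # goal is unreachable from start


def build_prompt(question, actions):
    """Embed the retrieved steps in a prompt (assumed format) to ground the answer."""
    steps = "\n".join(f"{i}. {a}" for i, a in enumerate(actions, 1))
    return f"Answer using only these verified steps:\n{steps}\n\nQuestion: {question}"


path = find_action_path(GRAPH, "Home", "ContactSaved")
prompt = build_prompt("How do I create a contact?", path)
```

In a real system the goal state would itself have to be retrieved from the user's question (for example by embedding similarity over node descriptions); here it is passed in directly to keep the sketch short.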
Related papers
- Context-Aware Visual Prompting: Automating Geospatial Web Dashboards with Large Language Models and Agent Self-Validation for Decision Support [1.506501956463029]
The development of web-based dashboards for risk analysis and decision making is often hampered by the difficulty of handling big, multidimensional data. We introduce a generative AI framework that automates the creation of interactive geospatial dashboards from user-defined inputs.
arXiv Detail & Related papers (2025-10-10T10:58:15Z)
- AI-Guided Exploration of Large-Scale Codebases [0.0]
Recent advancements in large language models (LLMs) offer new opportunities to enhance code exploration. This work introduces a hybrid approach that integrates reverse engineering with LLM-guided, intent-aware visual exploration.
arXiv Detail & Related papers (2025-08-07T19:15:37Z)
- Graph-Augmented Large Language Model Agents: Current Progress and Future Prospects [57.53024716739594]
Graph-augmented LLM Agents (GLA) enhance structure, continuity, and coordination in complex agent systems. This paper offers a timely and comprehensive overview of recent advances and highlights key directions for future work. We hope this paper can serve as a roadmap for future research on GLA and foster a deeper understanding of the role of graphs in GLA agent systems.
arXiv Detail & Related papers (2025-07-29T00:27:12Z)
- Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs [63.10710876536337]
We propose an offline simulation framework to curate a software-specific skillset, a collection of verified scripts. Our framework comprises two components: (1) task creation, using top-down functionality and bottom-up API synergy exploration to generate helpful tasks. Experiments with Adobe Illustrator demonstrate that our framework significantly improves automation success rates, reduces response time, and saves runtime token costs.
arXiv Detail & Related papers (2025-04-29T04:03:37Z)
- From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework [1.4367082420201918]
This paper introduces a knowledge-augmented code generation framework for complex GIS applications. The framework retrieves software engineering best practices, domain knowledge, and advanced technology stacks from a specialized knowledge base.
arXiv Detail & Related papers (2025-02-12T19:59:57Z)
- AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials [53.376263056033046]
Existing approaches rely on expensive human annotation, making them unsustainable at scale. We propose AgentTrek, a scalable data synthesis pipeline that generates web agent trajectories by leveraging publicly available tutorials. Our fully automated approach significantly reduces data collection costs, achieving a cost of just $0.55 per high-quality trajectory without human annotators.
arXiv Detail & Related papers (2024-12-12T18:59:27Z)
- Control Industrial Automation System with Large Language Model Agents [2.2369578015657954]
This paper introduces a framework for integrating large language models with industrial automation systems. At the core of the framework are an agent system designed for industrial tasks, a structured prompting method, and an event-driven information modeling mechanism. Our contribution includes a formal system design, proof-of-concept implementation, and a method for generating task-specific datasets.
arXiv Detail & Related papers (2024-09-26T16:19:37Z)
- SOLO: A Single Transformer for Scalable Vision-Language Modeling [74.05173379908703]
We present SOLO, a single transformer for visiOn-Language mOdeling. A unified single Transformer architecture, like SOLO, effectively addresses these scalability concerns in LVLMs. In this paper, we introduce the first open-source training recipe for developing SOLO, an open-source 7B LVLM.
arXiv Detail & Related papers (2024-07-08T22:40:15Z)
- Tool Learning in the Wild: Empowering Language Models as Automatic Tool Agents [56.822238860147024]
Augmenting large language models with external tools has emerged as a promising approach to extend their utility. Previous methods manually parse tool documentation and create in-context demonstrations, transforming tools into structured formats for LLMs to use in their step-by-step reasoning. We propose AutoTools, a framework that enables LLMs to automate the tool-use workflow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper, we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using a basic cross-platform tensor framework and script language engines.
This approach, however, does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Exploring the potential of flow-based programming for machine learning deployment in comparison with service-oriented architectures [8.677012233188968]
We argue that part of the reason is infrastructure that was not designed for activities around data collection and analysis.
We propose to consider flow-based programming with data streams as an alternative to commonly used service-oriented architectures for building software applications.
arXiv Detail & Related papers (2021-08-09T15:06:02Z)