Related papers: Securing LLM-as-a-Service for Small Businesses: An Industry Case Study of a Distributed Chatbot Deployment Platform

Securing LLM-as-a-Service for Small Businesses: An Industry Case Study of a Distributed Chatbot Deployment Platform

URL: http://arxiv.org/abs/2601.15528v1
Date: Wed, 21 Jan 2026 23:29:32 GMT
Title: Securing LLM-as-a-Service for Small Businesses: An Industry Case Study of a Distributed Chatbot Deployment Platform
Authors: Jiazhu Xie, Bowen Li, Heyu Fu, Chong Gao, Ziqi Xu, Fengling Han,
Abstract summary: Large Language Model (LLM)-based question-answering systems offer significant potential for automating customer support and internal knowledge access in small businesses.<n>This paper presents an open-source, multi-tenant platform that enables small businesses to deploy customised LLM-based support chatbots via a no-code workflow.<n>The platform is built on distributed, lightweight k3s clusters spanning heterogeneous, low-cost machines and interconnected through an encrypted overlay network.
Score: 5.6063901772542835
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Model (LLM)-based question-answering systems offer significant potential for automating customer support and internal knowledge access in small businesses, yet their practical deployment remains challenging due to infrastructure costs, engineering complexity, and security risks, particularly in retrieval-augmented generation (RAG)-based settings. This paper presents an industry case study of an open-source, multi-tenant platform that enables small businesses to deploy customised LLM-based support chatbots via a no-code workflow. The platform is built on distributed, lightweight k3s clusters spanning heterogeneous, low-cost machines and interconnected through an encrypted overlay network, enabling cost-efficient resource pooling while enforcing container-based isolation and per-tenant data access controls. In addition, the platform integrates practical, platform-level defences against prompt injection attacks in RAG-based chatbots, translating insights from recent prompt injection research into deployable security mechanisms without requiring model retraining or enterprise-scale infrastructure. We evaluate the proposed platform through a real-world e-commerce deployment, demonstrating that secure and efficient LLM-based chatbot services can be achieved under realistic cost, operational, and security constraints faced by small businesses.

Related papers

MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era [74.42509044145417]
MegaFlow is a large-scale distributed orchestration system that enables efficient scheduling, resource allocation, and fine-grained task management for agent-environment workloads.<n>In our agent training deployments, MegaFlow successfully orchestrates tens of thousands of concurrent agent tasks while maintaining high system stability and achieving efficient resource utilization.
arXiv Detail & Related papers (2026-01-12T13:25:33Z)
Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems [54.916243942641444]
Large language models (LLMs) are emerging as key enablers of automation in domains such as telecommunications.<n>We study an edge-cloud-expert cascaded LLM-based knowledge system that supports decision-making through a question-and-answer pipeline.
arXiv Detail & Related papers (2025-12-23T03:10:09Z)
Small Language Models for Phishing Website Detection: Cost, Performance, and Privacy Trade-Offs [0.6299766708197881]
Phishing websites pose a major cybersecurity threat, exploiting unsuspecting users and causing significant financial and organisational harm.<n>Traditional machine learning approaches for phishing detection often require extensive feature engineering, continuous retraining, and costly infrastructure maintenance.<n>This paper investigates the feasibility of small language models (SLMs) for detecting phishing websites using only their raw HTML code.
arXiv Detail & Related papers (2025-11-19T13:45:07Z)
Proposing a Framework for Machine Learning Adoption on Legacy Systems [1.675857332621569]
The integration of machine learning (ML) is critical for industrial competitiveness, yet its adoption is frequently stalled by the prohibitive costs and operational disruptions of upgrading legacy systems.<n>This paper introduces a pragmatic, API-based framework designed to overcome these challenges by strategically decoupling the ML model lifecycle from the production environment.
arXiv Detail & Related papers (2025-09-29T03:08:23Z)
A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives [65.3369988566853]
Recent studies have demonstrated that adversaries can replicate a target model's functionality.<n>Model Extraction Attacks pose threats to intellectual property, privacy, and system security.<n>We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments.
arXiv Detail & Related papers (2025-08-20T19:49:59Z)
Enabling Secure and Ephemeral AI Workloads in Data Mesh Environments [3.322555975389833]
Many large enterprises have no efficient and effective way to support their Data and AI teams.<n>This paper proposes a key piece of the solution to the overall problem, in the form of an on-demand self-service data-platform infrastructure.
arXiv Detail & Related papers (2025-05-31T02:30:22Z)
Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and Directions [51.43521977132062]
Money laundering is a financial crime that obscures the origin of illicit funds.<n>The proliferation of mobile payment platforms and smart IoT devices has significantly complicated anti-money laundering investigations.<n>This paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML.
arXiv Detail & Related papers (2025-03-13T05:19:44Z)
Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks [1.3124479769761592]
We introduce a novel prototype designed to employ Large Language Model (LLM)-driven autonomous systems.<n>Our system represents the first demonstration of a fully autonomous, LLM-driven framework capable of compromising accounts.<n>We find that the associated costs are competitive with, and often significantly lower than, those incurred by professional human pen-testers.
arXiv Detail & Related papers (2025-02-06T17:12:43Z)
Sustainable and Intelligent Public Facility Failure Management System Based on Large Language Models [14.776153063614244]
This paper presents a new Large Language Model (LLM)-based Smart Device Management framework.<n>We demonstrate its practical applicability and its capacity to significantly reduce budgetary constraints on public facilities.<n>We plan to extend the framework's scope to include a wider array of public facilities and to integrate it with cutting-edge cybersecurity technologies.
arXiv Detail & Related papers (2025-01-08T02:30:37Z)
Large Language Model as a Catalyst: A Paradigm Shift in Base Station Siting Optimization [62.16747639440893]
Large language models (LLMs) and their associated technologies advance, particularly in the realms of prompt engineering and agent engineering.<n>Our proposed framework incorporates retrieval-augmented generation (RAG) to enhance the system's ability to acquire domain-specific knowledge and generate solutions.
arXiv Detail & Related papers (2024-08-07T08:43:32Z)
Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently. Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting. We propose a LLM-based Generative IoT (GIoT) system deployed in the local network setting in this study.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.