GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
- URL: http://arxiv.org/abs/2603.01059v1
- Date: Sun, 01 Mar 2026 11:29:25 GMT
- Title: GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
- Authors: Zhuokang Shen, Yifan Wang, Hanyu Chen, Wenxuan Huang, Shaohui Lin,
- Abstract summary: GroupGPT is a token-efficient and privacy-preserving agentic framework for multi-user chat assistant.<n>We introduce MUIR, a benchmark dataset for multi-user chat assistant intervention reasoning.<n>Experiments demonstrate that GroupGPT produces accurate and well-timed responses.
- Score: 18.992005077080588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in large language models (LLMs) have enabled increasingly capable chatbots. However, most existing systems focus on single-user settings and do not generalize well to multi-user group chats, where agents require more proactive and accurate intervention under complex, evolving contexts. Existing approaches typically rely on LLMs for both reasoning and generation, leading to high token consumption, limited scalability, and potential privacy risks. To address these challenges, we propose GroupGPT, a token-efficient and privacy-preserving agentic framework for multi-user chat assistant. GroupGPT adopts a small-large model collaborative architecture to decouple intervention timing from response generation, enabling efficient and accurate decision-making. The framework also supports multimodal inputs, including memes, images, videos, and voice messages. We further introduce MUIR, a benchmark dataset for multi-user chat assistant intervention reasoning. MUIR contains 2,500 annotated group chat segments with intervention labels and rationales, supporting evaluation of timing accuracy and response quality. We evaluate a range of models on MUIR, from large language models to smaller counterparts. Extensive experiments demonstrate that GroupGPT produces accurate and well-timed responses, achieving an average score of 4.72/5.0 in LLM-based evaluation, and is well received by users across diverse group chat scenarios. Moreover, GroupGPT reduces token usage by up to 3 times compared to baseline methods, while providing privacy sanitization of user messages before cloud transmission. Code is available at: https://github.com/Eliot-Shen/GroupGPT .
Related papers
- GCAgent: Enhancing Group Chat Communication through Dialogue Agents System [24.954331957757645]
GCAgent is a large language model (LLMs)-driven system for enhancing group chats communication with both entertainment- and utility-oriented dialogue agents.<n>The system comprises three tightly integrated modules: Agent Builder, Dialogue Manager, and Interface plugins.<n>In real-world deployments over 350 days, it increased message volume by 28.80%, significantly improving group activity and engagement.
arXiv Detail & Related papers (2026-03-05T14:55:57Z) - AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding [73.05946667683259]
Recent large language models (MLLMs) show strong perception but struggle in multi-speaker, dialogue-centric settings.<n>We introduce AMUSE, a benchmark designed around tasks that are inherently agentic.<n>We propose RAFT, a data-efficient agentic alignment framework that integrates reward optimization with intrinsic multimodal self-evaluation.
arXiv Detail & Related papers (2025-12-18T07:01:47Z) - Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations [70.94563079082751]
E-commerce has exposed the limitations of traditional product retrieval systems in managing complex, multi-turn user interactions.<n>We propose a novel framework that introduces test-time scaling into conversational multimodal product retrieval.<n>Our approach builds on a generative retriever, further augmented with a test-time reranking mechanism that improves retrieval accuracy and better aligns results with evolving user intent throughout the dialogue.
arXiv Detail & Related papers (2025-08-25T15:38:56Z) - A Framework for Generating Conversational Recommendation Datasets from Behavioral Interactions [2.0693204407592836]
We present ConvRecStudio, a framework that simulates realistic, multi-turn dialogs grounded in timestamped user-item interactions and reviews.<n>We apply ConvRecStudio to three domains -- MobileRec, Yelp, and Amazon Electronics -- producing over 12K multi-turn dialogs per dataset.
arXiv Detail & Related papers (2025-06-14T22:58:48Z) - Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale [53.059480071818136]
Large Language Models (LLMs) have emerged as personalized assistants for users across a wide range of tasks.<n> PERSONAMEM features curated user profiles with over 180 simulated user-LLM interaction histories.<n>We evaluate LLM chatbots' ability to identify the most suitable response according to the current state of the user's profile.
arXiv Detail & Related papers (2025-04-19T08:16:10Z) - Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models [66.24055500785657]
Traditional turn-based chat systems prevent users from verbally interacting with system while it is generating responses.
To overcome these limitations, we adapt existing LLMs to listen users while generating output and provide users with instant feedback.
We build a dataset consisting of alternating time slices of queries and responses as well as covering typical feedback types in instantaneous interactions.
arXiv Detail & Related papers (2024-06-22T03:20:10Z) - Multi-User Chat Assistant (MUCA): a Framework Using LLMs to Facilitate Group Conversations [3.6975712141698445]
Multi-User Chat Assistant (MUCA) is an LLM-based framework tailored for group discussions.
MUCA consists of three main modules: Sub-topic Generator, Dialog Analyzer, and Conversational Strategies Arbitrator.
This paper further proposes an LLM-based Multi-User Simulator (MUS) to ease MUCA's optimization.
arXiv Detail & Related papers (2024-01-10T02:22:21Z) - Enhancing Chat Language Models by Scaling High-quality Instructional
Conversations [91.98516412612739]
We first provide a systematically designed, diverse, informative, large-scale dataset of instructional conversations, UltraChat.
Our objective is to capture the breadth of interactions that a human might have with an AI assistant.
We fine-tune a LLaMA model to create a powerful conversational model, UltraLLaMA.
arXiv Detail & Related papers (2023-05-23T16:49:14Z) - A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on
Reasoning, Hallucination, and Interactivity [79.12003701981092]
We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP application tasks.
We evaluate the multitask, multilingual and multi-modal aspects of ChatGPT based on these data sets and a newly designed multimodal dataset.
ChatGPT is 63.41% accurate on average in 10 different reasoning categories under logical reasoning, non-textual reasoning, and commonsense reasoning.
arXiv Detail & Related papers (2023-02-08T12:35:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.