Related papers: PRISM-XR: Empowering Privacy-Aware XR Collaboration with Multimodal Large Language Models

PRISM-XR: Empowering Privacy-Aware XR Collaboration with Multimodal Large Language Models

URL: http://arxiv.org/abs/2602.10154v1
Date: Mon, 09 Feb 2026 21:28:02 GMT
Title: PRISM-XR: Empowering Privacy-Aware XR Collaboration with Multimodal Large Language Models
Authors: Jiangong Chen, Mingyu Zhu, Bin Li,
Abstract summary: PRISM-XR is a novel framework that facilitates multi-user collaboration in XR environments by providing privacy-aware MLLM integration.<n>Our numerical evaluation results indicate that the proposed platform achieves nearly 90% accuracy in fulfilling user requests.
Score: 8.808170696228865
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multimodal Large Language Models (MLLMs) enhance collaboration in Extended Reality (XR) environments by enabling flexible object and animation creation through the combination of natural language and visual inputs. However, visual data captured by XR headsets includes real-world backgrounds that may contain irrelevant or sensitive user information, such as credit cards left on the table or facial identities of other users. Uploading those frames to cloud-based MLLMs poses serious privacy risks, particularly when such data is processed without explicit user consent. Additionally, existing colocation and synchronization mechanisms in commercial XR APIs rely on time-consuming, privacy-invasive environment scanning and struggle to adapt to the highly dynamic nature of MLLM-integrated XR environments. In this paper, we propose PRISM-XR, a novel framework that facilitates multi-user collaboration in XR by providing privacy-aware MLLM integration. PRISM-XR employs intelligent frame preprocessing on the edge server to filter sensitive data and remove irrelevant context before communicating with cloud generative AI models. Additionally, we introduce a lightweight registration process and a fully customizable content-sharing mechanism to enable efficient, accurate, and privacy-preserving content synchronization among users. Our numerical evaluation results indicate that the proposed platform achieves nearly 90% accuracy in fulfilling user requests and less than 0.27 seconds registration time while maintaining spatial inconsistencies of less than 3.5 cm. Furthermore, we conducted an IRB-approved user study with 28 participants, demonstrating that our system could automatically filter highly sensitive objects in over 90% of scenarios while maintaining strong overall usability.

Related papers

Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym.<n>Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process.<n>We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation [72.34977512403643]
Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for enhancing large language models (LLMs) by retrieving relevant documents from an external corpus.<n>Existing RAG systems primarily focus on unimodal text documents, and often fall short in real-world scenarios where both queries and documents may contain mixed modalities (such as text and images)<n>We propose Nyx, a unified mixed-modal to mixed-modal retriever tailored for Universal Retrieval-Augmented Generation scenarios.
arXiv Detail & Related papers (2025-10-20T09:56:43Z)
Edge-Assisted Collaborative Fine-Tuning for Multi-User Personalized Artificial Intelligence Generated Content (AIGC) [38.59865959433328]
Cloud-based solutions aid in computation but often fall short in addressing privacy risks, personalization efficiency, and communication costs.<n>We propose a novel cluster-aware hierarchical federated aggregation framework.<n>We show that the framework achieves accelerated convergence while maintaining practical viability for scalable multi-user personalized AIGC services.
arXiv Detail & Related papers (2025-08-06T06:07:24Z)
RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing [133.0641538589466]
RMTBench is a comprehensive textbfuser-centric bilingual role-playing benchmark featuring 80 diverse characters and over 8,000 dialogue rounds.<n>Our benchmark constructs dialogues based on explicit user motivations rather than character descriptions, ensuring alignment with practical user applications.<n>By shifting focus from character background to user intention fulfillment, RMTBench bridges the gap between academic evaluation and practical deployment requirements.
arXiv Detail & Related papers (2025-07-27T16:49:47Z)
CoSteer: Collaborative Decoding-Time Personalization via Local Delta Steering [80.54309860395763]
CoSteer is a novel collaborative framework that enables decoding-time personalization through localized delta steering.<n>We formulate token-level optimization as an online learning problem, where local delta vectors dynamically adjust the remote LLM's logits.<n>This approach preserves privacy by transmitting only the final steered tokens rather than raw data or intermediate vectors.
arXiv Detail & Related papers (2025-07-07T08:32:29Z)
PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models [10.050972891318324]
We propose a privacy preservation pipeline for protecting privacy and sensitive information during interactions between users and large language models.<n>We construct SensitiveQA, the first privacy open-ended question-answering dataset.<n>Our proposed solution employs a multi-stage strategy aimed at preemptively securing user information while simultaneously preserving the response quality of cloud-based LLMs.
arXiv Detail & Related papers (2025-02-19T09:17:07Z)
IRIS: An Immersive Robot Interaction System [29.868721218549993]
IRIS supports immersive interaction and data collection across diverse simulators and real-world scenarios.<n>It visualizes arbitrary rigid and deformable objects, robots from simulation, and integrates real-time sensor-generated point clouds for real-world applications.
arXiv Detail & Related papers (2025-02-05T15:56:26Z)
LLMER: Crafting Interactive Extended Reality Worlds with JSON Data Generated by Large Language Models [22.53412407516448]
The integration of Large Language Models (LLMs) with Extended Reality (XR) technologies offers the potential to build truly immersive XR environments.<n>The complexity of XR environments makes it difficult to accurately extract relevant contextual data and scene/object parameters from an overwhelming volume of XR artifacts.<n>To overcome these challenges, we introduce a novel framework that creates interactive worlds using LLMERs.
arXiv Detail & Related papers (2025-02-04T16:08:48Z)
Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework [24.02808692450192]
We present Explainable XR, an end-to-end framework for analyzing user behavior in diverse XR environments.<n> Explainable XR addresses challenges in handling cross-virtuality - AR, VR, MR - transitions, multi-user collaborative application scenarios.
arXiv Detail & Related papers (2025-01-23T15:55:07Z)
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining [67.87810796668981]
Information-Sensitive Cropping (ISC) and Self-Refining Dual Learning (SRDL)<n>Iris achieves state-of-the-art performance across multiple benchmarks with only 850K GUI annotations.<n>These improvements translate to significant gains in both web and OS agent downstream tasks.
arXiv Detail & Related papers (2024-12-13T18:40:10Z)
Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Anonymizing text that contains sensitive information is crucial for a wide range of applications.<n>Existing techniques face the emerging challenges of the re-identification ability of large language models.<n>We propose a framework composed of three key components: a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
Embedding Large Language Models into Extended Reality: Opportunities and Challenges for Inclusion, Engagement, and Privacy [37.061999275101904]
We argue for using large language models in XR by embedding them in avatars or as narratives to facilitate inclusion. We speculate that combining the information provided to LLM-powered spaces by users and the biometric data obtained might lead to novel privacy invasions.
arXiv Detail & Related papers (2024-02-06T11:19:40Z)
Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion [68.45737688496654]
We present a modular interactive VOS framework which decouples interaction-to-mask and mask propagation. We show that our method outperforms current state-of-the-art algorithms while requiring fewer frame interactions.
arXiv Detail & Related papers (2021-03-14T14:39:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.