XR Blocks: Accelerating Human-centered AI + XR Innovation
- URL: http://arxiv.org/abs/2509.25504v1
- Date: Mon, 29 Sep 2025 21:00:53 GMT
- Title: XR Blocks: Accelerating Human-centered AI + XR Innovation
- Authors: David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongyi Zhou, Evgenii Alekseev, Geonsun Lee, Alex Cooper, Min Xia, Scott Chung, Jeremy Nelson, Xiuxiu Yuan, Jolica Dias, Tim Bettridge, Benjamin Hersh, Michelle Huynh, Konrad Piascik, Ricardo Cabello, David Kim, Ruofei Du,
- Abstract summary: XR Blocks is a cross-platform framework designed to accelerate human-centered AI + XR innovation. It provides a modular architecture with plug-and-play components for the core abstractions in AI + XR: user, world, peers; interface, context, and agents.
- Score: 15.103185935604323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We are on the cusp where Artificial Intelligence (AI) and Extended Reality (XR) are converging to unlock new paradigms of interactive computing. However, a significant gap exists between the ecosystems of these two fields: while AI research and development is accelerated by mature frameworks like JAX and benchmarks like LMArena, prototyping novel AI-driven XR interactions remains a high-friction process, often requiring practitioners to manually integrate disparate, low-level systems for perception, rendering, and interaction. To bridge this gap, we present XR Blocks, a cross-platform framework designed to accelerate human-centered AI + XR innovation. XR Blocks strives to provide a modular architecture with plug-and-play components for the core abstractions in AI + XR: user, world, peers; interface, context, and agents. Crucially, it is designed with the mission of "reducing frictions from idea to reality", thus accelerating rapid prototyping of AI + XR apps. Built upon accessible technologies (WebXR, three.js, TensorFlow, Gemini), our toolkit lowers the barrier to entry for XR creators. We demonstrate its utility through a set of open-source templates, samples, and advanced demos, empowering the community to quickly move from concept to interactive XR prototype. Site: https://xrblocks.github.io
Related papers
- Recent Advances and Future Directions in Extended Reality (XR): Exploring AI-Powered Spatial Intelligence [0.0]
Extended Reality (XR), encompassing Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR), is a transformative technology bridging the physical and virtual worlds. This review examines XR's evolution through its foundational framework: hardware ranging from monitors to sensors, and software ranging from visual tasks to user interfaces. For future directions, attention should be given to the integration of multi-modal AI and IoT-driven digital twins to enable adaptive XR systems.
arXiv Detail & Related papers (2025-04-22T15:11:55Z)
- CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR [31.49021749468963]
Large language model (LLM)-powered non-player characters (NPCs) with speech-to-text (STT) and text-to-speech (TTS) models bring significant advantages over conventional or pre-scripted NPCs for facilitating more natural conversational user interfaces (CUIs) in XR. This paper provides the community with an open-source, customizable, extendable, and privacy-aware Unity package, CUIfy, that facilitates speech-based NPC-user interaction with widely used LLM, STT, and TTS models.
arXiv Detail & Related papers (2024-11-07T12:55:17Z)
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in ways similar to a human developer. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z)
- GRUtopia: Dream General Robots in a City at Scale [65.08318324604116]
This paper introduces project GRUtopia, the first simulated interactive 3D society designed for various robots.
GRScenes includes 100k interactive, finely annotated scenes, which can be freely combined into city-scale environments.
GRResidents is a Large Language Model (LLM) driven Non-Player Character (NPC) system that is responsible for social interaction.
arXiv Detail & Related papers (2024-07-15T17:40:46Z)
- ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts.
We leverage natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface.
arXiv Detail & Related papers (2024-06-28T08:28:38Z)
- Artificial General Intelligence (AGI)-Native Wireless Systems: A Journey Beyond 6G [58.440115433585824]
Building future wireless systems that support services like digital twins (DTs) is challenging to achieve through advances to conventional technologies like meta-surfaces.
While artificial intelligence (AI)-native networks promise to overcome some limitations of wireless technologies, developments still rely on AI tools like neural networks.
This paper revisits the concept of AI-native wireless systems, equipping them with the common sense necessary to transform them into artificial general intelligence (AGI)-native systems.
arXiv Detail & Related papers (2024-04-29T04:51:05Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RoboScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- AtomXR: Streamlined XR Prototyping with Natural Language and Immersive Physical Interaction [2.02671066150924]
AtomXR is a streamlined, immersive, no-code XR prototyping tool designed to empower developers in creating applications using natural language, eye-gaze, and touch interactions.
AtomXR consists of: 1) AtomScript, a high-level human-interpretable scripting language for rapid prototyping, 2) a natural language interface that integrates LLMs and multimodal inputs for AtomScript generation, and 3) an immersive in-headset authoring environment.
Empirical evaluation through two user studies offers insights into natural-language-based and immersive prototyping, and shows that AtomXR provides significant improvements in speed and user experience compared to traditional systems.
arXiv Detail & Related papers (2023-11-19T05:52:25Z)
- Actor-Critic Network for O-RAN Resource Allocation: xApp Design, Deployment, and Analysis [3.8073142980733]
Open Radio Access Network (O-RAN) has introduced an emerging RAN architecture that enables openness, intelligence, and automated control.
The RAN Intelligent Controller (RIC) provides the platform to design and deploy RAN controllers.
xApps are the applications that take on this responsibility by leveraging machine learning (ML) algorithms and acting in near-real time.
arXiv Detail & Related papers (2022-09-26T19:12:18Z)
- A User-Centred Framework for Explainable Artificial Intelligence in Human-Robot Interaction [70.11080854486953]
We propose a user-centred framework for XAI that focuses on its social-interactive aspect. The framework aims to provide a structure for interactive XAI solutions intended for non-expert users.
arXiv Detail & Related papers (2021-09-27T09:56:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.