Related papers: Keeping it Local, Tiny and Real: Automated Report Generation on Edge Computing Devices for Mechatronic-Based Cognitive Systems

Keeping it Local, Tiny and Real: Automated Report Generation on Edge Computing Devices for Mechatronic-Based Cognitive Systems

URL: http://arxiv.org/abs/2511.02507v1
Date: Tue, 04 Nov 2025 11:49:41 GMT
Title: Keeping it Local, Tiny and Real: Automated Report Generation on Edge Computing Devices for Mechatronic-Based Cognitive Systems
Authors: Nicolas Schuler, Lea Dewald, Jürgen Graf,
Abstract summary: We propose a pipeline for generating automated reports in natural language utilizing various multi-modal sensors.<n>In particular, we evaluate our implementation on a diverse dataset spanning multiple domains including indoor, outdoor and urban environments.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements in Deep Learning enable hardware-based cognitive systems, that is, mechatronic systems in general and robotics in particular with integrated Artificial Intelligence, to interact with dynamic and unstructured environments. While the results are impressive, the application of such systems to critical tasks like autonomous driving as well as service and care robotics necessitate the evaluation of large amount of heterogeneous data. Automated report generation for Mobile Robotics can play a crucial role in facilitating the evaluation and acceptance of such systems in various domains. In this paper, we propose a pipeline for generating automated reports in natural language utilizing various multi-modal sensors that solely relies on local models capable of being deployed on edge computing devices, thus preserving the privacy of all actors involved and eliminating the need for external services. In particular, we evaluate our implementation on a diverse dataset spanning multiple domains including indoor, outdoor and urban environments, providing quantitative as well as qualitative evaluation results. Various generated example reports and other supplementary materials are available via a public repository.

Related papers

Agentic Artificial Intelligence (AI): Architectures, Taxonomies, and Evaluation of Large Language Model Agents [14.448267395835721]
We propose a unified taxonomy that breaks agents into Perception, Brain, Planning, Action, Tool Use, and Collaboration.<n>We also group the environments in which these agents operate, including digital operating systems, embodied robotics, and other specialized domains.
arXiv Detail & Related papers (2026-01-18T19:51:16Z)
Multifunctional physical reservoir computing in soft tensegrity robots [2.0686733932673604]
Recent studies have demonstrated that the dynamics of physical systems can be utilized for the desired information processing.<n>In this study, we extend this approach to control and embed not only one but also multiple behaviors into a type of soft robot called a tensegrity robot.<n>We show that there exist "untrained attractors" in the state space of the system outside the training data.
arXiv Detail & Related papers (2025-07-29T04:33:02Z)
Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots [50.0783429451902]
Humanoid robot technology is advancing rapidly, with manufacturers introducing diverse visual perception modules tailored to specific scenarios.<n> occupancy-based representation has become widely recognized as particularly suitable for humanoid robots, as it provides both rich semantic and 3D geometric information essential for comprehensive environmental understanding.<n>We present Humanoid Occupancy, a generalized multimodal occupancy perception system that integrates hardware and software components, data acquisition devices, and a dedicated annotation pipeline.
arXiv Detail & Related papers (2025-07-27T10:47:00Z)
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments [26.66666135624716]
We present Robot Utility Models (RUMs), a framework for training and deploying zero-shot robot policies. RUMs can generalize to new environments without any finetuning. We train five utility models for opening cabinet doors, opening drawers, picking up napkins, picking up paper bags, and reorienting fallen objects.
arXiv Detail & Related papers (2024-09-09T17:59:50Z)
Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data. We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis [82.59451639072073]
General-purpose robots operate seamlessly in any environment, with any object, and utilize various skills to complete diverse tasks. As a community, we have been constraining most robotic systems by designing them for specific tasks, training them on specific datasets, and deploying them within specific environments. Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models, we devote this survey to exploring how foundation models can be applied to general-purpose robotics.
arXiv Detail & Related papers (2023-12-14T10:02:55Z)
Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future [130.87142103774752]
This review systematically assesses over seventy open-source autonomous driving datasets. It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets. It also delves into the scientific and technical challenges that warrant resolution.
arXiv Detail & Related papers (2023-12-06T10:46:53Z)
HARPS: An Online POMDP Framework for Human-Assisted Robotic Planning and Sensing [1.3678064890824186]
The Human Assisted Robotic Planning and Sensing (HARPS) framework is presented for active semantic sensing and planning in human-robot teams. This approach lets humans opportunistically impose model structure and extend the range of semantic soft data in uncertain environments. Simulations of a UAV-enabled target search application in a large-scale partially structured environment show significant improvements in time and belief state estimates.
arXiv Detail & Related papers (2021-10-20T00:41:57Z)
Cognitive architecture aided by working-memory for self-supervised multi-modal humans recognition [54.749127627191655]
The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions. Deep learning networks have achieved state-of-the-art results and demonstrated to be suitable tools to address such a task. One solution is to make robots learn from their first-hand sensory data with self-supervision.
arXiv Detail & Related papers (2021-03-16T13:50:24Z)
Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents [61.36681529571202]
We describe a new concept for reproducible robotics research that integrates development and benchmarking. One of the central components of this setup is the Duckietown Autolab, a standardized setup that is itself relatively low-cost and reproducible. We validate the system by analyzing the repeatability of experiments conducted using the infrastructure and show that there is low variance across different robot hardware and across different remote labs.
arXiv Detail & Related papers (2020-09-09T15:31:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.