Identifying Uncertainty in Self-Adaptive Robotics with Large Language Models
- URL: http://arxiv.org/abs/2504.20684v1
- Date: Tue, 29 Apr 2025 12:07:39 GMT
- Title: Identifying Uncertainty in Self-Adaptive Robotics with Large Language Models
- Authors: Hassan Sartaj, Jalil Boudjadar, Mirgita Frasheri, Shaukat Ali, Peter Gorm Larsen
- Abstract summary: We evaluate the potential of large language models (LLMs) in enabling a systematic approach to identify uncertainties in self-adaptive robotics. We analyzed 10 advanced LLMs with varying capabilities across four industrial-sized robotics case studies. Results showed that practitioners agreed with 63-88% of the LLM responses and expressed strong interest in the practicality of LLMs for this purpose.
- Score: 4.638192191684079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Future self-adaptive robots are expected to operate in highly dynamic environments while effectively managing uncertainties. However, identifying the sources and impacts of uncertainties in such robotic systems and defining appropriate mitigation strategies is challenging due to the inherent complexity of self-adaptive robots and the lack of comprehensive knowledge about the various factors influencing uncertainty. Hence, practitioners often rely on intuition and past experiences from similar systems to address uncertainties. In this article, we evaluate the potential of large language models (LLMs) in enabling a systematic and automated approach to identify uncertainties in self-adaptive robotics throughout the software engineering lifecycle. For this evaluation, we analyzed 10 advanced LLMs with varying capabilities across four industrial-sized robotics case studies, gathering the practitioners' perspectives on the LLM-generated responses related to uncertainties. Results showed that practitioners agreed with 63-88% of the LLM responses and expressed strong interest in the practicality of LLMs for this purpose.
Related papers
- How Uncertain Is the Grade? A Benchmark of Uncertainty Metrics for LLM-Based Automatic Assessment [30.331175047465408]
The rapid rise of large language models (LLMs) is reshaping the landscape of automatic assessment in education. Output uncertainty is an inescapable challenge in automatic assessment. Unreliable or poorly calibrated uncertainty estimates can lead to unstable downstream interventions.
arXiv Detail & Related papers (2026-02-17T21:46:52Z) - RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation [47.79800816696372]
Real-world testing of manipulation policies is labor-intensive at scale, and difficult to reproduce. Existing simulation benchmarks are similarly limited, as they train and test policies within the same synthetic domains. In this paper, we introduce a new benchmarking framework that overcomes these challenges by shifting VLA evaluation into large-scale simulated augmented environments.
arXiv Detail & Related papers (2025-10-27T17:41:38Z) - Fundamentals of Building Autonomous LLM Agents [64.39018305018904]
This paper reviews the architecture and implementation methods of agents powered by large language models (LLMs). The research aims to explore patterns to develop "agentic" LLMs that can automate complex tasks and bridge the performance gap with human capabilities.
arXiv Detail & Related papers (2025-10-10T10:32:39Z) - Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation [68.106428321492]
Large language models (LLMs) demonstrate advanced reasoning abilities, enabling robots to understand natural language instructions and generate high-level plans with appropriate grounding. LLM hallucinations present a significant challenge, often leading to overconfident yet potentially misaligned or unsafe plans. We present Combined Uncertainty estimation for Reliable Embodied planning (CURE), which decomposes the uncertainty into epistemic and intrinsic uncertainty, each estimated separately.
arXiv Detail & Related papers (2025-10-09T10:26:58Z) - Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment [49.81946749379338]
This work seeks to analyze the capacity of Transformers-based systems to learn demographic biases present in the data. We propose a privacy-enhancing framework to reduce gender information from the learning pipeline as a way to mitigate biased behaviors in the final tools.
arXiv Detail & Related papers (2025-06-13T15:29:43Z) - LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection [0.0]
Using Large Language Models (LLMs) for vulnerability detection presents unique challenges. Previous attempts employing machine learning models for vulnerability detection have proven ineffective. We propose a robust AI-driven approach focused on mitigating these limitations.
arXiv Detail & Related papers (2025-04-25T15:30:40Z) - Safe LLM-Controlled Robots with Formal Guarantees via Reachability Analysis [0.6749750044497732]
This paper introduces a safety assurance framework for Large Language Model (LLM)-controlled robots based on data-driven reachability analysis. Our approach provides rigorous safety guarantees against unsafe behaviors without relying on explicit analytical models.
arXiv Detail & Related papers (2025-03-05T21:23:15Z) - Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication.
In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM Systems have already achieved human-level or even super-human persuasiveness.
Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - LLM-Driven Robots Risk Enacting Discrimination, Violence, and Unlawful Actions [3.1247504290622214]
Research has raised concerns about the potential for Large Language Models to produce discriminatory outcomes and unsafe behaviors in real-world robot experiments and applications.
We conduct an HRI-based evaluation of discrimination and safety criteria on several highly-rated LLMs.
Our results underscore the urgent need for systematic, routine, and comprehensive risk assessments and assurances to improve outcomes.
arXiv Detail & Related papers (2024-06-13T05:31:49Z) - The Role of Predictive Uncertainty and Diversity in Embodied AI and Robot Learning [10.271978575618169]
Uncertainty has long been a critical area of study in robotics, particularly when robots are equipped with analytical models.
This guide offers an overview of the importance of uncertainty and provides methods to quantify and evaluate it from an applications perspective.
arXiv Detail & Related papers (2024-05-06T05:04:59Z) - On the Vulnerability of LLM/VLM-Controlled Robotics [54.57914943017522]
We highlight vulnerabilities in robotic systems integrating large language models (LLMs) and vision-language models (VLMs) due to input modality sensitivities. Our results show that simple input perturbations reduce task execution success rates by 22.2% and 14.6% in two representative LLM/VLM-controlled robotic systems.
arXiv Detail & Related papers (2024-02-15T22:01:45Z) - Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners [85.03486419424647]
KnowNo is a framework for measuring and aligning the uncertainty of large language models.
KnowNo builds on the theory of conformal prediction to provide statistical guarantees on task completion.
arXiv Detail & Related papers (2023-07-04T21:25:12Z) - Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning [121.9708998627352]
Recent work has shown that, in practical robot learning applications, the effects of adversarial training do not pose a fair trade-off.
This work revisits the robustness-accuracy trade-off in robot learning by analyzing if recent advances in robust training methods and theory can make adversarial training suitable for real-world robot applications.
arXiv Detail & Related papers (2022-04-15T08:12:15Z) - Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning have led to its application in various use cases in manufacturing.
Most research focused on maximising predictive accuracy without addressing the uncertainty associated with it.
In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria of a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.