Agentic AI for Self-Driving Laboratories in Soft Matter: Taxonomy, Benchmarks, and Open Challenges
- URL: http://arxiv.org/abs/2601.17920v1
- Date: Sun, 25 Jan 2026 17:44:19 GMT
- Title: Agentic AI for Self-Driving Laboratories in Soft Matter: Taxonomy, Benchmarks, and Open Challenges
- Authors: Xuanzhou Chen, Audrey Wang, Stanley Yin, Hanyang Jiang, Dong Zhang,
- Abstract summary: Self-driving laboratories (SDLs) close the loop between experiment design, automated execution, and data-driven decision making. This survey uses soft matter as a representative setting but focuses on the AI questions that arise in real laboratories.
- Score: 8.153488410654004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-driving laboratories (SDLs) close the loop between experiment design, automated execution, and data-driven decision making, and they provide a demanding testbed for agentic AI under expensive actions, noisy and delayed feedback, strict feasibility and safety constraints, and non-stationarity. This survey uses soft matter as a representative setting but focuses on the AI questions that arise in real laboratories. We frame SDL autonomy as an agent environment interaction problem with explicit observations, actions, costs, and constraints, and we use this formulation to connect common SDL pipelines to established AI principles. We review the main method families that enable closed loop experimentation, including Bayesian optimization and active learning for sample efficient experiment selection, planning and reinforcement learning for long horizon protocol optimization, and tool using agents that orchestrate heterogeneous instruments and software. We emphasize verifiable and provenance aware policies that support debugging, reproducibility, and safe operation. We then propose a capability driven taxonomy that organizes systems by decision horizon, uncertainty modeling, action parameterization, constraint handling, failure recovery, and human involvement. To enable meaningful comparison, we synthesize benchmark task templates and evaluation metrics that prioritize cost aware performance, robustness to drift, constraint violation behavior, and reproducibility. Finally, we distill lessons from deployed SDLs and outline open challenges in multi-modal representation, calibrated uncertainty, safe exploration, and shared benchmark infrastructure.
Related papers
- Learning Optimal and Sample-Efficient Decision Policies with Guarantees [3.096615629099617]
This thesis addresses the problem of learning from offline datasets in the presence of hidden confounders. We derive a sample-efficient algorithm for solving conditional moment restrictions problems with convergence and optimality guarantees. We also develop an algorithm that can learn effective imitator policies with convergence rate guarantees.
arXiv Detail & Related papers (2026-02-20T04:24:49Z)
- On the Paradoxical Interference between Instruction-Following and Task Solving [50.75960598434753]
Instruction following aims to align Large Language Models (LLMs) with human intent by specifying explicit constraints on how tasks should be performed. We reveal a counterintuitive phenomenon: instruction following can paradoxically interfere with LLMs' task-solving capability. We propose a metric, SUSTAINSCORE, to quantify the interference of instruction following with task solving.
arXiv Detail & Related papers (2026-01-29T17:48:56Z)
- Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems [54.916243942641444]
Large language models (LLMs) are emerging as key enablers of automation in domains such as telecommunications. We study an edge-cloud-expert cascaded LLM-based knowledge system that supports decision-making through a question-and-answer pipeline.
arXiv Detail & Related papers (2025-12-23T03:10:09Z)
- SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations. An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z)
- Approaches to Analysis and Design of AI-Based Autonomous Vehicles [12.672967565682724]
Closing the loop through AI-based feedback may pose significant risks to the reliability of autonomous driving. This paper aims to develop tools for modeling, analysis, and synthesis for a class of AI-based AV.
arXiv Detail & Related papers (2025-09-15T17:32:29Z)
- Technical Report: Facilitating the Adoption of Causal Inference Methods Through LLM-Empowered Co-Pilot [44.336297829718795]
We introduce CATE-B, an open-source co-pilot system that uses large language models (LLMs) within an agentic framework to guide users through treatment effect estimation. CATE-B assists in (i) constructing a structural causal model via causal discovery and LLM-based edge orientation, (ii) identifying robust adjustment sets through a novel Minimal Uncertainty Adjustment Set criterion, and (iii) selecting appropriate regression methods tailored to the causal structure and dataset characteristics.
arXiv Detail & Related papers (2025-08-14T12:20:51Z)
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Cognitive Edge Computing: A Comprehensive Survey on Optimizing Large Models and AI Agents for Pervasive Deployment [12.921833067052928]
This article surveys Cognitive Edge Computing as a practical and methodical pathway for deploying reasoning-capable Large Language Models (LLMs) and autonomous AI agents on resource-constrained devices at the network edge. We present a unified, cognition-preserving framework aimed at retaining multi-step reasoning under tight memory/compute budgets. We synthesize advances in efficient Transformer design, multimodal integration, hardware-aware compilation, privacy-preserving learning, and agentic tool use, and map them to edge-specific operating envelopes.
arXiv Detail & Related papers (2025-01-04T06:17:48Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.