Measuring skill-based uplift from AI in a real biological laboratory
- URL: http://arxiv.org/abs/2512.10960v1
- Date: Wed, 29 Oct 2025 16:34:57 GMT
- Title: Measuring skill-based uplift from AI in a real biological laboratory
- Authors: Ethan Obie Romero-Severson, Tara Harvey, Nick Generous, Phillip M. Mach,
- Abstract summary: We report the results of a pilot study that attempted to empirically measure the magnitude of skills-based uplift caused by access to an AI reasoning model. We discuss these results in the context of future studies of the evolving relationship between AI and global biosecurity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Understanding how AI systems are used by people in real situations that mirror aspects of both legitimate and illegitimate use is key to predicting the risks and benefits of AI systems. This is especially true in biological applications, where skill rather than knowledge is often the primary barrier for an untrained person. The challenge is that these studies are difficult to execute well and can take months to plan and run. Here we report the results of a pilot study that attempted to empirically measure the magnitude of skills-based uplift caused by access to an AI reasoning model, compared with a control group that had only internet access. Participants -- drawn from a diverse pool of Los Alamos National Laboratory employees with no prior wet-lab experience -- were asked to transform E. coli with a provided expression construct, induce expression of a reporter peptide, and have expression confirmed by mass spectrometry. We recorded quantitative outcomes (e.g., successful completion of experimental segments) and qualitative observations about how participants interacted with the AI system, the internet, laboratory equipment, and one another. We present the results of the study and lessons learned in designing and executing this type of study, and we discuss these results in the context of future studies of the evolving relationship between AI and global biosecurity.
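The uplift measurement described above, a treatment group with AI access versus a control group with internet access only, can be sketched as a difference in segment-completion rates checked with a permutation test. The group sizes and outcomes below are illustrative placeholders, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative binary outcomes only (1 = completed a segment); NOT the study's data.
ai_group = np.array([1, 1, 0, 1, 1, 0, 1, 1])       # AI reasoning model + internet
control_group = np.array([1, 0, 0, 1, 0, 0, 1, 0])  # internet only

observed_uplift = ai_group.mean() - control_group.mean()

# Permutation test: shuffle group labels to estimate how often an uplift
# at least this large would arise by chance alone.
pooled = np.concatenate([ai_group, control_group])
n_ai = len(ai_group)
perm_uplifts = np.empty(10_000)
for i in range(perm_uplifts.size):
    rng.shuffle(pooled)
    perm_uplifts[i] = pooled[:n_ai].mean() - pooled[n_ai:].mean()

p_value = (perm_uplifts >= observed_uplift).mean()
print(f"observed uplift: {observed_uplift:.3f}, one-sided p ~ {p_value:.3f}")
```

With samples this small the p-value is naturally coarse; the sketch only shows the shape of the comparison, not a power-adequate design.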
Related papers
- BABE: Biology Arena BEnchmark [51.53220868983288]
BABE is a benchmark designed to evaluate the experimental reasoning capabilities of biological AI systems. Our benchmark provides a robust framework for assessing how well AI systems can reason like practicing scientists.
arXiv Detail & Related papers (2026-02-05T16:39:20Z) - Industrialized Deception: The Collateral Effects of LLM-Generated Misinformation on Digital Ecosystems [47.03825808787752]
This paper transitions from literature review to practical countermeasures. We report on improved AI-generated content through Large Language Models (LLMs) and multimodal systems. We discuss mitigation strategies including LLM-based detection, inoculation approaches, and the dual-use nature of generative AI.
arXiv Detail & Related papers (2026-01-29T16:42:22Z) - SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations. An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z) - AI Agents in Drug Discovery [1.9777700354742123]
Agentic AI systems could integrate diverse biomedical data, execute tasks, carry out experiments via robotic platforms, and iteratively refine hypotheses in closed loops. We provide a conceptual and technical overview of agentic AI architectures, ranging from ReAct and Reflection to Supervisor and Swarm systems. We illustrate their applications across key stages of drug discovery, including literature synthesis, toxicity prediction, automated protocol generation, small-molecule synthesis, drug repurposing, and end-to-end decision-making.
arXiv Detail & Related papers (2025-10-31T03:07:14Z) - LabOS: The AI-XR Co-Scientist That Sees and Works With Humans [51.025615465050635]
LabOS represents the first AI co-scientist that unites computational reasoning with physical experimentation. By connecting multi-model AI agents, smart glasses, and human-AI collaboration, LabOS allows AI to see what scientists see, understand experimental context, and assist in real-time execution.
arXiv Detail & Related papers (2025-10-16T16:36:22Z) - Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments [9.689197691319741]
Deliberate Lab is an open-source platform for large-scale, real-time behavioral experiments. It supports both human participants and large language model (LLM)-based agents.
arXiv Detail & Related papers (2025-10-14T22:02:24Z) - Position: Intelligent Science Laboratory Requires the Integration of Cognitive and Embodied AI [98.19195693735487]
We propose the paradigm of Intelligent Science Laboratories (ISLs). ISLs are a multi-layered, closed-loop framework that deeply integrates cognitive and embodied intelligence. We argue that such systems are essential for overcoming the current limitations of scientific discovery.
arXiv Detail & Related papers (2025-06-24T13:31:44Z) - Autonomous Microscopy Experiments through Large Language Model Agents [4.241267255764773]
Large language models (LLMs) are revolutionizing self-driving laboratories (SDLs) for materials research. We introduce the Artificially Intelligent Lab Assistant (AILA), a framework automating atomic force microscopy through LLM-driven agents. We find that state-of-the-art models struggle with basic tasks and coordination scenarios.
arXiv Detail & Related papers (2024-12-18T09:35:28Z) - User-centric evaluation of explainability of AI with and for humans: a comprehensive empirical study [5.775094401949666]
This study is situated in the field of Human-Centered Artificial Intelligence (HCAI).
It focuses on the results of a user-centered assessment of commonly used eXplainable Artificial Intelligence (XAI) algorithms.
arXiv Detail & Related papers (2024-10-21T12:32:39Z) - Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data.
We develop a physical formula enhanced multi-task learning (PEMAL) method that predicts four key pharmacokinetic parameters simultaneously.
Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
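The core idea behind multi-task prediction of several pharmacokinetic parameters, a shared representation feeding one regression head per parameter, can be sketched as below. The architecture and the synthetic data are illustrative assumptions, not the paper's PEMAL method, which additionally incorporates physical formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 molecules, 16 descriptors, 4 pharmacokinetic targets
# (stand-ins for, e.g., clearance, half-life, volume of distribution).
X = rng.normal(size=(200, 16))
Y = X @ rng.normal(size=(16, 4)) + 0.1 * rng.normal(size=(200, 4))

# Shared one-hidden-layer encoder whose output feeds all four regression
# heads at once, trained jointly on the summed per-task squared error.
W1 = 0.1 * rng.normal(size=(16, 8))   # shared encoder weights
W2 = 0.1 * rng.normal(size=(8, 4))    # one output column per task
lr = 0.05
for _ in range(2000):
    H = np.maximum(X @ W1, 0.0)       # shared ReLU representation
    err = (H @ W2 - Y) / len(X)       # residuals for all four tasks
    W2 -= lr * (H.T @ err)            # head gradients
    W1 -= lr * (X.T @ (err @ W2.T * (H > 0)))  # backprop through ReLU

mse_per_task = ((np.maximum(X @ W1, 0.0) @ W2 - Y) ** 2).mean(axis=0)
print("per-task MSE:", np.round(mse_per_task, 3))
```

Because all four heads share the encoder, gradient updates from one task shape the representation used by the others, which is the mechanism by which multi-task training can lower the data demand per task.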
arXiv Detail & Related papers (2024-04-16T07:42:55Z) - An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty [4.99372598361924]
Deep learning (DL) has become a driving force and has been widely adopted in many domains and applications.
This paper initiates an early exploratory study of AI system risk assessment from both the data distribution and uncertainty angles.
arXiv Detail & Related papers (2022-12-13T03:34:25Z) - The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z) - Human-Robot Collaboration and Machine Learning: A Systematic Review of Recent Research [69.48907856390834]
Human-robot collaboration (HRC) is the field that studies interaction between humans and robots.
This paper proposes a thorough literature review of the use of machine learning techniques in the context of HRC.
arXiv Detail & Related papers (2021-10-14T15:14:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.