Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach
- URL: http://arxiv.org/abs/2508.14135v1
- Date: Tue, 19 Aug 2025 12:43:32 GMT
- Title: Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach
- Authors: Collins O. Ogbodo, Timothy J. Rogers, Mattia Dal Borgo, David J. Wagg,
- Abstract summary: This study introduces an agent-based decision support framework for adaptive sensor placement across dynamically changing modal test environments. The framework formulates the problem as an underspecified partially observable Markov decision process, enabling the training of a generalist reinforcement learning agent. A detailed case study on a steel cantilever structure demonstrates the efficacy of the proposed method in optimising sensor locations across frequency segments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modal testing plays a critical role in structural analysis by providing essential insights into dynamic behaviour across a wide range of engineering industries. In practice, designing an effective modal test campaign involves complex experimental planning, comprising a series of interdependent decisions that significantly influence the final test outcome. Traditional approaches to test design are typically static, focusing only on global tests without accounting for evolving test campaign parameters or the impact of such changes on previously established decisions, such as sensor configurations, which have been found to significantly influence test outcomes. These rigid methodologies often compromise test accuracy and adaptability. To address these limitations, this study introduces an agent-based decision support framework for adaptive sensor placement across dynamically changing modal test environments. The framework formulates the problem using an underspecified partially observable Markov decision process, enabling the training of a generalist reinforcement learning agent through a dual-curriculum learning strategy. A detailed case study on a steel cantilever structure demonstrates the efficacy of the proposed method in optimising sensor locations across frequency segments, validating its robustness and real-world applicability in experimental settings.
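The abstract's formulation can be sketched as a toy episodic environment: the agent places sensors one at a time on a discretised cantilever, and the terminal reward scores how well the chosen locations distinguish the structure's mode shapes (an Effective Independence-style log-determinant of the modal Fisher information). Randomising the mode-shape parameters per episode stands in for the "underspecified" part of the POMDP that motivates unsupervised environment design. This is a minimal illustrative sketch, not the paper's implementation; the class name, the sinusoidal mode-shape stand-in, and all parameters are assumptions.

```python
import math
import random


class SensorPlacementEnv:
    """Toy sensor-placement environment for a discretised cantilever.

    The agent selects sensor locations one per step; at episode end the
    reward is the log-determinant of the modal Fisher information matrix
    built from mode-shape amplitudes at the selected locations. Mode-shape
    scaling is resampled every episode to mimic an underspecified POMDP.
    """

    def __init__(self, n_locations=20, n_modes=3, n_sensors=4, seed=None):
        self.rng = random.Random(seed)
        self.n_locations = n_locations
        self.n_modes = n_modes
        self.n_sensors = n_sensors
        self.reset()

    def _mode_shape(self, mode, x):
        # Crude sinusoidal stand-in for cantilever bending mode shapes.
        return math.sin((mode + 0.5) * math.pi * x) * self.scale[mode]

    def reset(self):
        # Per-episode randomised parameters: the "underspecified" part.
        self.scale = [self.rng.uniform(0.5, 1.5) for _ in range(self.n_modes)]
        self.selected = []
        return tuple(self.selected)

    def step(self, action):
        assert 0 <= action < self.n_locations and action not in self.selected
        self.selected.append(action)
        done = len(self.selected) == self.n_sensors
        reward = self._log_det_fim() if done else 0.0
        return tuple(self.selected), reward, done

    def _log_det_fim(self):
        # Phi: n_sensors x n_modes matrix of mode-shape amplitudes.
        phi = [[self._mode_shape(m, (s + 1) / self.n_locations)
                for m in range(self.n_modes)] for s in self.selected]
        # Fisher information Q = Phi^T Phi (n_modes x n_modes Gram matrix).
        q = [[sum(phi[k][i] * phi[k][j] for k in range(len(phi)))
              for j in range(self.n_modes)] for i in range(self.n_modes)]
        return math.log(max(self._det(q), 1e-12))

    def _det(self, m):
        # Laplace expansion; fine for the tiny matrices used here.
        n = len(m)
        if n == 1:
            return m[0][0]
        return sum((-1) ** j * m[0][j]
                   * self._det([row[:j] + row[j + 1:] for row in m[1:]])
                   for j in range(n))
```

A generalist agent would be trained across many such randomised episodes, with a curriculum over the environment parameters; here a random policy suffices to exercise the interface.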
Related papers
- PEOAT: Personalization-Guided Evolutionary Question Assembly for One-Shot Adaptive Testing [26.605029691211538]
One-shot adaptive testing (OAT) aims to select a fixed set of optimal items for each test-taker in a one-time selection. We propose a cognitive-enhanced evolutionary framework incorporating schema-preserving crossover and cognitively guided mutation to enable efficient exploration. The effectiveness of PEOAT is validated through extensive experiments on two datasets, complemented by case studies that uncovered valuable insights.
arXiv Detail & Related papers (2025-11-29T10:38:25Z) - Grounded Test-Time Adaptation for LLM Agents [75.62784644919803]
Large language model (LLM)-based agents struggle to generalize to novel and complex environments. We propose two strategies for adapting LLM agents by leveraging environment-specific information available during deployment.
arXiv Detail & Related papers (2025-11-06T22:24:35Z) - Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
Active Feature Acquisition is an instance-wise, sequential decision making problem. The aim is to dynamically select which feature to measure based on current observations, independently for each test instance. Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which is myopic. We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
arXiv Detail & Related papers (2025-08-03T23:48:46Z) - Feature-Based vs. GAN-Based Learning from Demonstrations: When and Why [50.191655141020505]
This survey provides a comparative analysis of feature-based and GAN-based approaches to learning from demonstrations. We argue that the dichotomy between feature-based and GAN-based methods is increasingly nuanced.
arXiv Detail & Related papers (2025-07-08T11:45:51Z) - Regression Testing Optimization for ROS-based Autonomous Systems: A Comprehensive Review of Techniques [6.978850097048969]
We present the first comprehensive survey systematically reviewing regression testing optimization techniques tailored for ROSAS. We analyze and categorize 122 representative studies into regression test case prioritization, minimization, and selection methods. We highlight major challenges specific to regression testing for ROSAS, including effectively prioritizing tests in response to frequent system modifications, efficiently minimizing redundant tests, and difficulty in accurately selecting impacted test cases.
arXiv Detail & Related papers (2025-06-19T07:43:36Z) - TestAgent: An Adaptive and Intelligent Expert for Human Assessment [62.060118490577366]
We propose TestAgent, a large language model (LLM)-powered agent designed to enhance adaptive testing through interactive engagement. TestAgent supports personalized question selection, captures test-takers' responses and anomalies, and provides precise outcomes through dynamic, conversational interactions.
arXiv Detail & Related papers (2025-06-03T16:07:54Z) - Beyond Black-Box Benchmarking: Observability, Analytics, and Optimization of Agentic Systems [1.415098516077151]
The rise of agentic AI systems, where agents collaborate to perform diverse tasks, poses new challenges in observing, analyzing, and optimizing their behavior. Traditional evaluation and benchmarking approaches struggle to handle the non-deterministic, context-sensitive, and dynamic nature of these systems. This paper explores key challenges and opportunities in analyzing and optimizing agentic systems across development, testing, and maintenance.
arXiv Detail & Related papers (2025-03-09T20:02:04Z) - Can We Validate Counterfactual Estimations in the Presence of General Network Interference? [13.49152464081862]
We introduce a framework that facilitates the use of machine learning tools for both estimation and validation in causal inference. A new distribution-preserving network bootstrap generates statistically valid subpopulations from a single experiment's data. A counterfactual cross-validation procedure adapts the principles of model validation to the unique constraints of causal settings.
arXiv Detail & Related papers (2025-02-03T06:51:04Z) - AExGym: Benchmarks and Environments for Adaptive Experimentation [7.948144726705323]
We present a benchmark for adaptive experimentation based on real-world datasets.
We highlight prominent practical challenges to operationalizing adaptivity: non-stationarity, batched/delayed feedback, multiple outcomes and objectives, and external validity.
arXiv Detail & Related papers (2024-08-08T15:32:12Z) - Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z) - Position: AI Evaluation Should Learn from How We Test Humans [65.36614996495983]
We argue that psychometrics, a theory originating in the 20th century for human assessment, could be a powerful solution to the challenges in today's AI evaluations.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - Dual Adaptive Representation Alignment for Cross-domain Few-shot
Learning [58.837146720228226]
Few-shot learning aims to recognize novel queries with limited support samples by learning from base knowledge.
Recent progress in this setting assumes that the base knowledge and novel query samples are distributed in the same domains.
We propose to address the cross-domain few-shot learning problem where only extremely few samples are available in target domains.
arXiv Detail & Related papers (2023-06-18T09:52:16Z) - Adaptive Experimental Design and Counterfactual Inference [20.666734673282495]
This paper shares lessons learned regarding the challenges and pitfalls of naively using adaptive experimentation systems in industrial settings.
We developed an adaptive experimental design framework for counterfactual inference based on these experiences.
arXiv Detail & Related papers (2022-10-25T22:29:16Z) - Test and Evaluation Framework for Multi-Agent Systems of Autonomous Intelligent Agents [0.0]
We consider the challenges of developing a unifying test and evaluation framework for complex ensembles of cyber-physical systems with embedded artificial intelligence.
We propose a framework that incorporates test and evaluation throughout not only the development life cycle, but continues into operation as the system learns and adapts.
arXiv Detail & Related papers (2021-01-25T21:42:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.