A Hierarchical Hybrid AI Approach: Integrating Deep Reinforcement Learning and Scripted Agents in Combat Simulations
- URL: http://arxiv.org/abs/2512.00249v1
- Date: Fri, 28 Nov 2025 23:50:29 GMT
- Title: A Hierarchical Hybrid AI Approach: Integrating Deep Reinforcement Learning and Scripted Agents in Combat Simulations
- Authors: Scotty Black, Christian Darken,
- Abstract summary: This paper introduces a novel hierarchical hybrid artificial intelligence (AI) approach that synergizes the reliability and predictability of scripted agents with the dynamic, adaptive learning capabilities of RL.<n>By structuring the AI system hierarchically, the proposed approach aims to utilize scripted agents for routine, tactical-level decisions and RL agents for higher-level, strategic decision-making.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the domain of combat simulations in support of wargaming, the development of intelligent agents has predominantly been characterized by rule-based, scripted methodologies with deep reinforcement learning (RL) approaches only recently being introduced. While scripted agents offer predictability and consistency in controlled environments, they fall short in dynamic, complex scenarios due to their inherent inflexibility. Conversely, RL agents excel in adaptability and learning, offering potential improvements in handling unforeseen situations, but suffer from significant challenges such as black-box decision-making processes and scalability issues in larger simulation environments. This paper introduces a novel hierarchical hybrid artificial intelligence (AI) approach that synergizes the reliability and predictability of scripted agents with the dynamic, adaptive learning capabilities of RL. By structuring the AI system hierarchically, the proposed approach aims to utilize scripted agents for routine, tactical-level decisions and RL agents for higher-level, strategic decision-making, thus addressing the limitations of each method while leveraging their individual strengths. This integration is shown to significantly improve overall performance, providing a robust, adaptable, and effective solution for developing and training intelligent agents in complex simulation environments.
Related papers
- Strategic Coordination for Evolving Multi-agent Systems: A Hierarchical Reinforcement and Collective Learning Approach [0.0]
Reinforcement learning offers a way to model sequential decision-making.<n>Agents take high-level strategies using MARL to group possible plans for action space reduction.<n>Low-level collective learning layer ensures efficient and decentralized coordinated decisions.
arXiv Detail & Related papers (2025-09-22T17:58:45Z) - AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning [129.44038804430542]
We introduce AgentGym-RL, a new framework to train LLM agents for multi-turn interactive decision-making through RL.<n>We propose ScalingInter-RL, a training approach designed for exploration-exploitation balance and stable RL optimization.<n>Our agents match or surpass commercial models on 27 tasks across diverse environments.
arXiv Detail & Related papers (2025-09-10T16:46:11Z) - The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [103.32591749156416]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL)<n>This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z) - Agentic Reinforced Policy Optimization [66.96989268893932]
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks.<n>Current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions.<n>We propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents.
arXiv Detail & Related papers (2025-07-26T07:53:11Z) - GenAI-based Multi-Agent Reinforcement Learning towards Distributed Agent Intelligence: A Generative-RL Agent Perspective [35.589506360952925]
We argue for a transformative paradigm shift from reactive to proactive multi-agent intelligence through generative AI-based reinforcement learning.<n>Rather than responding to immediate observations, generative-RL agents can model environment evolution, predict other agents' behaviors, generate coordinated action sequences, and engage in strategic reasoning accounting for long-term dynamics.
arXiv Detail & Related papers (2025-07-13T05:02:43Z) - UserCentrix: An Agentic Memory-augmented AI Framework for Smart Spaces [8.111700384985356]
Agentic AI, with its autonomous and proactive decision-making, has transformed smart environments.<n>This paper introduces UserCentrix, an agentic memory-augmented AI framework designed to enhance smart spaces through dynamic, context-aware decision-making.
arXiv Detail & Related papers (2025-05-01T11:54:49Z) - PolicyEvol-Agent: Evolving Policy via Environment Perception and Self-Awareness with Theory of Mind [9.587070290189507]
PolicyEvol-Agent is a comprehensive framework characterized by systematically acquiring intentions of others.<n>PolicyEvol-Agent integrates a range of cognitive operations with Theory of Mind alongside internal and external perspectives.
arXiv Detail & Related papers (2025-04-20T06:43:23Z) - Reinforcement Learning Environment with LLM-Controlled Adversary in D&D 5th Edition Combat [0.0]
This research employs Deep Q-Networks (DQN) for the smaller agents, creating a testbed for strategic AI development.<n>We successfully integrated sophisticated language models into the RL framework, enhancing strategic decision-making processes.
arXiv Detail & Related papers (2025-03-19T22:48:20Z) - From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.<n>We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - HAZARD Challenge: Embodied Decision Making in Dynamically Changing
Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z) - ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in
Multi-Agent Simulations [110.72725220033983]
Epsilon-Robust Multi-Agent Simulation (ERMAS) is a framework for learning AI policies that are robust to such multiagent sim-to-real gaps.
ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complextemporal simulations.
In particular, ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complextemporal simulations.
arXiv Detail & Related papers (2021-06-10T04:32:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.