Related papers: Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm

Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm

URL: http://arxiv.org/abs/2511.01415v1
Date: Mon, 03 Nov 2025 10:09:55 GMT
Title: Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm
Authors: Amrapali Pednekar, Álvaro Garrido-Pérez, Yara Khaluf, Pieter Simoens,
Abstract summary: This study explores the interference in temporal processing within a dual-task paradigm from an artificial intelligence (AI) perspective.<n>Two deep reinforcement learning (DRL) agents were separately trained for each of these tasks.<n>These agents exhibited emergent behavior consistent with human timing research.
Score: 4.398170461093707
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This study explores the interference in temporal processing within a dual-task paradigm from an artificial intelligence (AI) perspective. In this context, the dual-task setup is implemented as a simplified version of the Overcooked environment with two variations, single task (T) and dual task (T+N). Both variations involve an embedded time production task, but the dual task (T+N) additionally involves a concurrent number comparison task. Two deep reinforcement learning (DRL) agents were separately trained for each of these tasks. These agents exhibited emergent behavior consistent with human timing research. Specifically, the dual task (T+N) agent exhibited significant overproduction of time relative to its single task (T) counterpart. This result was consistent across four target durations. Preliminary analysis of neural dynamics in the agents' LSTM layers did not reveal any clear evidence of a dedicated or intrinsic timer. Hence, further investigation is needed to better understand the underlying time-keeping mechanisms of the agents and to provide insights into the observed behavioral patterns. This study is a small step towards exploring parallels between emergent DRL behavior and behavior observed in biological systems in order to facilitate a better understanding of both.

Related papers

TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning [25.848638804759872]
Enhancing the temporal understanding of Multimodal Large Language Models (MLLMs) is essential for advancing long-form video analysis.<n>We present TempR1, a temporal-aware multi-task reinforcement learning framework that systematically strengthens MLLMs' temporal comprehension.
arXiv Detail & Related papers (2025-12-03T16:57:00Z)
Emergent time-keeping mechanisms in a deep reinforcement learning agent performing an interval timing task [4.398170461093707]
We investigate temporal processing in a Deep Reinforcement Learning (DRL) agent performing an interval timing task.<n>Analysis of the agent's internal states revealed neural activations, a ubiquitous pattern in biological systems.<n> Parallels are drawn between the agent's time-keeping strategy and the Striatal Beat Frequency (SBF) model.
arXiv Detail & Related papers (2025-08-06T13:56:41Z)
A Two-Stage Learning-to-Defer Approach for Multi-Task Learning [3.4289478404209826]
We introduce a novel Two-Stage L2D framework for multi-task learning that integrates classification and regression through a unified deferral mechanism.<n>Our method leverages a two-stage surrogate loss family, which we prove to be both Bayes-consistent and $(mathcalG, mathcalR)$-consistent.
arXiv Detail & Related papers (2024-10-21T07:44:57Z)
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling [54.97005925277638]
The identification of sensory cues associated with potential opportunities and dangers is frequently complicated by unrelated events that separate useful cues by long delays. It remains a challenging task for state-of-the-art spiking neural networks (SNNs) to establish long-term temporal dependency between distant cues. We propose a novel biologically inspired Two-Compartment Leaky Integrate-and-Fire spiking neuron model, dubbed TC-LIF.
arXiv Detail & Related papers (2023-08-25T08:54:41Z)
Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training. We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly networks and gradient networks trained with policy methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges [27.474011433615317]
Continual learning (CL) enables the development of models and agents that learn from a sequence of tasks. We investigate the factors that contribute to the performance differences between task-agnostic CL and multi-task (MTL) agents.
arXiv Detail & Related papers (2022-05-28T17:59:00Z)
LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL. To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy. We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z)
On the relationship between disentanglement and multi-task learning [62.997667081978825]
We take a closer look at the relationship between disentanglement and multi-task learning based on hard parameter sharing. We show that disentanglement appears naturally during the process of multi-task neural network training.
arXiv Detail & Related papers (2021-10-07T14:35:34Z)
Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in literature. First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning) Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade the single-task performance in a multi-task setup (task interference)
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph. Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference. Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.