Agent-State Construction with Auxiliary Inputs
- URL: http://arxiv.org/abs/2211.07805v3
- Date: Fri, 5 May 2023 22:03:24 GMT
- Title: Agent-State Construction with Auxiliary Inputs
- Authors: Ruo Yu Tao, Adam White, Marlos C. Machado
- Abstract summary: We present a series of examples illustrating the different ways of using auxiliary inputs for reinforcement learning.
We show that these auxiliary inputs can be used to discriminate between observations that would otherwise be aliased.
This approach is complementary to state-of-the-art methods such as recurrent neural networks and truncated back-propagation.
- Score: 16.79847469127811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many, if not every, realistic sequential decision-making task, the
decision-making agent is not able to model the full complexity of the world.
The environment is often much larger and more complex than the agent, a setting
also known as partial observability. In such settings, the agent must leverage
more than just the current sensory inputs; it must construct an agent state
that summarizes previous interactions with the world. Currently, a popular
approach for tackling this problem is to learn the agent-state function via a
recurrent network from the agent's sensory stream as input. Many impressive
reinforcement learning applications have instead relied on environment-specific
functions to augment the agent's inputs for history summarization. These
augmentations are done in multiple ways, from simple approaches like
concatenating observations to more complex ones such as uncertainty estimates.
Although ubiquitous in the field, these additional inputs, which we term
auxiliary inputs, are rarely emphasized, and it is not clear what their role or
impact is. In this work we explore this idea further, and relate these
auxiliary inputs to prior classic approaches to state construction. We present
a series of examples illustrating the different ways of using auxiliary inputs
for reinforcement learning. We show that these auxiliary inputs can be used to
discriminate between observations that would otherwise be aliased, leading to
more expressive features that smoothly interpolate between different states.
Finally, we show that this approach is complementary to state-of-the-art
methods such as recurrent neural networks and truncated back-propagation
through time, and acts as a heuristic that facilitates longer temporal credit
assignment, leading to better performance.
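The abstract's core idea — concatenating an auxiliary input, such as a decaying trace of past observations, onto the current observation so that otherwise-aliased observations become distinguishable — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the class name, the decay parameter `lam`, and the trace-plus-concatenation design are assumptions chosen for clarity.

```python
import numpy as np

class TraceAuxiliaryInput:
    """Hypothetical sketch of an auxiliary input: an exponential
    decaying trace of past observations, concatenated with the
    current observation to form the agent state."""

    def __init__(self, obs_dim: int, lam: float = 0.9):
        self.lam = lam                  # decay rate (illustrative choice)
        self.trace = np.zeros(obs_dim)  # summary of past observations

    def update(self, obs: np.ndarray) -> np.ndarray:
        # Decay the old trace and mix in the new observation.
        self.trace = self.lam * self.trace + (1 - self.lam) * obs
        # The agent state is the current observation plus the trace:
        # two observations that look identical now can differ in their
        # recent history, so the trace breaks the aliasing.
        return np.concatenate([obs, self.trace])
```

Because the trace is computed outside the learned network, it acts as the kind of environment-specific history summary the abstract describes — complementary to, rather than a replacement for, a recurrent network over the sensory stream.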
Related papers
- Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization [56.674356045200696]
We propose a novel method to train AI agents to incorporate knowledge and skills for multiple tasks without the need for cumbersome note systems or prior high-quality demonstration data.
Our approach employs an iterative process where the agent collects new experiences, receives corrective feedback from humans in the form of hints, and integrates this feedback into its weights.
We demonstrate the efficacy of our approach by implementing it in a Llama-3-based agent which, after only a few rounds of feedback, outperforms advanced models GPT-4o and DeepSeek-V3 on a task set.
arXiv Detail & Related papers (2025-02-03T17:45:46Z) - Perspectives for Direct Interpretability in Multi-Agent Deep Reinforcement Learning [0.41783829807634765]
Multi-Agent Deep Reinforcement Learning (MADRL) has proven effective at solving complex problems in robotics and games.
This paper advocates for direct interpretability, generating post hoc explanations directly from trained models.
We explore modern methods, including relevance backpropagation, knowledge editing, model steering, activation patching, sparse autoencoders, and circuit discovery.
arXiv Detail & Related papers (2025-02-02T09:15:27Z) - Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [62.48505112245388]
We take an in-depth look at the causal awareness of modern representations of agent interactions.
We show that recent representations are already partially resilient to perturbations of non-causal agents.
We propose a metric learning approach that regularizes latent representations with causal annotations.
arXiv Detail & Related papers (2023-12-07T18:57:03Z) - What's in a Prior? Learned Proximal Networks for Inverse Problems [9.934876060237345]
Proximal operators are ubiquitous in inverse problems, commonly appearing as part of strategies to regularize problems that are otherwise ill-posed.
Modern deep learning models have been brought to bear for these tasks too, as in the framework of plug-and-play or deep unrolling.
arXiv Detail & Related papers (2023-10-22T16:31:01Z) - Rotating Features for Object Discovery [74.1465486264609]
We present Rotating Features, a generalization of complex-valued features to higher dimensions, and a new evaluation procedure for extracting objects from distributed representations.
Together, these advancements enable us to scale distributed object-centric representations from simple toy to real-world data.
arXiv Detail & Related papers (2023-06-01T12:16:26Z) - Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction [52.63663547523033]
Late interaction, the simplest form of multi-vector retrieval, is also helpful to neural rerankers that only use the [CLS] vector to compute the similarity score.
We show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures.
arXiv Detail & Related papers (2023-02-13T18:42:17Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - ASCII: ASsisted Classification with Ignorance Interchange [17.413989127493622]
We propose a method named ASCII for an agent to improve its classification performance through assistance from other agents.
The main idea is to iteratively interchange an ignorance value between 0 and 1 for each collated sample among agents.
The method is naturally suitable for privacy-aware, transmission-economical, and decentralized learning scenarios.
arXiv Detail & Related papers (2020-10-21T03:57:36Z) - Self-Attention Attribution: Interpreting Information Interactions Inside Transformer [89.21584915290319]
We propose a self-attention attribution method to interpret the information interactions inside Transformer.
We show that the attribution results can be used as adversarial patterns to implement non-targeted attacks towards BERT.
arXiv Detail & Related papers (2020-04-23T14:58:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.