Related papers: Efficient Mitigation of Bus Bunching through Setter-Based Curriculum Learning

Efficient Mitigation of Bus Bunching through Setter-Based Curriculum Learning

URL: http://arxiv.org/abs/2405.15824v1
Date: Thu, 23 May 2024 18:26:55 GMT
Title: Efficient Mitigation of Bus Bunching through Setter-Based Curriculum Learning
Authors: Avidan Shah, Danny Tran, Yuhan Tang,
Abstract summary: We propose a novel approach to curriculum learning that uses a Setter Model to automatically generate an action space, adversary strength, and bunching strength. Our method for automated curriculum learning involves a curriculum that is dynamically chosen and learned by an adversary network.
Score: 0.47518865271427785
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Curriculum learning has been growing in the domain of reinforcement learning as a method of improving training efficiency for various tasks. It involves modifying the difficulty (lessons) of the environment as the agent learns, in order to encourage more optimal agent behavior and higher reward states. However, most curriculum learning methods currently involve discrete transitions of the curriculum or predefined steps by the programmer or using automatic curriculum learning on only a small subset training such as only on an adversary. In this paper, we propose a novel approach to curriculum learning that uses a Setter Model to automatically generate an action space, adversary strength, initialization, and bunching strength. Transportation and traffic optimization is a well known area of study, especially for reinforcement learning based solutions. We specifically look at the bus bunching problem for the context of this study. The main idea of the problem is to minimize the delays caused by inefficient bus timings for passengers arriving and departing from a system of buses. While the heavy exploration in the area makes innovation and improvement with regards to performance marginal, it simultaneously provides an effective baseline for developing new generalized techniques. Our group is particularly interested in examining curriculum learning and its effect on training efficiency and overall performance. We decide to try a lesser known approach to curriculum learning, in which the curriculum is not fixed or discretely thresholded. Our method for automated curriculum learning involves a curriculum that is dynamically chosen and learned by an adversary network made to increase the difficulty of the agent's training, and defined by multiple forms of input. Our results are shown in the following sections of this paper.

Related papers

Accelerated Online Reinforcement Learning using Auxiliary Start State Distributions [50.44719434877687]
Expert demonstrations and simulators can reset to arbitrary states.<n>We find that using a notion of safety to inform the choice of this auxiliary distribution significantly accelerates learning.
arXiv Detail & Related papers (2025-07-07T01:54:05Z)
Self-Evolving Curriculum for LLM Reasoning [108.23021254812258]
Self-Evolving Curriculum (SEC) is an automatic curriculum learning method that learns a curriculum policy concurrently with the RL fine-tuning process.<n>Our experiments demonstrate that SEC significantly improves models' reasoning capabilities, enabling better generalization to harder, out-of-distribution test problems.
arXiv Detail & Related papers (2025-05-20T23:17:15Z)
Training a Generally Curious Agent [86.84089201249104]
Paprika is a fine-tuning approach that enables language models to develop general decision-making capabilities.<n>Paprika teaches models to explore and adapt their behavior on a new task based on environment feedback in-context without more gradient updates.<n>Results suggest a promising path towards AI systems that can autonomously solve sequential decision-making problems.
arXiv Detail & Related papers (2025-02-24T18:56:58Z)
Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning [17.092640837991883]
Reinforcement learning (RL) presents a promising framework to learn policies through environment interaction. One direction includes augmenting RL with offline data demonstrating desired tasks, but past work often require a lot of high-quality demonstration data. We show how the combination of a reverse curriculum and forward curriculum in our method, RFCL, enables significant improvements in demonstration and sample efficiency.
arXiv Detail & Related papers (2024-05-06T11:33:12Z)
Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning. Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy. Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
When Do Curricula Work in Federated Learning? [56.88941905240137]
We find that curriculum learning largely alleviates non-IIDness. The more disparate the data distributions across clients the more they benefit from learning. We propose a novel client selection technique that benefits from the real-world disparity in the clients.
arXiv Detail & Related papers (2022-12-24T11:02:35Z)
Teacher-student curriculum learning for reinforcement learning [1.7259824817932292]
Reinforcement learning (rl) is a popular paradigm for sequential decision making problems. The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying rl to real-world problems. We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task.
arXiv Detail & Related papers (2022-10-31T14:45:39Z)
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty aware classifier can solve challenging reinforcement learning problems. We propose a novel method for computing the normalized maximum likelihood (NML) distribution. We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft [18.845438529816004]
We explore curriculum learning in a complex, visual domain with many hard exploration challenges: Minecraft. We find that learning progress is a reliable measure of learnability for automatically constructing an effective curriculum.
arXiv Detail & Related papers (2021-06-28T17:50:40Z)
An Analytical Theory of Curriculum Learning in Teacher-Student Networks [10.303947049948107]
In humans and animals, curriculum learning is critical to rapid learning and effective pedagogy. In machine learning, curricula are not widely used and empirically often yield only moderate benefits.
arXiv Detail & Related papers (2021-06-15T11:48:52Z)
Curriculum Learning: A Survey [65.31516318260759]
Curriculum learning strategies have been successfully employed in all areas of machine learning. We construct a taxonomy of curriculum learning approaches by hand, considering various classification criteria. We build a hierarchical tree of curriculum learning methods using an agglomerative clustering algorithm.
arXiv Detail & Related papers (2021-01-25T20:08:32Z)
Learning by Ignoring, with Application to Domain Adaptation [10.426533624387305]
We propose a novel machine learning framework referred to as learning by ignoring (LBI) Our framework automatically identifies pretraining data examples that have large domain shift from the target distribution by learning an ignoring variable for each example and excludes them from the pretraining process. A gradient-based algorithm is developed to efficiently solve the three-level optimization problem in LBI.
arXiv Detail & Related papers (2020-12-28T15:33:41Z)
Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning. The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior. Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)
Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey [53.73359052511171]
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. We present a framework for curriculum learning (CL) in RL, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals.
arXiv Detail & Related papers (2020-03-10T20:41:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.