Generating Piano Practice Policy with a Gaussian Process
- URL: http://arxiv.org/abs/2406.04812v1
- Date: Fri, 7 Jun 2024 10:27:07 GMT
- Title: Generating Piano Practice Policy with a Gaussian Process
- Authors: Alexandra Moringen, Elad Vromen, Helge Ritter, Jason Friedman
- Abstract summary: We present a modeling framework to guide the human learner through the learning process by choosing the practice modes generated by a policy model.
The proposed policy model is trained to approximate the expert-learner interaction during a practice session.
- Score: 42.41481706562645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A typical process of learning to play a piece on a piano consists of a progression through a series of practice units that focus on individual dimensions of the skill, the so-called practice modes. Practice modes in learning to play music comprise a particularly large set of possibilities, such as hand coordination, posture, articulation, ability to read a music score, correct timing or pitch, etc. Self-guided practice is known to be suboptimal, and a model that schedules optimal practice to maximize a learner's progress still does not exist. Because we each learn differently and there are many choices for possible piano practice tasks and methods, the set of practice modes should be dynamically adapted to the human learner, a process typically guided by a teacher. However, having a human teacher guide individual practice is not always feasible since it is time-consuming, expensive, and often unavailable. In this work, we present a modeling framework to guide the human learner through the learning process by choosing the practice modes generated by a policy model. To this end, we present a computational architecture building on a Gaussian process that incorporates 1) the learner state, 2) a policy that selects a suitable practice mode, 3) performance evaluation, and 4) expert knowledge. The proposed policy model is trained to approximate the expert-learner interaction during a practice session. In our future work, we will test different Bayesian optimization techniques, e.g., different acquisition functions, and evaluate their effect on the learning progress.
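As a rough illustration of the architecture described in the abstract, the sketch below models the expected performance gain of each practice mode with a Gaussian process over (learner state, practice mode) pairs and selects the next mode with an upper-confidence-bound (UCB) acquisition rule. This is a minimal sketch, not the paper's implementation: the learner-state encoding, the practice-mode list, the kernel, and the UCB rule are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's code):
# a Gaussian process predicts the performance gain of each practice mode
# given the current learner state; a UCB acquisition rule picks the next mode.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

PRACTICE_MODES = ["timing", "pitch", "hand_coordination", "articulation"]  # hypothetical set

def encode(state, mode_idx):
    """Concatenate a numeric learner-state vector with a one-hot practice mode."""
    one_hot = np.zeros(len(PRACTICE_MODES))
    one_hot[mode_idx] = 1.0
    return np.concatenate([state, one_hot])

def select_practice_mode(gp, state, beta=1.0):
    """UCB acquisition: prefer modes with high predicted gain or high uncertainty."""
    X = np.stack([encode(state, i) for i in range(len(PRACTICE_MODES))])
    mean, std = gp.predict(X, return_std=True)
    return PRACTICE_MODES[int(np.argmax(mean + beta * std))]

# Toy practice history: (learner state, chosen mode index, observed performance gain)
history = [
    (np.array([0.2, 0.5]), 0, 0.10),
    (np.array([0.3, 0.5]), 1, 0.25),
    (np.array([0.4, 0.6]), 2, 0.05),
]
X_train = np.stack([encode(s, m) for s, m, _ in history])
y_train = np.array([gain for _, _, gain in history])

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_train, y_train)

print(select_practice_mode(gp, state=np.array([0.35, 0.55]), beta=1.5))
```

Swapping the UCB rule for other acquisition functions (e.g., expected improvement) would correspond to the Bayesian optimization variants the authors state they plan to compare in future work.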
Related papers
- A Practitioner's Guide to Continual Multimodal Pretraining [83.63894495064855]
Multimodal foundation models serve numerous applications at the intersection of vision and language.
To keep models updated, research into continual pretraining mainly explores scenarios with either infrequent, indiscriminate updates on large-scale new data, or frequent, sample-level updates.
We introduce FoMo-in-Flux, a continual multimodal pretraining benchmark with realistic compute constraints and practical deployment requirements.
arXiv Detail & Related papers (2024-08-26T17:59:01Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement [20.91491585498749]
We propose a novel active imitation learning framework based on a teacher-student interaction model.
We show that AdapMen can improve the error bound and avoid compounding error under mild conditions.
arXiv Detail & Related papers (2023-03-03T16:44:33Z)
- Large Language Models can Implement Policy Iteration [18.424558160071808]
In-Context Policy Iteration is an algorithm for performing Reinforcement Learning (RL), in-context, using foundation models.
ICPI learns to perform RL tasks without expert demonstrations or gradients.
ICPI iteratively updates the contents of the prompt from which it derives its policy through trial-and-error interaction with an RL environment.
arXiv Detail & Related papers (2022-10-07T21:18:22Z)
- Continual Predictive Learning from Videos [100.27176974654559]
We study a new continual learning problem in the context of video prediction.
We propose the continual predictive learning (CPL) approach, which learns a mixture world model via predictive experience replay.
We construct two new benchmarks based on RoboNet and KTH, in which different tasks correspond to different physical robotic environments or human actions.
arXiv Detail & Related papers (2022-04-12T08:32:26Z)
- Optimizing piano practice with a utility-based scaffold [59.821144959060305]
A typical part of learning to play the piano is the progression through a series of practice units that focus on individual dimensions of the skill.
Because we each learn differently, and because there are many choices for possible piano practice tasks and methods, the set of practice tasks should be dynamically adapted to the human learner.
We present a modeling framework to guide the human learner through the learning process by choosing practice modes that have the highest expected utility.
arXiv Detail & Related papers (2021-06-21T14:05:00Z)
- Interleaving Learning, with Application to Neural Architecture Search [12.317568257671427]
We propose a novel machine learning framework referred to as interleaving learning (IL).
In our framework, a set of models collaboratively learn a data encoder in an interleaving fashion.
We apply interleaving learning to search neural architectures for image classification on CIFAR-10, CIFAR-100, and ImageNet.
arXiv Detail & Related papers (2021-03-12T00:54:22Z)
- Reward-Conditioned Policies [100.64167842905069]
Imitation learning requires near-optimal expert data.
Can we learn effective policies via supervised learning without demonstrations?
We show how such an approach can be derived as a principled method for policy search.
arXiv Detail & Related papers (2019-12-31T18:07:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.