At Your Fingertips: Extracting Piano Fingering Instructions from Videos
- URL: http://arxiv.org/abs/2303.03745v1
- Date: Tue, 7 Mar 2023 09:09:13 GMT
- Title: At Your Fingertips: Extracting Piano Fingering Instructions from Videos
- Authors: Amit Moryossef, Yanai Elazar, and Yoav Goldberg
- Abstract summary: We consider the AI task of automating the extraction of fingering information from videos.
We show how to perform this task with high accuracy using a combination of deep-learning modules.
We run the resulting system on 90 videos, resulting in high-quality piano fingering information for 150K notes.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Piano fingering -- knowing which finger to use to play each note in a musical
piece -- is a hard and important skill to master when learning to play the piano.
While some sheet music is available with expert-annotated fingering
information, most pieces lack this information, and people often resort to
learning the fingering from demonstrations in online videos. We consider the AI
task of automating the extraction of fingering information from videos. This is
a non-trivial task as fingers are often occluded by other fingers, and it is
often not clear from the video which of the keys were pressed, requiring the
synchronization of hand position information and knowledge about the notes that
were played. We show how to perform this task with high accuracy using a
combination of deep-learning modules, including a GAN-based approach for
fine-tuning on out-of-domain data. We extract the fingering information with an
F1 score of 97%. We run the resulting system on 90 videos, resulting in
high-quality piano fingering information for 150K notes, the largest available
dataset of piano fingering to date.
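The paper's pipeline is not reproduced in this listing, but its central alignment step -- synchronizing hand-position estimates with the notes that were played -- is easy to sketch. Below is a minimal, assumption-laden illustration: per-frame fingertip x-coordinates are taken to come from some pose estimator, and note onsets with key positions from aligned MIDI; the names `assign_fingers`, `FINGERS`, and the data layout are hypothetical, not the paper's interface.

```python
# Minimal sketch of note-to-finger assignment: at each note onset, pick
# the fingertip closest (in keyboard x-coordinate) to the pressed key.
# All shapes and names are illustrative assumptions, not the paper's API.
import numpy as np

FINGERS = ["L5", "L4", "L3", "L2", "L1", "R1", "R2", "R3", "R4", "R5"]

def assign_fingers(fingertips, onsets, fps=25.0):
    """fingertips: (num_frames, 10) array of fingertip x-positions,
    e.g. from a hand-pose estimator; onsets: iterable of
    (time_sec, key_x) note events, e.g. from aligned MIDI."""
    labels = []
    for time_sec, key_x in onsets:
        frame = min(int(round(time_sec * fps)), len(fingertips) - 1)
        labels.append(FINGERS[int(np.argmin(np.abs(fingertips[frame] - key_x)))])
    return labels
```

A nearest-fingertip heuristic like this is exactly where occlusion breaks down, which is presumably why the paper combines several deep-learning modules (including GAN-based fine-tuning) rather than relying on geometry alone.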
Related papers
- FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance
Hand motion models with the sophistication to accurately recreate piano playing have a wide range of applications in character animation, embodied AI, biomechanics, and VR/AR.
In this paper, we construct a first-of-its-kind large-scale dataset that contains approximately 10 hours of 3D hand motion and audio from 15 elite-level pianists playing 153 pieces of classical music.
arXiv Detail & Related papers (2024-10-08T08:21:05Z)
- RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands
We introduce the Robot Piano 1 Million (RP1M) dataset, containing more than one million trajectories of bi-manual robot piano-playing motion.
We formulate finger placements as an optimal transport problem, enabling automatic annotation of vast amounts of unlabeled songs (sketched below).
Benchmarking shows that existing imitation learning approaches reach state-of-the-art robot piano-playing performance by leveraging RP1M.
arXiv Detail & Related papers (2024-08-20T17:56:52Z)
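As a toy illustration of that optimal-transport formulation: in the discrete, balanced case it reduces to linear assignment -- move each finger to a key of the next chord at minimal total travel cost. The sketch below is a one-dimensional stand-in using SciPy's assignment solver, not RP1M's actual annotation procedure.

```python
# Finger placement as a toy assignment (discrete optimal transport)
# problem: match fingers to the next chord's keys so that total hand
# travel is minimized. Positions are made-up 1-D coordinates.
import numpy as np
from scipy.optimize import linear_sum_assignment

finger_pos = np.array([0.0, 2.0, 4.0, 6.0, 8.0])  # current x of five fingers
key_pos = np.array([3.0, 5.0, 9.0])               # keys of the next chord

cost = np.abs(finger_pos[:, None] - key_pos[None, :])  # travel distances
fingers, keys = linear_sum_assignment(cost)            # optimal matching
for f, k in zip(fingers, keys):
    print(f"finger {f} -> key at x={key_pos[k]}")
```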
- PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
We construct a piano-hand motion generation benchmark to guide hand movements and fingerings for piano playing.
To this end, we collect an annotated dataset, PianoMotion10M, consisting of 116 hours of piano playing videos from a bird's-eye view with 10 million annotated hand poses.
arXiv Detail & Related papers (2024-06-13T17:05:23Z)
- Modeling Bends in Popular Music Guitar Tablatures
Tablature notation is widely used in popular music to transcribe and share guitar musical content.
This paper focuses on bends, which progressively shift the pitch of a note and thereby circumvent the physical limitations of the discrete fretted fingerboard.
Experiments are performed on a corpus of 932 lead guitar tablatures of popular music and show that a decision tree predicts bend occurrences with an F1 score of 0.71 and a limited number of false positives (a minimal sketch follows below).
arXiv Detail & Related papers (2023-08-22T07:50:58Z)
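For intuition, bend prediction as described above is a binary classification task over note-level features, so a minimal version fits in a few lines. The features and data below are random placeholders (the paper's real features come from the 932 lead-guitar tablatures); only the model class and metric match the description.

```python
# Minimal sketch of bend prediction: a decision tree classifying each
# tablature note as bent / not bent. Features and labels are random
# placeholders, so the printed F1 is meaningless except as a demo.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# hypothetical per-note features: string, fret, duration, interval to next note
X = rng.random((1000, 4))
y = rng.integers(0, 2, size=1000)  # 1 = note is bent

clf = DecisionTreeClassifier(max_depth=5, class_weight="balanced")
clf.fit(X[:800], y[:800])
print("F1:", f1_score(y[800:], clf.predict(X[800:])))
```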
- Video-Mined Task Graphs for Keystep Recognition in Instructional Videos
Procedural activity understanding requires perceiving human actions in terms of a broader task.
We propose automatically discovering a task graph from how-to videos that represents, probabilistically, how people tend to execute keysteps (a toy version is sketched below).
We show the impact: more reliable zero-shot keystep localization and improved video representation learning.
arXiv Detail & Related papers (2023-07-17T18:19:36Z)
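The core of the task-graph idea can be made concrete in a few lines: count keystep transitions across many videos and normalize them into probabilities. The keystep sequences below are invented for illustration; the paper mines them automatically from how-to videos.

```python
# Toy probabilistic task graph: transition probabilities between
# consecutive keysteps, estimated from (invented) mined sequences.
from collections import Counter, defaultdict

videos = [
    ["crack egg", "whisk", "heat pan", "pour", "flip"],
    ["crack egg", "heat pan", "whisk", "pour", "flip"],
    ["crack egg", "whisk", "pour", "flip"],
]

counts = defaultdict(Counter)
for seq in videos:
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1

# normalize: graph[a][b] = P(next keystep is b | current keystep is a)
graph = {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
         for a, nxt in counts.items()}
print(graph["whisk"])  # e.g. {'heat pan': 0.33..., 'pour': 0.66...}
```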
- Visual motion analysis of the player's finger
This work extracts the motion of a keyboard player's fingers, in their three articulations, from a video sequence.
The problem is relevant in several respects; for instance, the extracted finger movements can be used to compute keystroke efficiency and individual joint contributions (see the sketch below).
arXiv Detail & Related papers (2023-02-24T10:14:13Z)
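Given tracked keypoints, the per-articulation motion this work extracts boils down to joint angles over time. A minimal geometric helper, assuming three 2-D keypoints per joint per frame (the layout is an assumption, not taken from the paper):

```python
# Angle at a finger joint from three tracked keypoints; tracking it
# across frames yields a per-joint motion curve from which quantities
# like individual joint contributions can be derived.
import numpy as np

def joint_angle(a, b, c):
    """Angle (radians) at keypoint b formed by segments b->a and b->c."""
    u, v = a - b, c - b
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# right-angle example: prints ~1.5708 rad
print(joint_angle(np.array([0.0, 1.0]), np.array([0.0, 0.0]), np.array([1.0, 0.0])))
```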
- Towards Learning to Play Piano with Dexterous Hands and Touch
We demonstrate how an agent can learn directly from a machine-readable music score to play the piano with dexterous hands on a simulated piano.
We achieve this by using a touch-augmented reward and a novel curriculum of tasks (a toy reward is sketched below).
arXiv Detail & Related papers (2021-06-03T17:59:31Z)
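The abstract does not spell out the reward, so the following is only a plausible shape for a touch-augmented reward: score the pressed keys against the score's target keys and add a small shaping bonus for fingertip contact. All weights and signals are assumptions.

```python
# Hypothetical touch-augmented reward: correct presses earn reward,
# wrong presses are penalized, and fingertip contact with target keys
# adds a small shaping bonus. Weights are arbitrary illustrations.
def touch_augmented_reward(pressed, target, touching_target):
    """pressed, target: sets of key indices; touching_target: bool."""
    correct = len(pressed & target)
    wrong = len(pressed - target)
    return correct - 0.5 * wrong + (0.1 if touching_target else 0.0)

print(touch_augmented_reward({60, 64}, {60, 64, 67}, True))  # 2.1
```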
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals (a compact stand-in is sketched below).
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
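As a structural sketch only: the two-stage pipeline above (pose encoding, then audio-visual fusion) can be caricatured in a few lines of PyTorch. A plain MLP stands in for the paper's context-aware graph network, and all dimensions are invented.

```python
# Caricature of the two-stage idea: encode body/finger keypoints, then
# fuse with an audio embedding. An MLP stands in for the paper's
# context-aware graph network; dimensions are arbitrary.
import torch
import torch.nn as nn

class KeypointAudioFusion(nn.Module):
    def __init__(self, joints=25, feat=64):
        super().__init__()
        self.pose = nn.Sequential(nn.Linear(joints * 2, feat), nn.ReLU(),
                                  nn.Linear(feat, feat))
        self.fuse = nn.Linear(2 * feat, feat)

    def forward(self, keypoints, audio_emb):
        # keypoints: (batch, joints*2) flattened 2-D joints;
        # audio_emb: (batch, feat) embedding of the audio mixture
        return self.fuse(torch.cat([self.pose(keypoints), audio_emb], dim=-1))

model = KeypointAudioFusion()
out = model(torch.randn(8, 50), torch.randn(8, 64))
print(out.shape)  # torch.Size([8, 64])
```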
- Machine Learning for a Music Glove Instrument
A music glove instrument equipped with force-sensitive, flex, and IMU sensors is trained on an electric piano to learn note sequences.
The glove can then be used on any surface to generate the sequence of notes most closely matching the hand motion (see the sketch below).
arXiv Detail & Related papers (2020-01-27T01:08:11Z)
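The abstract's mapping from glove motion to the "most closely related" note sequence suggests a nearest-neighbor flavor, so the sketch below uses k-NN over sensor windows. The sensor layout, window size, and labels are all invented for illustration.

```python
# Toy glove-to-note mapping: classify a window of force/flex/IMU sensor
# readings into a MIDI note with nearest neighbors. All data is random
# placeholder; a real glove would train on electric-piano recordings.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X_train = rng.random((500, 16))           # 16 sensor channels per window
y_train = rng.integers(60, 72, size=500)  # MIDI notes C4..B4

model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
window = rng.random((1, 16))              # a new motion window, any surface
print("predicted MIDI note:", model.predict(window)[0])
```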
This list is automatically generated from the titles and abstracts of the papers on this site.