Experimenting active and sequential learning in a medieval music manuscript
- URL: http://arxiv.org/abs/2507.15633v1
- Date: Mon, 21 Jul 2025 13:55:54 GMT
- Title: Experimenting active and sequential learning in a medieval music manuscript
- Authors: Sachin Sharma, Federico Simonetta, Michele Flammini
- Abstract summary: We present a preliminary study of Active Learning (AL) and Sequential Learning (SL) tailored for object detection and layout recognition in an old medieval music manuscript. Our system selects the samples with the highest uncertainty (lowest prediction confidence) for iterative labeling and retraining. We show that in the manuscript at hand, uncertainty-based AL is not effective, and we advocate for more usable methods in data-scarcity scenarios.
- Score: 7.790089838318836
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Optical Music Recognition (OMR) is a cornerstone of music digitization initiatives in cultural heritage, yet it remains limited by the scarcity of annotated data and the complexity of historical manuscripts. In this paper, we present a preliminary study of Active Learning (AL) and Sequential Learning (SL) tailored for object detection and layout recognition in an old medieval music manuscript. Leveraging YOLOv8, our system selects the samples with the highest uncertainty (lowest prediction confidence) for iterative labeling and retraining. Our approach starts with a single annotated image and successfully boosts performance while minimizing manual labeling. Experimental results indicate that accuracy comparable to fully supervised training can be achieved with significantly fewer labeled examples. We test the methodology as a preliminary investigation on a novel dataset offered to the community by the Anonymous project, which studies laude, a poetical-musical genre spread across Italy during the 12th to 16th centuries. We show that in the manuscript at hand, uncertainty-based AL is not effective, and we advocate for more usable methods in data-scarcity scenarios.
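The selection criterion described in the abstract, namely labeling next the pages where the detector is least confident, can be sketched as follows. The page names, the prediction format, and the mean-confidence aggregation are illustrative assumptions, not the authors' implementation:

```python
import random

def mean_confidence(detections):
    """Average confidence over the detections predicted for one page.

    Pages with no detections are treated as maximally uncertain.
    """
    if not detections:
        return 0.0
    return sum(d["conf"] for d in detections) / len(detections)

def select_most_uncertain(predictions, k):
    """Rank unlabeled pages by mean detection confidence (ascending)
    and return the k most uncertain ones for manual annotation."""
    ranked = sorted(predictions, key=lambda p: mean_confidence(p["detections"]))
    return [p["page"] for p in ranked[:k]]

# Toy predictions standing in for one object-detection inference pass.
random.seed(0)
unlabeled = [
    {"page": f"folio_{i:03d}",
     "detections": [{"conf": random.random()} for _ in range(random.randint(0, 5))]}
    for i in range(20)
]
batch = select_most_uncertain(unlabeled, k=3)
print(batch)  # the 3 pages the annotator would label next
```

In the AL loop, the selected pages would then be annotated, added to the training set, and the detector retrained before the next selection round.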
Related papers
- NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach [0.03187482513047917]
We show how readily-available, state-of-the-art LLMs significantly outperform two leading NLP frameworks for NER in historical documents.
Our approach democratises access to NER for all historians by removing the barrier of scripting languages and computational skills required for established NLP tools.
arXiv Detail & Related papers (2025-02-04T16:54:23Z)
- Low-Data Classification of Historical Music Manuscripts: A Few-Shot Learning Approach [0.0]
We develop a self-supervised learning framework for the classification of musical symbols in historical manuscripts.
We overcome this challenge by training a neural-based feature extractor on unlabelled data, enabling effective classification with minimal samples.
arXiv Detail & Related papers (2024-11-25T14:14:25Z)
- LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging [65.72891334156706]
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification.
LC-Protonets generate one prototype per label combination, derived from the power set of labels present in the limited training items.
Our method is applied to automatic audio tagging across diverse music datasets, covering various cultures and including both modern and traditional music.
arXiv Detail & Related papers (2024-09-17T15:13:07Z)
- Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach [49.2787113554916]
Estimating music piece difficulty is important for organizing educational music collections.
Our work employs explainable descriptors for difficulty estimation in symbolic music representations.
Our approach, evaluated on a piano repertoire categorized into 9 classes, achieved 41.4% accuracy on its own, with a mean squared error (MSE) of 1.7.
arXiv Detail & Related papers (2024-08-01T11:23:42Z)
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
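The contrast between threshold filtering and importance weighting can be illustrated with a toy sketch. Note that the paper derives its weights from classifier training dynamics; this stand-in weights by raw confidence purely for illustration:

```python
def hard_filter(confidences, threshold=0.9):
    """Baseline: keep only pseudo-labeled examples above an arbitrary
    confidence threshold; everything else is discarded."""
    return [1.0 if c >= threshold else 0.0 for c in confidences]

def soft_weights(confidences):
    """Soft alternative: every automatically labeled example contributes
    to training, weighted by confidence, so no data is discarded outright."""
    total = sum(confidences)
    return [c / total for c in confidences]

confs = [0.95, 0.80, 0.60, 0.30]
print(hard_filter(confs))  # only one example survives the cutoff
print(soft_weights(confs))  # all four examples contribute, with graded weight
```

The soft scheme maximizes the usage of automatically labeled data, which is the motivation the summary above describes.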
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Self-supervised Pre-training of Text Recognizers [0.0]
We study self-supervised pre-training methods based on masked label prediction.
We perform experiments on historical handwritten (Bentham) and historical printed datasets.
The evaluation shows that the self-supervised pre-training on data from the target domain is very effective, but it struggles to outperform transfer learning from closely related domains.
arXiv Detail & Related papers (2024-05-01T09:58:57Z)
- VGTS: Visually Guided Text Spotting for Novel Categories in Historical Manuscripts [26.09365732823049]
We propose a Visually Guided Text Spotting (VGTS) approach that accurately spots novel characters using just one annotated support sample.
The DSA block aims to identify, focus on, and learn discriminative spatial regions in the support and query images, mimicking the human visual spotting process.
To tackle the example imbalance problem in low-resource spotting tasks, we develop a novel torus loss function that enhances the discriminative power of the embedding space for distance metric learning.
arXiv Detail & Related papers (2023-04-03T06:40:52Z)
- CSSL-MHTR: Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition [16.987008461171065]
We explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition.
Our method adds intermediate layers called adapters for each task and efficiently distills knowledge from the previous model while learning the current task.
We attain state-of-the-art performance on English, Italian and Russian scripts, whilst adding only a few parameters per task.
arXiv Detail & Related papers (2023-03-16T14:27:45Z)
- PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery [39.03732147384566]
The Generalized Novel Category Discovery (GNCD) setting aims to categorize unlabeled training data coming from both known and novel classes.
We propose Contrastive Affinity Learning method with auxiliary visual Prompts, dubbed PromptCAL, to address this challenging problem.
Our approach discovers reliable pairwise sample affinities to learn better semantic clustering of both known and novel classes for the class token and visual prompts.
arXiv Detail & Related papers (2022-12-11T20:06:14Z)
- Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets.
We define a uniform evaluation setup including a new formalization of the annotation error detection task.
We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
- Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z)
- Confident Coreset for Active Learning in Medical Image Analysis [57.436224561482966]
We propose a novel active learning method, confident coreset, which considers both uncertainty and distribution for effectively selecting informative samples.
In comparative experiments on two medical image analysis tasks, we show that our method outperforms other active learning methods.
arXiv Detail & Related papers (2020-04-05T13:46:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.