Related papers: LC-Protonets: Multi-label Few-shot learning for world music audio tagging

LC-Protonets: Multi-label Few-shot learning for world music audio tagging

URL: http://arxiv.org/abs/2409.11264v1
Date: Tue, 17 Sep 2024 15:13:07 GMT
Title: LC-Protonets: Multi-label Few-shot learning for world music audio tagging
Authors: Charilaos Papaioannou, Emmanouil Benetos, Alexandros Potamianos,
Abstract summary: We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification. LC-Protonets generate one prototype per label combination, derived from the power set of labels present in the limited training items. Our method is applied to automatic audio tagging across diverse music datasets, covering various cultures and including both modern and traditional music.
Score: 65.72891334156706
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification, where a model must generalize to new classes based on only a few available examples. Extending Prototypical Networks, LC-Protonets generate one prototype per label combination, derived from the power set of labels present in the limited training items, rather than one prototype per label. Our method is applied to automatic audio tagging across diverse music datasets, covering various cultures and including both modern and traditional music, and is evaluated against existing approaches in the literature. The results demonstrate a significant performance improvement in almost all domains and training setups when using LC-Protonets for multi-label classification. In addition to training a few-shot learning model from scratch, we explore the use of a pre-trained model, obtained via supervised learning, to embed items in the feature space. Fine-tuning improves the generalization ability of all methods, yet LC-Protonets achieve high-level performance even without fine-tuning, in contrast to the comparative approaches. We finally analyze the scalability of the proposed method, providing detailed quantitative metrics from our experiments. The implementation and experimental setup are made publicly available, offering a benchmark for future research.

Related papers

Multiple Stochastic Prompt Tuning for Practical Cross-Domain Few Shot Learning [14.85375816073596]
We propose a cross-domain few-shot learning task, where a large-scale pre-trained model like CLIP can be easily deployed on a target dataset.<n>The goal is to simultaneously classify all unseen classes under extreme domain shifts, by utilizing only a few labeled samples per class.<n>We propose a novel framework, termed MIST (MultIple STochastic Prompt tuning), where multiple prompts are utilized to handle significant domain and semantic shifts.
arXiv Detail & Related papers (2025-06-04T13:18:04Z)
Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn massive data to model unified representations of images and natural languages. In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for pre-trained model application and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z)
Learning New Tasks from a Few Examples with Soft-Label Prototypes [18.363177410917597]
We propose a novel few-shot learning approach based on soft-label prototypes (SLPs) We focus on learning previously unseen NLP tasks from very few examples (4, 8, 16) per class. We experimentally demonstrate that our approach achieves superior performance on the majority of tested tasks in this data-lean setting.
arXiv Detail & Related papers (2022-10-31T16:06:48Z)
Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning [41.07029317930986]
We propose a variance-sensitive class of models that operates in a low-label regime. The first method, Simple CNAPS, employs a hierarchically regularized Mahalanobis-distance based classifier. We further extend this approach to a transductive learning setting, proposing Transductive CNAPS.
arXiv Detail & Related papers (2022-01-13T18:59:02Z)
Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets. We propose a novel general and efficient active learning (GEAL) method in this paper. Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z)
Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot Learning has been studied to mimic human visual capabilities and learn effective models without the need of exhaustive human annotation. In this paper, we focus on the design of training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples. We propose a two-stage training scheme, which first trains a partner encoder to model pair-wise similarities and extract features serving as soft-anchors, and then trains a main encoder by aligning its outputs with soft-anchors while attempting to maximize classification performance.
arXiv Detail & Related papers (2021-09-15T22:46:19Z)
Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning [58.2091760793799]
We propose a novel contrastive prototype learning with augmented embeddings (CPLAE) model. With a class prototype as an anchor, CPL aims to pull the query samples of the same class closer and those of different classes further away. Extensive experiments on several benchmarks demonstrate that our proposed CPLAE achieves new state-of-the-art.
arXiv Detail & Related papers (2021-01-23T13:22:44Z)
Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings. We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data. We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels [49.036212158261215]
Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications. Current state-of-the-art LMTC models employ Label-Wise Attention Networks (LWANs) We show that hierarchical methods based on Probabilistic Label Trees (PLTs) outperform LWANs. We propose a new state-of-the-art method which combines BERT with LWANs.
arXiv Detail & Related papers (2020-10-04T18:55:47Z)
UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training. We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.