ByteCover: Cover Song Identification via Multi-Loss Training
- URL: http://arxiv.org/abs/2010.14022v2
- Date: Fri, 23 Apr 2021 06:30:03 GMT
- Title: ByteCover: Cover Song Identification via Multi-Loss Training
- Authors: Xingjian Du, Zhesong Yu, Bilei Zhu, Xiaoou Chen, Zejun Ma
- Abstract summary: ByteCover is a new feature learning method for cover song identification (CSI).
Two major improvements are designed to further enhance the capability of the model for CSI.
A set of experiments demonstrated the effectiveness and efficiency of ByteCover on multiple datasets.
- Score: 20.215501383270706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present ByteCover, a new feature learning method for cover song identification (CSI). ByteCover is built on the classical ResNet model, with two major improvements designed to further enhance its capability for CSI. First, we integrate instance normalization (IN) and batch normalization (BN) into IBN blocks, the major components of our ResNet-IBN model. With the help of the IBN blocks, our CSI model can learn features that are invariant to changes in musical attributes such as key, tempo, timbre, and genre, while preserving the version information. Second, we employ the BNNeck method to enable multi-loss training, jointly optimizing a classification loss and a triplet loss so that the inter-class discrimination and intra-class compactness of cover songs are ensured at the same time. A set of experiments demonstrated the effectiveness and efficiency of ByteCover on multiple datasets; on the Da-TACOS dataset, ByteCover outperformed the best competing system by 20.9%.
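As a rough illustration of the first improvement (not the authors' released code), the following PyTorch sketch shows how an IBN block can combine instance normalization and batch normalization over split channels; the class name, the 50/50 channel split, and its placement inside the residual block are assumptions.

```python
import torch
import torch.nn as nn

class IBN(nn.Module):
    """Split-channel normalization: instance norm on one half of the
    channels, batch norm on the other (illustrative sketch only)."""
    def __init__(self, planes, ratio=0.5):
        super().__init__()
        self.half = int(planes * ratio)                  # channels handled by instance norm
        self.IN = nn.InstanceNorm2d(self.half, affine=True)
        self.BN = nn.BatchNorm2d(planes - self.half)

    def forward(self, x):
        # IN discards instance-specific "style" statistics (key, timbre, ...),
        # while BN keeps content statistics shared across the batch.
        first, rest = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.IN(first), self.BN(rest)], dim=1)
```

The intended effect is that the instance-normalized half suppresses per-recording attribute variation while the batch-normalized half preserves the information needed to identify the underlying work.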
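The second improvement, BNNeck-based multi-loss training, can be sketched in the same spirit: the triplet loss operates on the raw embedding, while the classification loss operates on a batch-normalized copy of it. The feature dimension, number of classes, margin, and the multi_loss helper below are illustrative assumptions, not ByteCover's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BNNeckHead(nn.Module):
    """BNNeck-style head: triplet loss on the raw embedding,
    classification loss on its batch-normalized copy (sketch only)."""
    def __init__(self, feat_dim=2048, num_classes=1000):
        super().__init__()
        self.bnneck = nn.BatchNorm1d(feat_dim)
        self.bnneck.bias.requires_grad_(False)           # common BNNeck practice: freeze the bias
        self.classifier = nn.Linear(feat_dim, num_classes, bias=False)

    def forward(self, feat):
        feat_bn = self.bnneck(feat)                      # normalized embedding for the softmax branch
        return feat, self.classifier(feat_bn)            # (embedding for triplet loss, logits for CE)

def multi_loss(feat, logits, labels, anc, pos, neg, margin=0.3):
    # Joint objective: cross-entropy enforces inter-class discrimination,
    # the triplet term enforces intra-class compactness of cover versions.
    # anc/pos/neg are precomputed index tensors; in-batch hard mining,
    # as typically used with BNNeck, is not reproduced here.
    ce = F.cross_entropy(logits, labels)
    tri = F.triplet_margin_loss(feat[anc], feat[pos], feat[neg], margin=margin)
    return ce + tri
```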
Related papers
- UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation [38.331860053615955]
This paper introduces a novel framework for unified incremental few-shot object detection (iFSOD) and instance segmentation (iFSIS) using the Transformer architecture.
Our goal is to create an optimal solution for situations where only a few examples of novel object classes are available.
arXiv Detail & Related papers (2024-11-13T12:29:44Z) - BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering deployment on edge devices.
We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers.
Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z) - A New Learning Paradigm for Foundation Model-based Remote Sensing Change Detection [54.01158175996638]
Change detection (CD) is a critical task to observe and analyze dynamic processes of land cover.
We propose a Bi-Temporal Adapter Network (BAN), which is a universal foundation model-based CD adaptation framework.
arXiv Detail & Related papers (2023-12-02T15:57:17Z) - Unified Batch Normalization: Identifying and Alleviating the Feature Condensation in Batch Normalization and a Unified Framework [55.22949690864962]
Batch Normalization (BN) has become an essential technique in contemporary neural network design.
We propose a two-stage unified framework called Unified Batch Normalization (UBN).
UBN significantly enhances performance across different visual backbones and different vision tasks.
arXiv Detail & Related papers (2023-11-27T16:41:31Z) - CoverHunter: Cover Song Identification with Refined Attention and Alignments [19.173689175634106]
Cover song identification (CSI) focuses on finding different versions of the same music among reference anchors, given a query track.
We propose a novel system named CoverHunter that overcomes the shortcomings of existing detection schemes.
arXiv Detail & Related papers (2023-06-15T10:34:20Z) - Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification [61.411869453639845]
We introduce a bi-reconstruction mechanism that can simultaneously accommodate inter-class and intra-class variations.
This design effectively helps the model to explore more subtle and discriminative features.
Experimental results on three widely used fine-grained image classification datasets consistently show considerable improvements.
arXiv Detail & Related papers (2022-11-30T16:55:14Z) - Content-aware Scalable Deep Compressed Sensing [8.865549833627794]
We present a novel content-aware scalable network dubbed CASNet to address image compressed sensing problems.
We first adopt a data-driven saliency detector to evaluate the importance of different image regions and propose a saliency-based block ratio aggregation (BRA) strategy for sampling rate allocation.
To accelerate training convergence and improve network robustness, we propose an SVD-based scheme and a random transformation enhancement (RTE) strategy.
arXiv Detail & Related papers (2022-07-19T14:59:14Z) - Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC).
We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss.
Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - Generalized Reinforcement Meta Learning for Few-Shot Optimization [3.7675996866306845]
We present a generic and flexible Reinforcement Learning (RL) based meta-learning framework for the problem of few-shot learning.
Our framework could be easily extended to do network architecture search.
arXiv Detail & Related papers (2020-05-04T03:21:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.