Do Recommender Systems Promote Local Music? A Reproducibility Study Using Music Streaming Data
- URL: http://arxiv.org/abs/2408.16430v1
- Date: Thu, 29 Aug 2024 10:44:59 GMT
- Title: Do Recommender Systems Promote Local Music? A Reproducibility Study Using Music Streaming Data
- Authors: Kristina Matrosova, Lilian Marey, Guillaume Salha-Galvan, Thomas Louail, Olivier Bodini, Manuel Moussallam,
- Abstract summary: This paper examines the influence of recommender systems on local music representation.
Prior study argued that different recommender systems exhibit algorithmic biases shifting music consumption either towards or against local content.
To assess the robustness of this study's conclusions, we conduct a comparative analysis using proprietary listening data from a global music streaming service.
- Score: 3.023348267060456
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper examines the influence of recommender systems on local music representation, discussing prior findings from an empirical study on the LFM-2b public dataset. This prior study argued that different recommender systems exhibit algorithmic biases shifting music consumption either towards or against local content. However, LFM-2b users do not reflect the diverse audience of music streaming services. To assess the robustness of this study's conclusions, we conduct a comparative analysis using proprietary listening data from a global music streaming service, which we publicly release alongside this paper. We observe significant differences in local music consumption patterns between our dataset and LFM-2b, suggesting that caution should be exercised when drawing conclusions on local music based solely on LFM-2b. Moreover, we show that the algorithmic biases exhibited in the original work vary in our dataset, and that several unexplored model parameters can significantly influence these biases and affect the study's conclusion on both datasets. Finally, we discuss the complexity of accurately labeling local music, emphasizing the risk of misleading conclusions due to unreliable, biased, or incomplete labels. To encourage further research and ensure reproducibility, we have publicly shared our dataset and code.
Related papers
- Lazy Data Practices Harm Fairness Research [49.02318458244464]
We present a comprehensive analysis of fair ML datasets, demonstrating how unreflective practices hinder the reach and reliability of algorithmic fairness findings.
Our analyses identify three main areas of concern: (1) a textbflack of representation for certain protected attributes in both data and evaluations; (2) the widespread textbf of minorities during data preprocessing; and (3) textbfopaque data processing threatening the generalization of fairness research.
This study underscores the need for a critical reevaluation of data practices in fair ML and offers directions to improve both the sourcing and usage of datasets.
arXiv Detail & Related papers (2024-04-26T09:51:24Z) - On the Effect of Data-Augmentation on Local Embedding Properties in the
Contrastive Learning of Music Audio Representations [6.255143207183722]
We show that musical properties that are homogeneous within a track are reflected in the locality of neighborhoods in the resulting embedding space.
We show that the optimal selection of data augmentation strategies for contrastive learning of music audio embeddings is dependent on the downstream task.
arXiv Detail & Related papers (2024-01-17T00:12:13Z) - Perceptual Musical Features for Interpretable Audio Tagging [2.1730712607705485]
This study explores the relevance of interpretability in the context of automatic music tagging.
We constructed a workflow that incorporates three different information extraction techniques.
We conducted experiments on two datasets, namely the MTG-Jamendo dataset and the GTZAN dataset.
arXiv Detail & Related papers (2023-12-18T14:31:58Z) - Fairness Through Domain Awareness: Mitigating Popularity Bias For Music
Discovery [56.77435520571752]
We explore the intrinsic relationship between music discovery and popularity bias.
We propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNNs) based recommender systems.
Our approach uses individual fairness to reflect a ground truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations.
arXiv Detail & Related papers (2023-08-28T14:12:25Z) - MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 public-available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z) - Mitigating Representation Bias in Action Recognition: Algorithms and
Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z) - A Stakeholder-Centered View on Fairness in Music Recommender Systems [5.901337162013615]
The review first outlines current literature on MRS fairness from the perspective of each stakeholder and the stakeholders combined.
The two open questions arising from the review are as follows: (i) In the MRS field, only limited data is publicly available to conduct fairness research; most datasets either originate from the same source or are proprietary.
Overall, the review shows that the large majority of works analyze the current situation of MRS fairness, whereas only few works propose approaches to improve it.
arXiv Detail & Related papers (2022-09-08T16:02:16Z) - dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z) - Provably Efficient Causal Reinforcement Learning with Confounded
Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.