SpotHitPy: A Study For ML-Based Song Hit Prediction Using Spotify
- URL: http://arxiv.org/abs/2301.07978v1
- Date: Thu, 19 Jan 2023 10:13:52 GMT
- Title: SpotHitPy: A Study For ML-Based Song Hit Prediction Using Spotify
- Authors: Ioannis Dimolitsas, Spyridon Kantarelis, Afroditi Fouka
- Abstract summary: We gathered a dataset of nearly 18500 hit and non-hit songs.
We extracted their audio features using the Spotify Web API.
We were able to predict the Billboard success of a song with approximately 86% accuracy.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this study, we approached the Hit Song Prediction problem, which aims to
predict which songs will become Billboard hits. We gathered a dataset of nearly
18500 hit and non-hit songs and extracted their audio features using the
Spotify Web API. We tested four machine-learning models on our dataset. We were
able to predict the Billboard success of a song with approximately 86%
accuracy. The most successful algorithms were Random Forest and Support Vector
Machine.
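As a point of reference for the accuracy figure quoted above, the following is a minimal sketch of how accuracy, and the related F1 score, can be computed for a binary hit/non-hit classifier. It is not the authors' code: the function names and the toy labels are assumptions made purely for illustration, and no data from the paper is used.

```python
def accuracy(y_true, y_pred):
    """Fraction of songs whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (hit) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, invented for illustration: 1 = Billboard hit, 0 = non-hit.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

Accuracy counts both classes equally, so it is most informative when hits and non-hits are roughly balanced in the dataset; F1 focuses on the hit class and is often reported alongside it.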
Related papers
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z) - Beyond Beats: A Recipe to Song Popularity? A machine learning approach [2.6422127672474933]
This study aims to explore the predictive power of various machine learning models in forecasting song popularity.
We employ Ordinary Least Squares (OLS) regression analysis to analyse song characteristics and their impact on popularity.
Random Forest emerges as the most effective model, improving prediction accuracy by 7.1% compared to average scores.
arXiv Detail & Related papers (2024-03-01T17:14:41Z) - GETMusic: Generating Any Music Tracks with a Unified Representation and
Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.
arXiv Detail & Related papers (2023-05-18T09:53:23Z) - ASPEST: Bridging the Gap Between Active Learning and Selective
Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z) - An Analysis of Classification Approaches for Hit Song Prediction using
Engineered Metadata Features with Lyrics and Audio Features [5.871032585001082]
This study aims to improve the prediction result of the top 10 hits among Billboard Hot 100 songs using more alternative metadata.
Five machine learning approaches are applied, including: k-nearest neighbours, Naive Bayes, Random Forest, Logistic Regression and Multilayer Perceptron.
Our results show that Random Forest (RF) and Logistic Regression (LR) with all features outperform other models, achieving 89.1% and 87.2% accuracy, and 0.91 and 0.93 AUC, respectively.
arXiv Detail & Related papers (2023-01-31T09:48:53Z) - Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music
Generation Task [86.72661027591394]
We generate complete and semantically consistent symbolic music scores from text descriptions.
We explore the efficacy of using publicly available checkpoints for natural language processing in the task of text-to-music generation.
Our experimental results show that the improvement from using pre-trained checkpoints is statistically significant in terms of BLEU score and edit distance similarity.
arXiv Detail & Related papers (2022-11-21T07:19:17Z) - Context-Based Music Recommendation Algorithm Evaluation [0.0]
This paper explores 6 machine learning algorithms and their individual accuracy for predicting whether a user will like a song.
The algorithms explored include Logistic Regression, Naive Bayes, Sequential Minimal Optimization (SMO), Multilayer Perceptron (Neural Network), Nearest Neighbor, and Random Forest.
With the analysis of the specific characteristics of each song provided by the Spotify API, Random Forest is the most successful algorithm for predicting whether a user will like a song with an accuracy of 84%.
arXiv Detail & Related papers (2021-12-16T01:46:36Z) - Hit Song Prediction Based on Early Adopter Data and Audio Features [5.88864611435337]
This research provides a new strategy for assessing the hit potential of songs.
A number of models were developed that use both audio data and social media listening behaviour.
The results show that models based on early adopter behaviour perform well when predicting top 20 dance hits.
arXiv Detail & Related papers (2020-10-16T06:42:40Z) - dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z) - Predicting Afrobeats Hit Songs Using Spotify Data [0.0]
A dataset of 2063 songs was generated through the Spotify Web API.
Random Forest and Gradient Boosting algorithms proved to be successful, with F1 scores of approximately 86%.
arXiv Detail & Related papers (2020-07-07T00:14:30Z) - Jukebox: A Generative Model for Music [75.242747436901]
Jukebox is a model that generates music with singing in the raw audio domain.
We tackle the long context of raw audio using a multi-scale VQ-VAE to compress it to discrete codes.
We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes.
arXiv Detail & Related papers (2020-04-30T09:02:45Z)