309 e-ISSN: 2980-4108 p-ISSN: 2980-4272 IJEBSS
IJEBSS Vol. 1 No. 04, April 2023, pages: 308-320
hip-hop etc. Over the years, the idea and perception of music have evolved, giving rise to further genres and
sub-genres. The way people consume music has also changed with advances in technology.
Music genre classification, a subtask of music information retrieval, is a challenging and active problem in
the AI domain. It uses machine learning algorithms to recognize the genre, that is, the style or category, of a
given music audio file. For example, the algorithm tries to distinguish a rock music file from a classical music
file based on the features of the audio. Automatic genre classification has many applications. It can be used in
an audio streaming platform or app (e.g. SoundCloud, Spotify, Gaana) for categorization and music recommendation,
which in turn can be used to curate playlists by genre. The algorithm can also be released as a standalone product,
such as a music identification app (e.g. Shazam). It can further be used in smart assistants such as Alexa, Google
Assistant and Siri on our smartphones to enhance the user's music listening experience.
In this paper we perform music genre classification using various machine learning algorithms and compare the
accuracy attained by the individual algorithms. We also examine ensemble techniques for music genre classification
and whether they are more efficient than classical algorithms. The paper is structured as follows: related work
discusses past work on music genre classification, followed by dataset and data preprocessing, followed by models,
which lists the machine learning models we used for the task, followed by results and discussion, which summarizes
the results we achieved, followed by future scope and conclusion, ending with references.
There has been considerable work on music genre classification using convolutional neural networks, recurrent
neural networks, combinations of both, and even feed-forward networks.
K. Choi et al. [1] presented a convolutional recurrent model for recognizing genre, mood, instruments and
era on the Million Song Dataset. They used a 2D convolution model followed by recurrent layers and fully connected
layers to perform the classification task. Our CRNN model is similar, but we used 1D convolution layers
instead of 2D (Choi et al., 2017).
P. Kozakowski & B. Michalak [2] from DeepSound used a 1D convolution model followed by a time-distributed
dense layer on the GTZAN dataset. We took the idea of 1D convolution layers from them, but found that
RNN layers after the 1D CNN performed better on our dataset.
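To make the distinction between 1D and 2D convolution concrete, the following is a minimal sketch, not the implementation used in [1], [2] or in this paper. It assumes a spectrogram stored as a list of time frames and a single filter that spans all frequency bins and slides along the time axis only; the function name `conv1d_over_time` and the toy values are assumptions made for illustration.

```python
def conv1d_over_time(spectrogram, kernel):
    """Valid-mode 1D convolution (cross-correlation, as in most deep
    learning libraries) along the time axis.

    spectrogram: list of T frames, each a list of F frequency-bin values.
    kernel: list of K rows, each a list of F weights, so one filter covers
            all frequency bins and moves only along time.
    Returns a list of T - K + 1 activations.
    """
    n_frames, width = len(spectrogram), len(kernel)
    out = []
    for t in range(n_frames - width + 1):
        acc = 0.0
        for k in range(width):
            frame, weights = spectrogram[t + k], kernel[k]
            acc += sum(x * w for x, w in zip(frame, weights))
        out.append(acc)
    return out


# Toy input: 5 time frames, 3 frequency bins (made-up values).
spec = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
]
# One filter covering 2 consecutive frames and all 3 bins; it responds to
# changes in total energy between neighbouring frames.
kernel = [
    [1.0, 1.0, 1.0],
    [-1.0, -1.0, -1.0],
]
activations = conv1d_over_time(spec, kernel)
print(activations)  # one value per time position
```

A 2D convolution would instead use a filter smaller than the frequency axis and slide it along both time and frequency, producing a 2D activation map rather than a single sequence.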
L. Feng et al. [3] placed CNN and RNN blocks in parallel so that the RNN layer works on the raw spectrograms
instead of the output of the CNN. Our parallel CNN-RNN model was heavily influenced by this paper, and our final
architecture is similar to theirs, with some modifications because our dataset is much smaller (Feng et al., 2017).
N. Pelchat et al. [4] review some of the machine learning techniques used in the domain. They used
spectrograms generated from time slices of songs as input to the neural networks (Pelchat & Gelowitz, 2020).
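The idea of generating spectrograms from time slices can be sketched as follows. This is an illustrative example under stated assumptions, not the pipeline of [4]: the signal is a synthetic sine tone, the frame length and hop size are arbitrary, no window function is applied, and a naive DFT is used in place of an FFT to keep the example dependency-free. All function names are hypothetical.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a 1D signal into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of the naive DFT for bins 0..N/2 (O(N^2), demo only)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_len=16, hop=8):
    """One magnitude spectrum per time slice: rows are time, columns are bins."""
    return [magnitude_spectrum(f) for f in frame_signal(samples, frame_len, hop)]


# Synthetic tone with 2 cycles per 16-sample frame, so its energy should
# concentrate in frequency bin 2 of every frame.
signal = [math.sin(2 * math.pi * 2 * i / 16) for i in range(64)]
spec = spectrogram(signal)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(len(spec), len(spec[0]), peak_bins)
```

In practice a library routine with windowing and a mel-scaled frequency axis would normally be used; the sketch only shows how each time slice becomes one column of the time-frequency representation fed to the network.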
T. Lidy et al. [5] present an approach using parallel CNN architectures to classify the genre of music files.
They worked on the MIREX 2016 Train/Test Classification Tasks for Genre, Mood and Composer detection
(Lidy & Schindler, 2016).
K. K. Chang et al. [6] introduce an approach to music genre classification using compressive sampling (CS).
They use a CS-based classifier that uses both short-term and long-term features of the audio file (Chang et al., 2010).
I. Y. Jeong et al. [7] describe a framework that learns temporal features from audio using deep neural
networks and uses them for music genre classification (Jeong & Lee, 2016).
P. Annesi et al. [8] used a Support Vector Machine to design an automatic classifier of music genres. They
used certain conventional features and engineered some new ones, such as beat and chorus, for enhanced accuracy
(Annesi et al., 2007).
Y. M. D. Chathuranga et al. [9] proposed an ensemble approach using frequency-domain, temporal-domain
and cepstral-domain features. They used a Support Vector Machine as the base learner and the AdaBoost
technique for classification (Chathuranga & Jayaratne, 2013).
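The boosting step of such an ensemble can be sketched as below. This is a simplified illustration and not the method of [9]: it uses the binary form of AdaBoost with labels +1 and -1, and decision stumps are substituted for the Support Vector Machine base learner to keep the example short. The function names and the one-dimensional toy data are assumptions.

```python
import math


def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] <= thr else -pol


def adaboost_fit(X, y, rounds=3):
    """Binary AdaBoost: reweight samples so later learners focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)          # avoid division by zero
        if err >= 0.5:                      # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy 1D data that no single threshold can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
print(predictions)
```

The toy labels cannot be separated by one stump, but the weighted vote of three stumps fits them, which is the motivation for combining weak base learners. A multi-class genre task would need a multi-class variant or a one-vs-rest scheme.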
S. Jothilakshmi et al. [10] explored the use of a Gaussian Mixture Model (GMM) and a k-Nearest
Neighbour (kNN) classifier for the task. They worked specifically on Indian genres such as Hindustani, Carnatic,
Ghazal, Folk and Indian Western (Jothilakshmi & Kathiresan, 2012).
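For reference, the kNN half of such an approach reduces to a majority vote among the nearest training examples. The sketch below is a minimal illustration and not the system of [10]: the two-dimensional feature vectors are made-up values standing in for real audio features, only two of the genre labels are used, and the function name `knn_predict` is an assumption.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs.
    Returns the majority label among the k training points closest to query
    under Euclidean distance."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2D features (e.g. two summary statistics per track).
train = [
    ([0.10, 0.20], "Carnatic"),
    ([0.20, 0.10], "Carnatic"),
    ([0.15, 0.25], "Carnatic"),
    ([0.90, 0.80], "Ghazal"),
    ([0.80, 0.90], "Ghazal"),
    ([0.85, 0.75], "Ghazal"),
]

label_a = knn_predict(train, [0.2, 0.2])
label_b = knn_predict(train, [0.8, 0.8])
print(label_a, label_b)
```

The choice of k and of the distance measure are tuning decisions; features on different scales would normally be standardized before distances are computed.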