ttperr/SD201_Project

SD201 Data Mining and Machine Learning Project

TEAM MEMBERS : Farah JABRI, Yassine BENBIHI, Louiza AOUAOUCHE & Tristan PERROT

This project, "Music genre classification based on lyrics", was carried out in a purely academic context by students of Télécom Paris.

This repository contains:

  • Scrapping:
    • geniusScrapping.py : the scraping code used to collect lyrics from the Genius website.
  • datasets/genius-scrap.csv : the CSV file produced by the scraping; it contains all the lyrics we used*
  • stopwords.json : the stopword list we used
  • notebook.ipynb : the whole process, from data cleaning to the modeling results.

We also used a Kaggle dataset. Download its files to datasets/kaggle-lyrics-data.csv and datasets/kaggle-artists-data.csv
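Before opening the notebook, it can help to confirm that the data files are where this README says they should be. The following is a minimal sketch, not part of the project itself: the paths are the ones listed in this README, while the helper name `missing_files` and the constant `EXPECTED_FILES` are hypothetical.

```python
from pathlib import Path

# Paths taken from this README; adjust them if your layout differs.
EXPECTED_FILES = [
    "datasets/genius-scrap.csv",
    "datasets/kaggle-lyrics-data.csv",
    "datasets/kaggle-artists-data.csv",
    "stopwords.json",
]


def missing_files(root=".", expected=EXPECTED_FILES):
    """Return the expected files that do not exist under `root` (hypothetical helper)."""
    root_path = Path(root)
    return [name for name in expected if not (root_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All expected files are present.")
```

Run it from the root of the repository so that the relative paths resolve correctly.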

Warning : some libraries (e.g. langdetect, nltk) must be installed before launching the notebook if they are not already available on your computer. To install one, run : pip install [name of the package]
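To find out which packages still need installing, a short check along these lines may be useful. It is a sketch under assumptions: only langdetect and nltk are named in this README, so the notebook may import other packages not listed here, and the helper name `missing_packages` is hypothetical.

```python
import importlib.util

# Only these two packages are named in this README; the notebook may need more.
REQUIRED_PACKAGES = ["langdetect", "nltk"]


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the packages that cannot be found in the current environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Print one install command per missing package.
    for name in missing_packages():
        print(f"pip install {name}")
```

For these two packages the import name and the pip name are the same; that is not true of every library, so treat the printed commands as suggestions.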
