Project from a Data Science master's class in which our group predicts loan grades from Lending Club data.
Members: Bipasha Kundu, Sudipta Paul and Noah Wong
- Final Report
- Final Presentation
- Data: Download the dataset from the reference [1] and use the file named "accepted_2007_to2018Q4.csv"
- Python Code
  - Data Preprocessing: Reads in the data file, organizes the data, and exports the cleaned data for use by the other two files
  - Exploratory Analysis: Reads in the clean data and creates charts to help understand the dataset
  - Machine Learning Algorithms: Reads in the clean data and builds the different machine learning models
The code was written in Jupyter Notebooks. To run it, first download the data file from Kaggle. Then download the Data Preprocessing notebook and, in its “Import Data File” section, set the path to the location of "accepted_2007_to2018Q4.csv" on your machine. Running this notebook exports a new data file called “Accepted.csv”, which is the clean and organized dataset. Once that file exists, you can run the other two notebooks: in each one, change the file path in the “Import Data File” section so that it points to “Accepted.csv”. Exploratory Analysis runs very quickly, but Machine Learning Algorithms takes a long time to run.
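Because every notebook depends on a correct file path, it can help to check the path before running the rest of the cells. The following is a minimal sketch, not code from the notebooks: the variable names `RAW_DATA_FILE` and `CLEAN_DATA_FILE` and the helper `resolve_data_file` are hypothetical, and it assumes the data files sit in the working directory unless you edit the paths.

```python
from pathlib import Path

# Hypothetical variable names; the notebooks may name these differently.
RAW_DATA_FILE = "accepted_2007_to2018Q4.csv"  # input to Data Preprocessing
CLEAN_DATA_FILE = "Accepted.csv"              # output of Data Preprocessing


def resolve_data_file(path):
    """Return the absolute path to a data file, or raise if it is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(
            f"Data file not found: {resolved}. "
            "Update the path in the 'Import Data File' section."
        )
    return resolved
```

In a notebook, the returned path could then be passed to `pandas.read_csv`, for example `pd.read_csv(resolve_data_file(CLEAN_DATA_FILE))` in the Exploratory Analysis and Machine Learning Algorithms notebooks.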
To run the Python files, you need the following libraries:
- Data Processing: Pandas, NumPy
- Machine Learning: Keras, TensorFlow, Scikit-learn
- Data Visualization: Matplotlib, Seaborn
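
To confirm that the libraries are available before opening the notebooks, a quick check like the one below may be useful. It is a sketch rather than part of the project: the helper `missing_packages` is a hypothetical name, and the list uses import names (note that Scikit-learn is imported as `sklearn`). It only looks the packages up and does not import them.

```python
from importlib.util import find_spec

# Import names for the libraries listed above (Scikit-learn imports as "sklearn").
REQUIRED_PACKAGES = [
    "pandas", "numpy",                 # data processing
    "keras", "tensorflow", "sklearn",  # machine learning
    "matplotlib", "seaborn",           # data visualization
]


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can be installed with `pip`, using `scikit-learn` as the package name for `sklearn`.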