This project predicts loan eligibility using machine learning models implemented in PySpark. The goal is to automate the loan approval decision and make it more accurate and efficient.
Data Imputation: Used mean/median imputation to handle missing values.
Algorithms: Implemented Random Forest and Support Vector Machine (SVM) for classification.
Validation: Employed Stratified K-Fold Cross Validation, which keeps the class ratio roughly the same in every fold, to obtain a more reliable estimate of model performance.
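
The imputation and stratified validation steps above can be illustrated with a minimal sketch in plain Python. It is not the project's actual code: the records, the field names (`income`, `approved`), and the helper functions `impute_median` and `assign_stratified_folds` are hypothetical and exist only for illustration. In PySpark, the equivalent steps would likely go through `pyspark.ml.feature.Imputer` (with `strategy` set to `"mean"` or `"median"`) and a precomputed fold column passed to `CrossValidator` via `foldCol`, but that is an assumption about how the project is wired.

```python
import random
import statistics
from collections import defaultdict


def impute_median(rows, column):
    """Return copies of rows with None in `column` replaced by the observed median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = statistics.median(observed)
    return [
        dict(r, **{column: fill if r[column] is None else r[column]})
        for r in rows
    ]


def assign_stratified_folds(labels, k, seed=0):
    """Return one fold index (0..k-1) per label, keeping class ratios similar per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [0] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share of that class.
        for position, idx in enumerate(indices):
            folds[idx] = position % k
    return folds


# Hypothetical toy records; real loan data would have many more fields.
applicants = [
    {"income": 4000.0, "approved": 1},
    {"income": None, "approved": 0},
    {"income": 2500.0, "approved": 1},
    {"income": 6000.0, "approved": 0},
    {"income": None, "approved": 1},
    {"income": 3000.0, "approved": 0},
]

clean = impute_median(applicants, "income")
folds = assign_stratified_folds([r["approved"] for r in clean], k=3)
```

Each fold index can then serve as the held-out set once while the model trains on the remaining folds. Because every class is dealt out separately, a rare class (for example, rejected applications) cannot end up missing from a fold by chance, which is the point of stratifying.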