I will perform an exploratory data analysis of a healthcare dataset from Kaggle to understand the characteristics of patients who show up for their scheduled appointments.
The project includes the following contents:
- Introduction
- Data Wrangling
- Exploratory Data Analysis
- Conclusions
- Proposal for the next step
- References
The project uses Python 3 with the following packages: numpy, pandas, matplotlib.pyplot, seaborn, and scipy.stats.
Variable Name | Metadata |
---|---|
Patientid | Identification of a patient |
AppointmentID | Identification of each appointment |
Gender | M=Male, F=Female |
ScheduledDay | The date on which the patient scheduled the appointment |
AppointmentDay | The date of the appointment |
Age | The age of the patient (years) |
Neighbourhood | The location of the hospital |
Scholarship | Indicates whether or not the patient was enrolled in the Brazilian welfare program |
Hipertension | 0: non-hypertension, 1: hypertension |
Diabetes | 0: non-diabetes, 1: diabetes |
Alcoholism | 0: non-alcoholic, 1: alcoholic |
Handcap | 0: False, 1: True |
SMS_received | 0: no message was sent to the patient, 1: one or more messages were sent to the patient |
No-show | Yes: the patients did not show up to their appointments, No: the patients showed up |
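
The `No-show` coding is easy to misread, because `Yes` means the patient missed the appointment. As a minimal sketch, the snippet below parses a few made-up rows (not taken from the dataset) and turns `No-show` into a 0/1 flag. It uses only the standard library so that it stays self-contained; the helper `parse_row` and the lowercase key names are hypothetical and only serve as illustration.

```python
import csv
import io

# Made-up sample rows that reuse a few column names from the table above.
sample_csv = """Gender,Age,SMS_received,No-show
F,62,0,No
M,8,1,Yes
F,35,1,No
"""


def parse_row(row):
    """Convert the raw string fields of one appointment to typed values."""
    return {
        "gender": row["Gender"],
        "age": int(row["Age"]),
        "sms_received": int(row["SMS_received"]),
        # "Yes" in No-show means the patient did NOT show up.
        "no_show": 1 if row["No-show"] == "Yes" else 0,
    }


appointments = [parse_row(r) for r in csv.DictReader(io.StringIO(sample_csv))]

# Share of appointments in the sample that were missed.
no_show_rate = sum(a["no_show"] for a in appointments) / len(appointments)
print(f"No-show rate in the sample: {no_show_rate:.2f}")
```

Encoding the outcome as 0/1 once, early on, keeps later summaries from silently inverting the meaning of `Yes` and `No`.
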
- Descriptive statistics (Risk Ratio for binomial variables)
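
For a 0/1 variable, the risk ratio compares the no-show rate among records where the variable is 1 with the rate among records where it is 0. The following is a rough sketch under that definition, not necessarily the calculation used in this project. The function name `risk_ratio` and the toy data are assumptions made for illustration.

```python
def risk_ratio(exposed, outcome):
    """Risk ratio of `outcome` for exposed (1) versus unexposed (0) records.

    Both arguments are equal-length sequences of 0/1 values.
    """
    if len(exposed) != len(outcome):
        raise ValueError("exposed and outcome must have the same length")

    # Group sizes for each exposure level.
    n_exposed = sum(1 for e in exposed if e == 1)
    n_unexposed = sum(1 for e in exposed if e == 0)

    # Number of records with the outcome in each group.
    events_exposed = sum(1 for e, o in zip(exposed, outcome) if e == 1 and o == 1)
    events_unexposed = sum(1 for e, o in zip(exposed, outcome) if e == 0 and o == 1)

    if n_exposed == 0 or n_unexposed == 0 or events_unexposed == 0:
        raise ValueError("risk ratio is undefined for this sample")

    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Toy data, made up for illustration (not taken from the dataset).
sms_received = [1, 1, 1, 1, 0, 0, 0, 0]
no_show = [1, 1, 0, 0, 1, 0, 0, 0]

rr = risk_ratio(sms_received, no_show)
print(f"Risk ratio: {rr:.2f}")
```

A value above 1 means the no-show rate is higher in the group coded 1, a value below 1 means it is lower, and a value of 1 means the two groups have the same rate.
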