lesterpjy/data-studies

A collection of data science case studies

A group project on predicting flight departure delays using the Bureau of Transportation Statistics on-time performance dataset and weather data provided by NOAA. 11 gigabytes of data were cleaned, explored, and engineered with Apache Spark to build a gradient-boosted tree model that predicts departure delay with a precision of 92% and a recall of 86%.
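For readers less familiar with the two metrics quoted above, here is a minimal Python sketch of how precision and recall are defined and how they combine into an F1 score. The function names are illustrative and are not taken from the project's code. The only numbers used are the reported 92% and 86%, and the F1 value is derived from them here rather than reported by the project.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of predicted delays that were actual delays."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Share of actual delays that the model caught."""
    return true_pos / (true_pos + false_neg)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Derived from the reported precision (0.92) and recall (0.86);
# this F1 value is a calculation, not a figure from the project.
reported_f1 = f1_score(0.92, 0.86)
print(f"F1 implied by the reported figures: {reported_f1:.3f}")
```

The harmonic mean weights the lower of the two metrics more heavily, so the implied F1 sits closer to the recall than a simple average would.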

An analysis of BayWheels public bikeshare data on Google Cloud Platform, performed with SQL on BigQuery and AI Platform. The analysis answers the following questions:

  • What are the 5 most popular trips that we would call "commuter trips"?
  • What are recommendations for improving pricing?

A cross-sectional analysis of tweets with the hashtag #DataScience. This study was conducted by Kevin Drever, Ryan Sawasaki, and Lester Yang.