From a9ddf9a19cc47a686be157866c6d4c99ec3ac03e Mon Sep 17 00:00:00 2001
From: Wickramaranga Abeygunawardhana
Date: Fri, 27 Jul 2018 08:58:29 +0530
Subject: [PATCH 1/3] Update Readme.md

---
 Readme.md | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/Readme.md b/Readme.md
index 6933597..15745e5 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,17 +1,18 @@
 # DengAI

-Open `.ipynb` files with jupyter notebook or alternative.
-Run `Preprocess.ipynb` to preprocess source files and generate learn-ready files.
-In jupyterlab, `Run -> Run All` will do this.
-You can tweak it and make changes and try learning with the resulting files in the `generated` folder.
-`ModelSelection.ipynb` is supposed to select the best model to use via a Grid Search Cross Validation (hyperparameter optimization) per each model. But it looks like sklearn is suboptimal (or we don't know how to use it).
-`DengAI.ipynb` is supposed to contain the feature selection, learning and result generation but it has not yet been completed.
+##
+### Presentation for CS4622 (Machine Learning)

-You can use other tools to make predictions.
-Results:
-Matlab Ensemble Boosted Trees with 5-Fold Cross Validation: Error=24.9663
-Settings: Iq -> 7 100 0.09, Sj -> 7 100 0.1
+### Report for CS4622 (Machine Learning)

-Please do not push any **changes** (on master) to these files unless the changes reduce the error.
-When you are pushing a notebook, please clear all outputs. e.g.: `Edit -> Clear All Outputs`.
+### Report for CS4642 (Data Mining and Information Retrieval)
+
+## Directory contents
++ The `.` root directory contains the data files downloaded from _drivendata_ and some milestone submissions.
++ `deprecated` folder contains the first approaches to the problem with _Matlab regression learner_ and _Orange3_ (with minimal preprocessing) and the resulting `.csv` files.
++ `Neural Networks` folder contains the first approaches to the problem with deep neural networks with _Keras_ and _Tensorflow_.
++ `Negative Binominal Regression` contains the DengAI benchmark model built with _Jupyter Notebook_ and _sklearn_, _statsmodels_ etc.
++ `Interactive Python 1` contains the approaches that do general preprocessing with _Jupyter Notebook_, _pandas_, _sklearn_, _statsmodels_, _seaborn_ and uses various models for prediction.
++ `Interactive Python 2` contains a pipeline that processes the files in various stages using _Jupyter Notebook_, _pandas_, _sklearn_, _statsmodels_, _seaborn_, and _R_'s STL (time series decomposition) borrowed with the _r2py_ bridge. This pipeline does preprocessing, visualization, analysing, automatic selection of features, best model selection etc. The best working model is a time series decomposing predicter with a linear regression model.
++ `Orange` folder contains an Orange3 pipeline that tests cross-validated errors of various learners with preprocessing, feature engineering etc.

From 1f2fc6fb89f136bda45f78b6c14cc0f5160f2c0a Mon Sep 17 00:00:00 2001
From: Wickramaranga Abeygunawardhana
Date: Fri, 27 Jul 2018 11:20:11 +0530
Subject: [PATCH 2/3] Update Readme.md

---
 Readme.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Readme.md b/Readme.md
index 15745e5..e9743b9 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,12 +1,17 @@
 # DengAI

-##
+## Reports and Presentations
 ### Presentation for CS4622 (Machine Learning)

 ### Report for CS4622 (Machine Learning)

 ### Report for CS4642 (Data Mining and Information Retrieval)

+
+## Results
+Current best result: 19.3798 (MAE), Rank 89 as of July 27 - 2018.
+
+
 ## Directory contents
 + The `.` root directory contains the data files downloaded from _drivendata_ and some milestone submissions.
 + `deprecated` folder contains the first approaches to the problem with _Matlab regression learner_ and _Orange3_ (with minimal preprocessing) and the resulting `.csv` files.

From ac23c6aba2c3ce1485ddf0756bc1476c42f951c1 Mon Sep 17 00:00:00 2001
From: Wickramaranga Abeygunawardhana
Date: Fri, 27 Jul 2018 16:59:00 +0530
Subject: [PATCH 3/3] Update Readme.md

---
 Readme.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Readme.md b/Readme.md
index e9743b9..7a9db89 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,11 +1,11 @@
 # DengAI

 ## Reports and Presentations
-### Presentation for CS4622 (Machine Learning)
+### [Presentation](https://github.com/umstek/DengAI/blob/master/DengAI.pdf) for CS4622 (Machine Learning)

-### Report for CS4622 (Machine Learning)
+### [Report](https://github.com/umstek/DengAI/blob/master/Machine%20Learning%20Report%20-%20Group%2030.pdf) for CS4622 (Machine Learning)

-### Report for CS4642 (Data Mining and Information Retrieval)
+### [Report](https://github.com/umstek/DengAI/blob/master/Data%20Mining%20Report%20-%20Group%2030.pdf) for CS4642 (Data Mining and Information Retrieval)


 ## Results