This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

Prepare sc for py transform (#35)
* enable notebook on bigdl_0.3.0 and spark_2.2

* changes according to code changes

* remove changes on start_notebook.sh

* 0.3.0 notebook work

* Fix docs

* unsaved changes

* change README address for 0.3.0 download

* change README

* prepare sc for jenkins

* fix matplotlib import

* declare utf-8 coding

* more declares

* spark conf to avoid OOM

* change to directly use jupyter

* specify version 0.3.0, delete spark install in Setup.md since pyspark will be installed with pip

* setup changes

* setup changes
chengxuHawkwood authored and yiheng committed Nov 9, 2017
1 parent 720acf3 commit 8478159
Showing 13 changed files with 178 additions and 110 deletions.
6 changes: 2 additions & 4 deletions README.md
@@ -27,10 +27,8 @@ Step-by-step Deep Leaning Tutorials on Apache Spark using [BigDL](https://github
+ [Setup env on Mac OS](https://github.com/intel-analytics/BigDL-Tutorials/blob/master/SetupMac.md) / [Setup env on Linux](https://github.com/intel-analytics/BigDL-Tutorials/blob/master/SetupLinux.md)

### Start Jupyter Server
-* Download BigDL 0.3.0([linux or mac](https://repo1.maven.org/maven2/com/intel/analytics/bigdl/dist-spark-2.2.0-scala-2.11.8-linux64/0.3.0/dist-spark-2.2.0-scala-2.11.8-linux64-0.3.0-dist.zip )) and unzip file.
-* Run ```export BIGDL_HOME=where is your unzipped bigdl folder```
-* Run ```export SPARK_HOME=where is your unpacked spark folder```
-* Run ```./start_notebook.sh```
+* Run ```pip install BigDL==0.3.0```
+* Run ``` jupyter notebook --notebook-dir=./ --ip=* --no-browser```
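
After the `pip install` step, it can be useful to confirm which BigDL version is visible to the interpreter that will run Jupyter. The following is a minimal sketch, assuming a Python 3.8+ interpreter (for `importlib.metadata`) and that the pip distribution is named `BigDL` as in the command above. `installed_version` is a hypothetical helper, not part of BigDL; the tutorials themselves targeted Python 2.7, where `pkg_resources` would be the rough equivalent.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(dist_name):
    """Return the installed version string of a pip distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "BigDL" and "0.3.0" come from the README change in this commit.
    version = installed_version("BigDL")
    if version is None:
        print("BigDL is not installed in this environment")
    elif version != "0.3.0":
        print("Found BigDL %s, but this commit pins 0.3.0" % version)
    else:
        print("BigDL 0.3.0 is installed")
```

Running the check from the same environment that launches `jupyter notebook` avoids confusion when several Python installations coexist.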

## Run Demo
* Open a browser - Suggest Chrome or Firefox or Safari
8 changes: 6 additions & 2 deletions SetupLinux.md
@@ -6,9 +6,13 @@ This guide is mainly for Ubuntu. If you has other linux platform, please do the

### Installation Steps

-* Install Java and Spark
+* Install Java
 * Install Jdk 8 from http://www.oracle.com/technetwork/java/javase/downloads/index-jsp-138363.html#javasejdk
-* Install Spark 2.1.0 from http://spark.apache.org/downloads.html
+* Run the following steps
+```
+export JAVA_HOME=where you unzip your jdk
+export PATH=$PATH:$JAVA_HOME/bin
+```
* Install Python dev env. Python2.7 is shipped with linux.
```
sudo apt-get update
7 changes: 5 additions & 2 deletions SetupMac.md
@@ -7,9 +7,12 @@
```
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
```
-* Install Java and Spark
+* Install Java
 * Install Java on OSX following the guide https://java.com/en/download/help/mac_install.xml
-* Install Spark on OSX http://spark.apache.org/downloads.html
+```
+export JAVA_HOME=where you unzip your jdk
+export PATH=$PATH:$JAVA_HOME/bin
+```
* (Optional) (Mac) Install Python. Python is shipped with MacOS, but you may want to install updates using Homebrew
```
brew install python
3 changes: 2 additions & 1 deletion ipynb2py.sh
@@ -6,7 +6,6 @@
# Example:
# ipynb2py notebooks/neural_networks/rnn
#########################################

if [ $# -ne "1" ]; then
echo "Usage: ./nb2script <file-name without extension>"
else
@@ -17,5 +16,7 @@ else
jupyter nbconvert --to script $1.tmp.ipynb

mv $1.tmp.py $1.py
sed -i '1i# -*- coding: utf-8 -*-' $1.py
sed -i '1i#!/usr/bin/python' $1.py
rm $1.tmp.ipynb
fi
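
The two `sed` lines added to `ipynb2py.sh` appear intended to put a shebang and a UTF-8 coding declaration at the top of the converted script. Python only honors a coding declaration on the first or second line of a file (PEP 263), so the shebang belongs on line 1 and the declaration on line 2. A minimal Python sketch of the same step, assuming the generated file is UTF-8 text; `prepend_header` is a hypothetical helper and not part of this repository:

```python
import os
import tempfile

SHEBANG = "#!/usr/bin/python"
CODING = "# -*- coding: utf-8 -*-"


def prepend_header(path):
    """Rewrite the file so line 1 is the shebang and line 2 the coding declaration."""
    with open(path, "r", encoding="utf-8") as f:
        body = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(SHEBANG + "\n" + CODING + "\n" + body)


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real converted notebook.
    fd, demo_path = tempfile.mkstemp(suffix=".py")
    os.close(fd)
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("print('converted notebook body')\n")
    prepend_header(demo_path)
    with open(demo_path, encoding="utf-8") as f:
        print(f.read())
    os.remove(demo_path)
```

Doing this in Python instead of `sed -i` also sidesteps the difference between GNU and BSD `sed` in how `-i` and the `i` command are parsed, which may matter on macOS.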
25 changes: 16 additions & 9 deletions notebooks/neural_networks/autoencoder.ipynb

Large diffs are not rendered by default.

41 changes: 24 additions & 17 deletions notebooks/neural_networks/birnn.ipynb

Large diffs are not rendered by default.

39 changes: 23 additions & 16 deletions notebooks/neural_networks/cnn.ipynb

Large diffs are not rendered by default.

31 changes: 19 additions & 12 deletions notebooks/neural_networks/deep_feed_forward_neural_network.ipynb

Large diffs are not rendered by default.

14 changes: 10 additions & 4 deletions notebooks/neural_networks/introduction_to_mnist.ipynb
@@ -30,17 +30,23 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Populating the interactive namespace from numpy and matplotlib\n"
"Populating the interactive namespace from numpy and matplotlib\n",
"Prepending /usr/local/lib/python2.7/dist-packages/bigdl/share/conf/spark-bigdl.conf to sys.path\n"
]
}
],
"source": [
"import matplotlib\n",
"matplotlib.use('Agg')\n",
"# As always, a bit of setup\n",
"%pylab inline\n",
"import pandas\n",
"from bigdl.dataset import mnist\n",
"from bigdl.util.common import *\n",
"\n",
"import matplotlib.pyplot as plt\n",
"from pyspark import SparkContext\n",
"from matplotlib.pyplot import imshow\n",
"sc=SparkContext.getOrCreate(conf=create_spark_conf().setMaster(\"local[4]\").set(\"spark.driver.memory\",\"2g\"))\n",
"init_engine()"
]
},
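
The setup cell in this hunk calls `matplotlib.use('Agg')` before `%pylab inline` and before importing `matplotlib.pyplot`. In matplotlib releases of that period, the backend generally had to be selected before pyplot was first imported, which presumably matters once the notebook is converted to a plain script and run without a display. As a rough illustration, the sketch below checks that ordering for a list of cell source lines. `backend_selected_first` is a hypothetical helper, not part of matplotlib or BigDL, and it relies on simple string matching only.

```python
def backend_selected_first(source_lines):
    """Return True if matplotlib.use(...) appears before any pyplot/pylab line.

    Returns False if pyplot or %pylab comes first, or if matplotlib.use(...)
    never appears at all.
    """
    for line in source_lines:
        stripped = line.strip()
        if stripped.startswith("matplotlib.use("):
            return True
        if "pyplot" in stripped or stripped.startswith("%pylab"):
            return False
    return False


if __name__ == "__main__":
    # Source lines copied from the updated setup cell in this commit.
    cell = [
        "import matplotlib\n",
        "matplotlib.use('Agg')\n",
        "%pylab inline\n",
        "import matplotlib.pyplot as plt\n",
    ]
    print(backend_selected_first(cell))
```

A check like this could be applied to the first code cell of each notebook before conversion, although a real linter would need to parse the code instead of matching prefixes.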
@@ -106,15 +112,15 @@
"data": {
"image/png": "<base64-encoded PNG data omitted>",
"text/plain": [
"<matplotlib.figure.Figure at 0x7fd41f480b50>"
"<matplotlib.figure.Figure at 0x7f7294f10c90>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"imshow(np.column_stack(train_images[0:10].reshape(10, 28,28)),cmap='gray'); axis('off')\n",
"imshow(np.column_stack(train_images[0:10].reshape(10, 28,28)),cmap='gray'); plt.axis('off')\n",
"print \"groud true labels: \"\n",
"print train_labels[0:10]"
]
21 changes: 14 additions & 7 deletions notebooks/neural_networks/linear_regression.ipynb
@@ -23,11 +23,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Populating the interactive namespace from numpy and matplotlib\n"
"Populating the interactive namespace from numpy and matplotlib\n",
"Prepending /usr/local/lib/python2.7/dist-packages/bigdl/share/conf/spark-bigdl.conf to sys.path\n"
]
}
],
"source": [
"import matplotlib\n",
"matplotlib.use('Agg')\n",
"%pylab inline\n",
"import pandas\n",
"import datetime as dt\n",
@@ -37,7 +40,11 @@
"from bigdl.optim.optimizer import *\n",
"from bigdl.util.common import *\n",
"from bigdl.util.common import Sample\n",
"import matplotlib.pyplot as plt\n",
"from bigdl.dataset.transformer import *\n",
"from matplotlib.pyplot import imshow\n",
"from pyspark import SparkContext\n",
"sc=SparkContext.getOrCreate(conf=create_spark_conf().setMaster(\"local[4]\").set(\"spark.driver.memory\",\"2g\"))\n",
"\n",
"init_engine()"
]
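
The `sc=SparkContext.getOrCreate(...)` line in this hunk pins the master to `local[4]` and sets `spark.driver.memory` to `2g`, in line with the "spark conf to avoid OOM" item in the commit message. A minimal sketch of the same two settings with a plain `SparkConf` follows. It assumes pyspark is installed and deliberately leaves out BigDL's `create_spark_conf()`, so it is not a drop-in replacement for the notebook cell. `local_spark_settings` and `make_context` are hypothetical names.

```python
def local_spark_settings(cores=4, driver_memory="2g"):
    """Build the Spark properties used by the notebook as a plain dict."""
    return {
        "spark.master": "local[%d]" % cores,
        "spark.driver.memory": driver_memory,
    }


def make_context(settings):
    """Create (or reuse) a SparkContext from a dict of Spark properties.

    pyspark is imported lazily so the settings helper works without it.
    """
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAll(list(settings.items()))
    return SparkContext.getOrCreate(conf=conf)


if __name__ == "__main__":
    # Only prints the settings; call make_context(...) where pyspark is available.
    print(local_spark_settings())
```

Keeping the properties in a dict makes it easy to raise the driver memory for larger datasets without editing every notebook cell.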
@@ -166,15 +173,15 @@
"text": [
"predict predict: \n",
"\n",
"[ 1.48450136]\n",
"[ 2.33639312]\n",
"\n",
"[ 3.27808809]\n",
"[ 2.0999496]\n",
"\n",
"[ 1.87193513]\n",
"[ 1.83889556]\n",
"\n",
"[ 3.40717745]\n",
"[ 1.95105672]\n",
"\n",
"[ 4.11920691]\n",
"[ 2.58288598]\n",
"\n"
]
}
@@ -205,7 +212,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"8.19934\n"
"8.20632\n"
]
}
],