Kubeflow

There are a few ways you can leverage DLF to assist you while working with Kubeflow.

Requirements

You need permissions to install DLF and a namespace you can use for Kubeflow (to create TFJobs, launch Jupyter notebooks, etc.). Let's assume the namespace you can use is {my-namespace}; feel free to change it accordingly.
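
If that namespace does not exist yet, you can create it first (assuming kubectl is already configured against your cluster):

kubectl create namespace {my-namespace}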

Installation using kubectl

git clone https://github.com/IBM/dataset-lifecycle-framework.git
cd dataset-lifecycle-framework
git checkout fixed-caching #TODO remove when branch merged
make DATASET_OPERATOR_NAMESPACE={my-namespace} NAMESPACES_TO_MONITOR={my-namespace} deployment

If everything worked well, you should see something like this when listing the pods in {my-namespace}:
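
For example, assuming kubectl is already pointed at your cluster:

kubectl get pods -n {my-namespace}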

NAME                               READY   STATUS              RESTARTS   AGE
csi-attacher-nfsplugin-0           2/2     Running             0          78s
csi-attacher-s3-0                  1/1     Running             0          78s
csi-nodeplugin-nfsplugin-j4ljv     2/2     Running             0          78s
csi-provisioner-s3-0               1/1     Running             0          78s
csi-s3-2gwcs                       2/2     Running             0          79s
dataset-operator-76f795587-cljfm   1/1     Running             0          77s
generate-keys-q8s99                0/1     Completed           0          67s

We will loosely follow the example posted in mnist_vanilla_k8s.ipynb.

Build model container

There is a delta between the existing distributed MNIST examples and what is needed to run well as a TFJob. We will skip the Kaniko part and just build and use the Dockerfile and model.py in examples/kubeflow:

cd examples/kubeflow
docker build -t {MY-REGISTRY}/mnist-model -f Dockerfile.model .
docker push {MY-REGISTRY}/mnist-model

In case you use an authenticated registry, follow the instructions in configure-docker-credentials.
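
As a rough sketch only (the secret name regcred and the credential placeholders below are made up for illustration; the linked instructions cover the actual Kubeflow setup), creating an image pull secret usually looks something like this:

kubectl create secret docker-registry regcred \
  --docker-server={MY-REGISTRY} \
  --docker-username=<your-username> \
  --docker-password=<your-password> \
  -n {my-namespace}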

Create an S3 bucket and a Dataset

If you have an existing S3 bucket you can use, please proceed with that one. Otherwise, follow the instructions in Configure IBM COS Storage.
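
Purely for illustration, and assuming you already have the AWS CLI configured with HMAC credentials for your COS instance (the linked page walks through the IBM Cloud console instead), creating a bucket could look like:

aws s3 mb s3://YOUR_BUCKET --endpoint-url https://YOUR_ENDPOINT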

Now we need to create a Dataset pointing to the newly created bucket. Create a file named my-dataset.yaml that looks like this:

apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: Dataset
metadata:
  name: your-dataset
spec:
  local:
    type: "COS"
    accessKeyID: "access_key_id"
    secretAccessKey: "secret_access_key"
    endpoint: "https://YOUR_ENDPOINT"
    bucket: "YOUR_BUCKET"
    region: "" #it can be empty

Now just execute:

kubectl create -f my-dataset.yaml -n {my-namespace}
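
To check that it worked, you can list the Dataset objects in the namespace; according to the DLF documentation, a PVC with the same name as the Dataset should also appear once it is ready (treat this as an expectation from the project docs rather than output captured here):

kubectl get datasets -n {my-namespace}
kubectl get pvc -n {my-namespace}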