graykode / aws-kubeflow

Licence: other
A guideline for basic use and installation of kubeflow in AWS.

AWS-Kubeflow

AWS-Kubeflow is a guideline for basic use and installation of kubeflow in AWS.

What is kubeflow?

Kubeflow is a cloud-native platform for machine learning based on Google’s internal machine learning pipelines. It lets you get your ML workflow running quickly.

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.

Architecture

Requirements for Kubeflow

  • eksctl : A simple CLI tool for creating clusters on EKS, Amazon's managed Kubernetes service for EC2. It is written in Go and uses CloudFormation.
  • kubectl : The Kubernetes command-line tool.
  • aws-cli : AWS Command Line Interface.
  • aws-iam-authenticator
  • ksonnet : A CLI-supported framework that streamlines writing and deployment of Kubernetes configurations to multiple clusters.
  • jq : jq is a lightweight and flexible command-line JSON processor.
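
Once the tools are installed (the steps follow in the next section), a quick check like the one below can confirm that each of them is on the PATH. This is only a minimal sketch: `missing_tools` is a hypothetical helper name, and the binary names (`aws` for aws-cli, `ks` for ksonnet) are the ones invoked by the commands later in this guide.

```shell
# Print the name of every tool from the argument list that is not on PATH.
# `missing_tools` is a hypothetical helper, not part of any of these CLIs.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Binary names as invoked later in this guide; no output means all are found.
missing_tools eksctl kubectl aws aws-iam-authenticator ks jq
```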

Install Kubeflow(v0.5.0)

Start with an Ubuntu 16.04 EC2 instance to act as the Kubernetes controller. It should be at least a c4.xlarge (7.5 GB memory) with 20 GB or more of storage. For testing, open all inbound TCP ports.

I recommend an EC2 instance rather than a Docker container, because it makes tunneling to the dashboard easier.

Connect to your EC2.

  1. Install requirements
$ sudo su

$ apt update && \
  apt install python python-pip curl groff vim jq gzip git -y
  
# install kubectl
$ curl -o kubectl https://amazon-eks.s3-us-west-2.amazonaws.com/1.11.5/2018-12-06/bin/linux/amd64/kubectl && \
  chmod +x kubectl && \
  mv kubectl /usr/bin/

# kubectl version check
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.5", GitCommit:"753b2dbc622f5cc417845f0ff8a77f539a4213ea", GitTreeState:"clean", BuildDate:"2018-12-06T01:33:57Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}


# install aws-iam-authenticator
$ curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.11.5/2018-12-06/bin/linux/amd64/aws-iam-authenticator && \
  chmod +x aws-iam-authenticator && \
  mv aws-iam-authenticator /usr/bin/
  
  
# install awscli
$ pip install awscli --upgrade

# awscli version check
$ aws --version
aws-cli/1.16.169 Python/2.7.12 Linux/4.4.0-1083-aws botocore/1.12.159


# install eksctl
$ curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/latest_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp && \
  mv /tmp/eksctl /usr/local/bin
  
# eksctl version check
$ eksctl version
[ℹ]  version.Info{BuiltAt:"", GitCommit:"", GitTag:"0.1.33"}
  1. Register the AWS IAM keys as environment variables
$ export AWS_ACCESS_KEY_ID=<KEY>
$ export AWS_SECRET_ACCESS_KEY=<KEY>
  1. Create an EKS (Elastic Kubernetes Service) cluster using eksctl
# create cluster
$ eksctl create cluster eks-cpu \
--node-type=c4.xlarge \
--timeout=40m \
--nodes=2 \
--region=ap-northeast-2
  • Nodes should be c4.xlarge or larger.
  • --node-type, --region, --nodes : select the node type, the region, and the number of nodes.
  • Cluster creation takes a long time, so have a coffee.
  • eksctl sets up the availability zones and subnets, and creates the nodegroup with EC2 instances, the Auto Scaling Group, the EKS cluster, and so on.
  1. When the EKS cluster is ready, check the nodes using the following command:
$ kubectl get nodes "-o=custom-columns=NAME:.metadata.name,MEMORY:.status.allocatable.memory,CPU:.status.allocatable.cpu,GPU:.status.allocatable.nvidia\.com/gpu"
NAME                                                MEMORY      CPU       GPU
ip-192-168-12-60.ap-northeast-2.compute.internal    7548168Ki   4         <none>
ip-192-168-55-153.ap-northeast-2.compute.internal   7548172Ki   4         <none>
  1. (Optional) If you use GPU instances, install the NVIDIA device plugin
$ kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.11/nvidia-device-plugin.yml
  1. Install ksonnet
# install ksonnet
$ wget https://github.com/ksonnet/ksonnet/releases/download/v0.13.1/ks_0.13.1_linux_amd64.tar.gz && \
   tar -xvf ks_0.13.1_linux_amd64.tar.gz && \
   mv ks_0.13.1_linux_amd64/ks /usr/local/bin

# ksonnet version check
# ksonnet development has ended on GitHub; the latest version is 0.13.1
$ ks version
ksonnet version: 0.13.1
jsonnet version: v0.11.2
client-go version: kubernetes-1.10.4
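
With the cluster created and the tools installed, it can help to confirm that every worker node reports Ready before installing Kubeflow. The sketch below counts Ready nodes in the default `kubectl get nodes` table. The helper name `count_ready_nodes` is hypothetical, and it assumes the STATUS value is the second column, as in the default output format.

```shell
# Count nodes whose STATUS column (2nd field) is exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
# `count_ready_nodes` is a hypothetical helper name.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (the cluster above was created with --nodes=2):
#   kubectl get nodes --no-headers | count_ready_nodes
```

A node shown as `Ready,SchedulingDisabled` is deliberately not counted, since the comparison is an exact match.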

Install Kubeflow

  1. Run the following commands to download the latest kfctl.sh
$ export KUBEFLOW_SRC=/tmp/kubeflow-aws
$ export KUBEFLOW_VERSION=v0.5-branch

$ mkdir -p ${KUBEFLOW_SRC} && cd ${KUBEFLOW_SRC}
$ curl https://raw.githubusercontent.com/graykode/aws-kubeflow/master/kubeflow.sh | bash

$ curl -O https://raw.githubusercontent.com/graykode/aws-kubeflow/master/util.sh && \
   mv util.sh ${KUBEFLOW_SRC}/scripts/aws/util.sh
  1. Follow the "Initial cluster setup for existing cluster" document.
$ export KFAPP=kfapp
$ export REGION=ap-northeast-2
$ export AWS_CLUSTER_NAME=eks-cpu

# check your nodegroup role name
$ aws iam list-roles \
    | jq -r ".Roles[] \
    | select(.RoleName \
    | startswith(\"eksctl-$AWS_CLUSTER_NAME\") and contains(\"NodeInstanceRole\")) \
    .RoleName"
    
eksctl-eks-cpu-nodegroup-ng-11598-NodeInstanceRole-S6OPLB7TW3RR

$ export AWS_NODEGROUP_ROLE_NAMES=eksctl-eks-cpu-nodegroup-ng-11598-NodeInstanceRole-S6OPLB7TW3RR
  1. Initialize the app with kfctl.sh init
$ cd ${KUBEFLOW_SRC}
$ ${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform aws \
--awsClusterName ${AWS_CLUSTER_NAME} \
--awsRegion ${REGION} \
--awsNodegroupRoleNames ${AWS_NODEGROUP_ROLE_NAMES}

$ ls
deployment  kfapp  kubeflow  scripts
  1. Generate and apply the Kubernetes changes.
$ cd ${KFAPP}

# Generate the Kubernetes changes.
$ ${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s

# Deploy the generated Kubernetes changes.
$ ${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s
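
The nodegroup role lookup in the steps above requires copying the role name by hand. As a sketch, the same jq filter can be wrapped in a function that prints the matching names so they can be exported directly. The helper name `nodegroup_roles` is hypothetical, and it assumes the role names follow the `eksctl-<cluster>...NodeInstanceRole` pattern shown above and that the cluster has a single nodegroup.

```shell
# Print IAM role names that start with "eksctl-<cluster>" and contain
# "NodeInstanceRole", one per line. Reads `aws iam list-roles` JSON on stdin.
# `nodegroup_roles` is a hypothetical helper name; $1 is the cluster name.
nodegroup_roles() {
  jq -r --arg prefix "eksctl-$1" \
    '.Roles[].RoleName | select(startswith($prefix) and contains("NodeInstanceRole"))'
}

# Usage (assumes exactly one nodegroup, so exactly one line is printed):
#   export AWS_NODEGROUP_ROLE_NAMES="$(aws iam list-roles | nodegroup_roles "$AWS_CLUSTER_NAME")"
```

Passing the prefix with `--arg` avoids the escaped quotes needed when the shell variable is interpolated into the jq program.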

The Kubeflow installation is finished! 😍

Check the pods in the kubeflow namespace and wait until all of them are Running.

$ kubectl get pods -n kubeflow
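
Instead of re-running the command by hand, the wait can be scripted. The following is a rough sketch, not part of kfctl.sh: the helper names `count_not_ready` and `wait_for_kubeflow` are hypothetical, the 10-second polling interval is arbitrary, and treating `Completed` pods as finished is an assumption.

```shell
# Count pods whose STATUS column (3rd field) is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output on stdin.
count_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Poll the kubeflow namespace until no pod is pending, or return 1 after
# $1 seconds (default 600). Note: an empty pod list also counts as "done".
wait_for_kubeflow() {
  kf_timeout="${1:-600}"
  kf_waited=0
  while [ "$(kubectl get pods -n kubeflow --no-headers | count_not_ready)" -ne 0 ]; do
    [ "$kf_waited" -ge "$kf_timeout" ] && return 1
    sleep 10
    kf_waited=$((kf_waited + 10))
  done
}

# Usage:
#   wait_for_kubeflow 900 && echo "all pods are up"
```
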
Tip: To delete Kubeflow using kfctl.sh

$ ${KUBEFLOW_SRC}/scripts/kfctl.sh delete k8s

Tip: Re-connecting to EKS

If you need to re-connect to EKS (for example, after reconnecting your SSH terminal), follow these steps.

$ sudo su
$ cd /tmp

$ export AWS_ACCESS_KEY_ID=<KEY>
$ export AWS_SECRET_ACCESS_KEY=<KEY>
$ aws eks --region ap-northeast-2 update-kubeconfig --name eks-cpu

# check kubernetes cluster
$ kubectl get nodes
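
A new shell does not inherit the variables exported in an earlier session, so the `aws eks` call may fail if the keys were not exported again (unless credentials are configured some other way). A small sketch of a check, using a hypothetical helper named `missing_env`:

```shell
# Print the name of every listed environment variable that is unset or empty.
# `missing_env` is a hypothetical helper name.
missing_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Usage: no output means both keys are set in the current shell.
#   missing_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
```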

Start Kubeflow DashBoard

$ kubectl port-forward -n kubeflow `kubectl get pods -n kubeflow --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}'` 8080:80

# !! Open an SSH tunnel from another terminal
$ ssh -i your_key.pem ubuntu@server-ip -L 8080:localhost:8080

Open http://127.0.0.1:8080 in your browser.

Tip: (Optional) Start the Kubernetes Dashboard

# Deploy Kubernetes DashBoard
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

# Deploy Heapster to monitor the container cluster and enable performance analysis of the cluster.
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml

# Deploy an InfluxDB backend to the cluster for Heapster
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml

# Create Heapster Cluster Role Bindings for Dashboards
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml

# Create eks-admin service account and cluster role binding
$ kubectl apply -f https://raw.githubusercontent.com/graykode/aws-kubeflow/master/eks-admin-service-account.yaml
# Get the login token for the Dashboard
$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')

Copy the token string from the output; you will need it to log in to the Kubernetes Dashboard.
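
The `describe secret` output contains several other fields besides the token. As a sketch, a small filter can print only the token value. The helper name `extract_token` is hypothetical, and it assumes the token appears on a line of the form `token:      <value>`.

```shell
# Print only the token value from `kubectl describe secret` output on stdin.
# Assumes the token is on a line whose first field is "token:".
# `extract_token` is a hypothetical helper name.
extract_token() {
  awk '$1 == "token:" { print $2 }'
}

# Usage:
#   kubectl -n kube-system describe secret \
#     "$(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')" \
#     | extract_token
```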

# start Dashboard
$ kubectl proxy

# !! Open an SSH tunnel from another terminal
$ ssh -i your_key.pem ubuntu@server-ip -L 8001:localhost:8001

Open http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login in your browser.

Run an Example from kubeflow/examples

github_issue_summarization

1. NoteBook

You can use Kubeflow much like Google Colaboratory: machine learning engineers do not need to know the cloud infrastructure; they only need to use a Jupyter notebook.

  1. Open the Notebooks tab.

  1. New Server: name it test, then connect to the Jupyter Notebook.

Check with kubectl get pods -n kubeflow:

NAME                                                        READY     STATUS    RESTARTS   AGE
..
test-0                                                      1/1       Running   0          15m
..
  1. New > Terminal
$ git clone https://github.com/kubeflow/examples

# install pip packages
$ pip install pandas scikit-learn ktext matplotlib annoy nltk pydot

$ wget https://raw.githubusercontent.com/graykode/aws-kubeflow/master/Training.ipynb && \
   mv Training.ipynb examples/github_issue_summarization/notebooks
  1. Run Training.ipynb

2. Training the model using TFJob

  • TODO

3. Distributed Training using estimator and TFJob

  • TODO

Pipeline-dashboard

  1. Use [Sample] ML - TFX - Taxi Tip Prediction Model Trainer

  1. Set the parameters and run.

TODO

I will add more examples after getting used to Kubeflow! 🔨🔨

Don't forget to delete the EKS cluster after use!

$ eksctl delete cluster --name eks-cpu --region ap-northeast-2

Author

Reference
