All Projects → Comcast → Kube Yarn

Comcast / Kube Yarn

Licence: apache-2.0
Running YARN on Kubernetes with PetSet controller.

Labels

Projects that are alternatives of or similar to Kube Yarn

Makepytorchplusplus
How and why you want to make your pytorch CUDA/CPP extension with a Makefile
Stars: ✭ 142 (-5.33%)
Mutual labels:  makefile
Proprietary vendor lge
Stars: ✭ 143 (-4.67%)
Mutual labels:  makefile
Ansibleplaybook Rancher
Ansible playbook to keep a Rancher container running on a docker host
Stars: ✭ 146 (-2.67%)
Mutual labels:  makefile
Ansible Cloud Hosting
Set of scripts to quickly build private cloud hosting.
Stars: ✭ 141 (-6%)
Mutual labels:  makefile
Rbcurse
ruby based curses widgets: fields, buttons, textarea. menus, message boxes, tabbed panes, tables, listboxes, splitpanes, scrollpanes. Event based, MVC architecture. (DEPRECATED - PLS USE rbcurse-core)
Stars: ✭ 142 (-5.33%)
Mutual labels:  makefile
Proprietary vendor motorola
Stars: ✭ 143 (-4.67%)
Mutual labels:  makefile
Letsgo
Golang 小视频教程,零散时间学习 Golang
Stars: ✭ 139 (-7.33%)
Mutual labels:  makefile
Real World Kubernetes
disaster recovery, security, and high-availability setups for kubernetes tutorials
Stars: ✭ 149 (-0.67%)
Mutual labels:  makefile
Gb
A minimal C implementation of Nintendo Gameboy - An fast research environment for Reinforcement Learning
Stars: ✭ 143 (-4.67%)
Mutual labels:  makefile
Keyboard Pcbs
Keyboard PCBs
Stars: ✭ 146 (-2.67%)
Mutual labels:  makefile
Reggae
Build system in D, Python, Ruby, Javascript or Lua
Stars: ✭ 141 (-6%)
Mutual labels:  makefile
Sphinxtr
The Sphinx Thesis Resource
Stars: ✭ 142 (-5.33%)
Mutual labels:  makefile
Smarchglasses public
Stars: ✭ 143 (-4.67%)
Mutual labels:  makefile
K8s Vagrant Multi Node
A Kubernetes Vagrant Multi node environment using kubeadm.
Stars: ✭ 141 (-6%)
Mutual labels:  makefile
Securekubernetes
Attacking and Defending Kubernetes Clusters: A Guided Tour
Stars: ✭ 146 (-2.67%)
Mutual labels:  makefile
Developer
developer roadmap. use growth https://github.com/phodal/growth replace this
Stars: ✭ 1,699 (+1032.67%)
Mutual labels:  makefile
Terraform Makefile
Helps me actually use terraform for multiple environments
Stars: ✭ 143 (-4.67%)
Mutual labels:  makefile
Sd Maskrcnn
Code for SD Mask R-CNN Project
Stars: ✭ 150 (+0%)
Mutual labels:  makefile
Pokemon Svg
SVGs for Pokemon
Stars: ✭ 147 (-2%)
Mutual labels:  makefile
Android Libjpeg Turbo
Standalone Android.mk configuration for libjpeg-turbo.
Stars: ✭ 145 (-3.33%)
Mutual labels:  makefile

YARN on Kubernetes

Tags and Dockerfiles

Docker image for kube-yarn

Image for deploying the kube-yarn artifacts without any local dependencies.

Example usage:

Invoke the default make target to deploy the full stack using the kubeconfig from your current home directory.

docker run -it --rm -v ${HOME}/.kube/config:/root/.kube/config:ro danisla/kube-yarn:latest

Make sure to mount any additional volumes for files referenced within the kube config.

To remove the resources by invoking the clean target:

docker run -it --rm -v ${HOME}/.kube/config:/root/.kube/config:ro danisla/kube-yarn:latest clean

Architecture Diagram

Architecture

StatefulSet Overview

The hadoop components are boostrapped using files from a ConfigMap to provide the init script and config xml files. This allows users to fully customize their distribution for their use cases.

The ACP env var can also be set to a comma-separated list of URLs to be downloaded and added to the classpath at runtime.

Logs from the ${HADOOP_PREFIX}/logs directory are tailed so that they can be viewed by attaching to the container.

hdfs-nn - HDFS Name Node

The namenode daemon runs in this pod container.

Currently, only 1 namenode is supported (no HA w/Zookeeper or secondary namenode).

hdfs-dn - HDFS Data Node

The datanode daemon runs in this pod container.

There can be 1 or more of these, scaled by changing the number of replicas in the spec.

yarn-rm - YARN Resource Manager

The resource manager daemon runs in this pod container.

Currently, only 1 resource manager is supported (no HA w/Zookeeper).

The WebUI can be accessed using the service port 8088 or at localhost:8088 after running make pf.

yarn-nm - YARN Node Manager

The node manager daemon runs in this pod container.

There can be 1 or more of these, scaled by changing the number of replicas in the pod spec.

The amount of vcores and memory registered with the resource manager is reflected and infered from the pod spec resources using the Downward API. These are made available to the container via the env vars MY_CPU_LIMIT and MY_MEM_LIMIT respectively and then added to the yarn-site.xml at runtime using the bootstrap script.

zeppelin - Zeppelin Notebook

The Zeppelin notebook is run in this pod container and can be used to run Spark jobs or hadoop jobs using the %sh shell interpreter.

The K8S yarn cluster config is mounted from the same ConfigMap over the top of the Zeppelin image dir to give it access to the cluster without modifying the base image.

The Zeppelin web app can be accessed using the service port 8080 or at localhost:8081 after running make pf.

Running locally

This repo uses minikube to start a k8s cluster locally.

The Makefile contains targets for starting the cluster and helper targets for kubectl to apply the K8S manifests and interact with the pods.

When running locally with minikube, make sure your VM has enough resources, you should set the number of cpus to 8 and memory to 8192. If not set, the pods won't have enough resources to fully start and will be stuck in the Pending creation phase.

Starting minikube manually:

minikube start --cpus 8 --memory 8192

Or, start with the Makefile which will download minikube and start the cluster:

make minikube

Start the YARN cluster:

make

This will create all of the components for the cluster.

Run this to create port forwards to localhost:

make pf

You should now be able to access the following:

  • YARN WebUI: http://localhost:8088
  • Zeppelin: http://localhost:8081

Full stack test

Test hdfs, yarn and mapred using TestDFSIO, submitted from one of the node managers:

make test

Which runs this command on yarn-nm-0:

/usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-tests.jar TestDFSIO -write -nrFiles 5 -fileSize 128MB -resFile /tmp/TestDFSIOwrite.txt

Spark on YARN in Zeppelin

In your browser, go to Zeppelin at: http://localhost:8081

Create a new note and run this in a paragraph:

sc.parallelize(1 to 1000).count

Press shift-enter to execute the paragraph

The first command executed creates the spark job on yarn and will take a few seconds, then you should get the result 1000 when complete.

Make targets:

init

Create the namespace, configmaps service account and hosts-disco service.

create-apps

Creates hdfs, yarn and zeppelin apps.

pf

Creates local port forwards for yarn and zeppelin.

dfsreport

Gets the state of HDFS.

get-yarn-nodes

Lists the registered yarn nodes by executing this in the node manager pod: yarn node -list

shell-hdfs-nn-0

Drops into a shell on the namenode

shell-yarn-rm-0

Drops into a shell on the resource manager.

shell-zeppelin-0

Drops into a shell on the zeppelin container.

Shutting down

make clean

Shutdown the cluster

minikube stop

Or:

make stop-minikube
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].