eshvk / gcp-dl

Licence: MIT license
Deep Learning on GCP

Programming Languages: shell, python

gcp-dl

Quickly and easily set up a cloud machine for deep learning experimentation on GCP. The "quickly" and "easily" parts are a work in progress.

STEPS

Completely Script Based Approach

For example, to create an instance with one GPU and 4 vCPUs in us-east1-d, with a 1TB SSD boot disk, and install CUDA on it:

gcloud beta compute instances create eshvk-dl-fastai \
    --boot-disk-size=1TB --boot-disk-type=pd-ssd \
    --machine-type n1-standard-4 --zone us-east1-d \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda; then
      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      apt-get update
      apt-get install cuda -y
    fi'
  • Connect to the instance and check whether the CUDA driver has been installed by running:
nvidia-smi

You should see a table listing the driver version and the attached GPU.

NOTE: If nvidia-smi is not found yet, first check whether the startup script has finished. Run tail -f /var/log/syslog to follow its progress. The installation takes a few minutes.
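Rather than watching the log by hand, the wait can be scripted. The following is a minimal sketch meant to run on the instance. wait_for_command is a hypothetical helper name, not part of this repository, and the 600-second timeout and 10-second polling interval are arbitrary defaults.

```shell
# wait_for_command CMD [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Poll until CMD is found on PATH; return 1 if the timeout is reached first.
wait_for_command() {
  local cmd timeout interval waited
  cmd=$1
  timeout=${2:-600}
  interval=${3:-10}
  waited=0
  until command -v "$cmd" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${waited}s waiting for $cmd" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

On the instance, wait_for_command nvidia-smi && nvidia-smi would then block until the driver tools are available and print the GPU table.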

  • The instance creation and CUDA install can also be done by keeping the startup script in a file, install-gpu.sh, and passing it at creation time:
gcloud beta compute instances create eshvk-dl-fastai \
    --boot-disk-size=1TB --boot-disk-type=pd-ssd \
    --machine-type n1-standard-4 --zone us-east1-d \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata-from-file startup-script=install-gpu.sh
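install-gpu.sh is not reproduced in this README, and it has to exist in the current directory when the command above runs. If your checkout already has the file, use that. Otherwise, a minimal sketch, assuming it holds the same commands as the inline startup script shown earlier, could be created like this:

```shell
# Write install-gpu.sh to the current directory.
# Assumed contents: the same commands as the inline startup script.
cat > install-gpu.sh <<'EOF'
#!/bin/bash
echo "Checking for CUDA and installing."
# Install CUDA only if the package is not already present.
if ! dpkg-query -W cuda; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda -y
fi
EOF
```

The quoted 'EOF' delimiter keeps the shell from expanding anything inside the script while writing it.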
  • Create a secondary SSD disk and attach it to the instance.
gcloud compute disks create eshvk-dl-fastai-disk --size 10TB --type pd-ssd --zone us-east1-d

gcloud compute instances attach-disk eshvk-dl-fastai --disk  eshvk-dl-fastai-disk --zone us-east1-d

  • SSH into the machine, then format and mount the disk following the Compute Engine documentation on formatting and mounting a persistent disk.

For example:

# Here sdb is the device ID I get from lsblk
sudo mkfs.ext4 -m 0 -F -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/sdb
# Mount point
sudo mkdir -p /mnt/disks/persistent-data
# Mount disk
sudo mount -o discard,defaults /dev/sdb /mnt/disks/persistent-data
# Add an automatic mount for next time things start.
echo UUID=`sudo blkid -s UUID -o value /dev/sdb` /mnt/disks/persistent-data ext4 discard,defaults,nofail 0 2 | sudo tee -a /etc/fstab
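Running the last command twice appends a duplicate line to /etc/fstab. A hedged sketch of an idempotent variant follows. fstab_line and add_fstab_entry are hypothetical helper names, and the optional third argument exists only so the function can be tried against a scratch file first.

```shell
# Build the fstab line used above for a given UUID and mount point.
fstab_line() {
  printf 'UUID=%s %s ext4 discard,defaults,nofail 0 2\n' "$1" "$2"
}

# add_fstab_entry UUID MOUNT_POINT [FSTAB_FILE]
# Append the line to FSTAB_FILE (default /etc/fstab) unless that UUID is
# already listed there.
add_fstab_entry() {
  local uuid mount_point fstab
  uuid=$1
  mount_point=$2
  fstab=${3:-/etc/fstab}
  if grep -q "^UUID=$uuid[[:space:]]" "$fstab" 2>/dev/null; then
    echo "UUID=$uuid is already listed in $fstab"
  else
    fstab_line "$uuid" "$mount_point" >> "$fstab"
  fi
}
```

Writing to /etc/fstab requires root, so the function would need to run in a root shell, for example from a script started with sudo, as add_fstab_entry "$(blkid -s UUID -o value /dev/sdb)" /mnt/disks/persistent-data.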
  • Get a GCP service account key (to run notebooks/jobs remotely). In the GCP console, create a service account key. Once you've downloaded the key, rename it google_service_key.json and move it to the root directory of this repository.

  • Copy the script user-install.sh to the gcloud instance like so:

gcloud compute scp user-install.sh eshvk-dl-fastai:~/user-install.sh  --zone us-east1-d
  • Copy the service key google_service_key.json over similarly.
gcloud compute scp google_service_key.json eshvk-dl-fastai:~/google_service_key.json  --zone us-east1-d
  • Copy the files auth_and_start.sh and lookup_value_from_json over.
gcloud compute scp auth_and_start.sh eshvk-dl-fastai:~/auth_and_start.sh  --zone us-east1-d
gcloud compute scp lookup_value_from_json eshvk-dl-fastai:~/lookup_value_from_json  --zone us-east1-d
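The four copy commands differ only in the file name, so they can be generated from a list. This is a minimal sketch: the variable names and print_copy_commands are illustrative, and the function only prints the commands so they can be reviewed before anything is copied.

```shell
INSTANCE=eshvk-dl-fastai
ZONE=us-east1-d
FILES="user-install.sh google_service_key.json auth_and_start.sh lookup_value_from_json"

# Print one `gcloud compute scp` command per file in FILES.
print_copy_commands() {
  local f
  for f in $FILES; do
    printf 'gcloud compute scp %s %s:~/%s --zone %s\n' "$f" "$INSTANCE" "$f" "$ZONE"
  done
}

print_copy_commands
```

Once the output looks right, piping it to a shell (print_copy_commands | sh) runs the copies. gcloud compute scp should also accept several source files followed by a single destination, which would reduce this to one command.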
  • SSH in and move the files auth_and_start.sh and lookup_value_from_json to /usr/local/bin.
sudo mv auth_and_start.sh /usr/local/bin
sudo mv lookup_value_from_json /usr/local/bin
  • Run the script using ./user-install.sh.

  • Firewall forwarding rules:

# this enables jupyter to talk to the external world.
gcloud compute firewall-rules create default-allow-jupyter --allow tcp:8888  --target-tags=allow-jupyter
# Add this to your instance
gcloud compute instances add-tags eshvk-dl-fastai --tags allow-jupyter --zone us-east1-d
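The rule above sets no source range, so by default port 8888 accepts connections from any address. A hedged variation, assuming you only connect from one known public IP, is to create the rule with --source-ranges instead. to_cidr is a hypothetical helper and the address below is a placeholder.

```shell
# Hypothetical helper: express a single IPv4 address as a /32 CIDR range.
to_cidr() {
  printf '%s/32\n' "$1"
}

# Placeholder address from the documentation range; substitute your own public IP.
MY_IP=203.0.113.7

# Print the command for review; remove the leading `echo` to run it.
echo gcloud compute firewall-rules create default-allow-jupyter \
    --allow tcp:8888 --target-tags=allow-jupyter \
    --source-ranges "$(to_cidr "$MY_IP")"
```

This would be used in place of the create command above, not in addition to it, since both use the same rule name.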

  • Now SSH into the machine, run auth_and_start.sh jupyter notebook, and open something like http://<external-ip>:8888 in your browser. The auth_and_start.sh wrapper takes care of authenticating you with Google.
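If you don't have the external IP at hand, it can be looked up from your local machine. A minimal sketch, assuming the instance has a single network interface with one external access config; external_ip and jupyter_url are hypothetical helper names.

```shell
# external_ip INSTANCE ZONE
# Ask gcloud for the NAT (external) IP of the instance's first interface.
external_ip() {
  gcloud compute instances describe "$1" --zone "$2" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# jupyter_url IP
# Build the notebook URL for the port opened by the firewall rule (8888).
jupyter_url() {
  printf 'http://%s:8888\n' "$1"
}
```

For example, jupyter_url "$(external_ip eshvk-dl-fastai us-east1-d)" would print the address to open in the browser.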

Credits

This is based on fast.ai's course setup and easy-python-ml by ZacP.
