vithursant / Terraform Aws Spotgpu

Licence: apache-2.0
Fully automated provisioning of AWS EC2 Spot Instances for Deep Learning workloads using Terraform.

Provisioning AWS Spot Instances for Deep Learning using Terraform

Terraform is an open-source tool developed by HashiCorp that allows you to codify your infrastructure. This means you can write configuration files instead of clicking around in the console of AWS or any other cloud provider. The files are plain text, so they can be shared, versioned, and peer-reviewed just like any other code. In short, Terraform helps you achieve Infrastructure as Code (IaC). The orchestration space is still very young, but I think Terraform is the standout option.

This repository contains a Terraform module for provisioning EC2 Spot Instances on AWS, specifically for Deep Learning workloads on Amazon's GPU instances, taking advantage of automation and friendly declarative configuration.

Development and testing were done on macOS High Sierra (version 10.13.3).

Requirements

Note: On macOS, Terraform and aws-cli can be installed with brew install.

Configuration

AWS Key Pair

Create an AWS Key Pair before you start. The key needs to be created in the AWS region you are working in, and the name you use must match the one listed in the AWS console exactly. Specify it in the my_key_pair_name variable (see the Variables section).

AWS Key Pair in the AWS Management Console.
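
As a rough sketch of how the key pair name might be wired into the configuration, the fragment below declares the my_key_pair_name variable and looks the key up with the AWS provider's aws_key_pair data source (available in recent provider versions), so that terraform plan fails early if no key with that name exists in the selected region. It uses Terraform 0.12+ syntax. The lookup, the data source label selected, and the output name are assumptions for illustration, not necessarily how this module is written.

```hcl
# Name of an existing EC2 key pair; must match the AWS console entry exactly.
variable "my_key_pair_name" {
  type        = string
  default     = "my-test"
  description = "Name of an existing EC2 key pair in the target region"
}

# Hypothetical check: look the key pair up so that a typo or a wrong region
# surfaces during `terraform plan` rather than at instance launch.
data "aws_key_pair" "selected" {
  key_name = var.my_key_pair_name
}

# Hypothetical output showing which key was resolved.
output "key_pair_fingerprint" {
  value = data.aws_key_pair.selected.fingerprint
}
```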

Variables

This demo Terraform script makes a Spot Instance request for a p2.xlarge in AWS and allows you to connect to a Jupyter notebook running on the server. The script could be more generic, but for now it's only been tested on my own AWS setup, so I'm open to more contributions to the repo :)

In the variables.tf file some of the variables you can configure for your setup are:

    * my_region                 (default = us-east-1)       # N. Virginia
    * avail_zone                (default = us-east-1a)
    * my_key_pair_name          (default = my-test)
    * instance_type             (default = p2.xlarge)
    * num_instances             (default = 1)
    * spot_price                (default = 0.30)
    * ebs_volume_size           (default = 1)
    * ami_id                    (default = ami-dff741a0)    # AWS Deep Learning AMI (Ubuntu)
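
One way to override these defaults without editing variables.tf is a terraform.tfvars file, which Terraform loads automatically from the working directory. The values below are placeholders for illustration, and the value types (strings versus numbers) are assumptions, so check them against the declarations in variables.tf.

```hcl
# terraform.tfvars (illustrative values only)
my_region        = "us-east-1"    # N. Virginia
avail_zone       = "us-east-1a"
my_key_pair_name = "my-test"      # must match the key pair name in the console
instance_type    = "p2.xlarge"
num_instances    = 2              # request two Spot Instances instead of one
spot_price       = "0.30"         # maximum hourly price in USD
ami_id           = "ami-dff741a0" # AWS Deep Learning AMI (Ubuntu)
```

Any variable left out of the file (ebs_volume_size here) keeps its default from variables.tf.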

Note: Set spot_price at or above the current rate listed on the AWS EC2 Spot Instances Pricing page; otherwise your request will not be fulfilled because the price is too low.
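
To make the role of spot_price concrete, here is a minimal sketch of how the variables listed above could feed an aws_spot_instance_request resource. It assumes those variables are already declared in variables.tf and uses Terraform 0.12+ syntax. The resource label gpu and the use of wait_for_fulfillment are assumptions for illustration; the module's actual resource definition may differ.

```hcl
# Hypothetical resource; the variable names are taken from variables.tf.
resource "aws_spot_instance_request" "gpu" {
  count             = var.num_instances
  ami               = var.ami_id
  instance_type     = var.instance_type
  availability_zone = var.avail_zone
  key_name          = var.my_key_pair_name

  # Maximum hourly price you are willing to pay. If it is below the
  # current Spot price, the request is not fulfilled.
  spot_price = var.spot_price

  # Make `terraform apply` wait until the request is fulfilled
  # (or times out) instead of returning immediately.
  wait_for_fulfillment = true
}
```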

Amazon Machine Image

In this demo, I am using the AWS Deep Learning AMI because it's free (the software cost is $0.00/hr) and provides Anaconda environments for most of the popular DL frameworks (see image below). You also don't have to install the NVIDIA drivers and DL software (e.g. TensorFlow, PyTorch, MXNet, Caffe, Caffe2) manually.

AWS Deep Learning AMI (Ubuntu) - a list of conda environments for deep learning frameworks optimized for CUDA/MKL.
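
Because AMI IDs differ per region and change when new image versions are published, a hard-coded ami_id can go stale. A possible alternative, sketched below, is to look the image up with the aws_ami data source. The name pattern is an assumption about how the Deep Learning AMI is named, and the labels deep_learning and deep_learning_ami_id are hypothetical; verify the pattern in the EC2 console before relying on it.

```hcl
# Hypothetical lookup of the newest matching Amazon-owned image
# in the provider's configured region.
data "aws_ami" "deep_learning" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name = "name"
    # Assumed naming pattern; adjust it to the AMI you actually want.
    values = ["Deep Learning AMI (Ubuntu)*"]
  }
}

# The resolved ID could then be used in place of var.ami_id.
output "deep_learning_ami_id" {
  value = data.aws_ami.deep_learning.id
}
```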

Quick Start

  1. Configure your AWS Access Key ID, AWS Secret Access Key, and region name:
$ aws configure
AWS Access Key ID [None]: ********
AWS Secret Access Key [None]: ********
Default region name [None]: us-east-1
Default output format [None]: 
  2. Check that Terraform is installed properly:
$ terraform
  3. Initialize the working directory containing the Terraform configuration files:
$ terraform init
  4. Validate the syntax of the Terraform files:
$ terraform validate
  5. Create the Terraform execution plan, which shows the actions Terraform needs to take to reach the desired state:
$ terraform plan
  6. Provision the instance(s) by applying the changes needed to reach the desired state, based on the plan:
$ terraform apply

Sample output showing requests for two p2.xlarge AWS EC2 Spot instances.
  7. Log in to your EC2 Management Console, where you should see your Spot Instance Request along with all of the instances and volumes that were provisioned.

  8. Once you are done with the infrastructure, you can destroy it:

$ terraform destroy

Tips and Tricks

Debugging

  1. The terraform plan step in the Quick Start section prints the proposed changes to the terminal, but you can also save the execution plan to a file for debugging purposes:
$ terraform plan -refresh=true -input=false -lock=true -out=./proposed-changes.plan
  2. View the contents of the saved plan file in human-readable format:
$ terraform show proposed-changes.plan

Future Work

  • Terraform module for provisioning AWS On-Demand instances
  • Terraform module for setting up an AWS Elastic Container Service (ECS) cluster and running a service on the cluster

Authors

This module is maintained by Vithursan Thangarasa.

License

Apache License Version 2.0, January 2004 (See LICENSE for full details).
