dask / dask-ec2

Licence: other
Start a cluster in EC2 for dask.distributed

Programming Languages

Python, Jupyter Notebook, SaltStack

ARCHIVED

As of November 3rd, 2020, this repository is archived. Please consult the Dask Cloud docs page for more information on deploying Dask with cloud resources.

Dask EC2

Easily launch a cluster on Amazon EC2 configured with dask.distributed, Jupyter Notebooks, and Anaconda.

Installation

You can install dask-ec2 using pip:

$ pip install dask-ec2

You can also install dask-ec2 and its dependencies from the conda-forge channel using conda:

$ conda install dask-ec2 -c conda-forge

Usage

Note: dask-ec2 uses boto3 to interact with Amazon EC2. You can configure your AWS credentials using environment variables or configuration files.
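
As a rough illustration of the two credential sources mentioned in the note, the sketch below checks for them using only the standard library. It assumes the usual boto3 conventions: the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and a shared credentials file at ~/.aws/credentials (or the path given in AWS_SHARED_CREDENTIALS_FILE). The function name find_aws_credential_sources is hypothetical and is not part of dask-ec2 or boto3.

```python
import os
from pathlib import Path


def find_aws_credential_sources(environ=None, home=None):
    """Return the credential sources that appear to be configured.

    Hypothetical helper: it only checks that the variables or the file
    exist, not that the credentials themselves are valid.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    sources = []

    # Source 1: access key and secret key supplied as environment variables.
    if environ.get("AWS_ACCESS_KEY_ID") and environ.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment")

    # Source 2: the shared credentials file (default path assumed here).
    default_file = home / ".aws" / "credentials"
    cred_file = Path(environ.get("AWS_SHARED_CREDENTIALS_FILE", default_file))
    if cred_file.is_file():
        sources.append("credentials-file")

    return sources


if __name__ == "__main__":
    print(find_aws_credential_sources())
```

An empty list suggests that neither source is set up, in which case dask-ec2 would likely fail when boto3 tries to authenticate.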

The dask-ec2 up command can be used to create and provision a cluster on Amazon EC2:

$ dask-ec2 up --help
Usage: dask-ec2 up [OPTIONS]

Options:
  --keyname TEXT                Keyname on EC2 console  [required]
  --keypair PATH                Path to the keypair that matches the keyname
                                [required]
  --name TEXT                   Tag name on EC2
  --tags TEXT                   Additional EC2 tags.  Comma separated K:V
                                pairs: K1:V1,K2:V2
  --region-name TEXT            AWS region  [default: us-east-1]
  --vpc-id TEXT                 EC2 VPC ID
  --subnet-id TEXT              EC2 Subnet ID on the VPC
  --iaminstance-name TEXT       IAM Instance Name
  --ami TEXT                    EC2 AMI  [default: ami-d05e75b8]
  --username TEXT               User to SSH to the AMI  [default: ubuntu]
  --type TEXT                   EC2 Instance Type  [default: m3.2xlarge]
  --count INTEGER               Number of nodes  [default: 4]
  --security-group TEXT         Security Group Name  [default: dask-ec2-default]
  --security-group-id TEXT      Security Group ID (overwrites Security Group
                                Name)
  --volume-type TEXT            Root volume type  [default: gp2]
  --volume-size INTEGER         Root volume size (GB)  [default: 500]
  --file PATH                   File to save the metadata  [default:
                                cluster.yaml]
  --provision / --no-provision  Provision salt on the nodes  [default: True]
  --anaconda / --no-anaconda    Bootstrap anaconda  [default: True]
  --dask / --no-dask            Install Dask.Distributed in the cluster
                                [default: True]
  --notebook / --no-notebook    Start a Jupyter Notebook in the head node
                                [default: True]
  --nprocs INTEGER              Number of processes per worker  [default: 1]
  --source / --no-source        Install Dask/Distributed from git master
                                [default: False]
  -h, --help                    Show this message and exit.
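
The --tags option in the help text above takes comma separated K:V pairs. The sketch below shows one way such a string could be split into a dictionary. It is an assumption about the format only: parse_tags is a hypothetical name, and this is not dask-ec2's actual parsing code.

```python
def parse_tags(text):
    """Parse a string like 'K1:V1,K2:V2' into {'K1': 'V1', 'K2': 'V2'}.

    Hypothetical helper illustrating the documented format.
    """
    tags = {}
    if not text:
        return tags
    for pair in text.split(","):
        # Split on the first colon only, so values may contain colons.
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected a K:V pair, got {pair!r}")
        tags[key.strip()] = value.strip()
    return tags


if __name__ == "__main__":
    print(parse_tags("project:demo,owner:alice"))
```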

The minimal required arguments for the dask-ec2 up command are:

$ dask-ec2 up --keyname my_aws_key --keypair ~/.ssh/my_aws_key.pem
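
If you script cluster creation from Python, the same invocation can be assembled as an argument list. The sketch below only builds the command and does not launch anything. It uses the --keyname, --keypair, --count, and --type flags from the help text above; build_up_command is a hypothetical helper, not part of dask-ec2.

```python
import os
import shlex


def build_up_command(keyname, keypair, count=None, instance_type=None):
    """Build the argument list for `dask-ec2 up` (hypothetical helper)."""
    # Expand "~" here because no shell will do it when the list is
    # passed directly to subprocess.run.
    argv = [
        "dask-ec2", "up",
        "--keyname", keyname,
        "--keypair", os.path.expanduser(keypair),
    ]
    if count is not None:
        argv += ["--count", str(count)]
    if instance_type is not None:
        argv += ["--type", instance_type]
    return argv


if __name__ == "__main__":
    argv = build_up_command("my_aws_key", "~/.ssh/my_aws_key.pem", count=2)
    # Print the command for inspection; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Keeping the command as a list avoids shell quoting issues if a key name or path contains spaces.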

This creates a cluster.yaml file in the directory where the command was executed. This file is required to use the other commands in the CLI.
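
Because the other commands depend on this file, a wrapper script may want to check for it before calling them. The sketch below is a minimal check that assumes the default file name cluster.yaml (the --file default shown in the help text) and makes no assumptions about the file's contents. cluster_file_path is a hypothetical name.

```python
from pathlib import Path


def cluster_file_path(directory=".", filename="cluster.yaml"):
    """Return the path to the cluster metadata file, or raise if it is missing.

    Hypothetical helper: it checks existence only and does not parse the file.
    """
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run `dask-ec2 up` in this directory first"
        )
    return path


if __name__ == "__main__":
    try:
        print(cluster_file_path())
    except FileNotFoundError as exc:
        print(exc)
```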

Once a cluster is running, the dask-ec2 command can be used to destroy the cluster, SSH into its nodes, or run the other provisioning commands:

$ dask-ec2
Usage: dask-ec2 [OPTIONS] COMMAND [ARGS]...

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  anaconda          Provision anaconda
  dask-distributed  dask.distributed option
  destroy           Destroy cluster
  notebook          Provision the Jupyter notebook
  provision         Provision salt instances
  ssh               SSH to one of the node. 0-index
  up                Launch instances