clusterdock / clusterdock

License: Apache-2.0

clusterdock is a framework for creating Docker-based container clusters

Labels

docker big-data hadoop

clusterdock

clusterdock is a Python 3 project that enables users to build, start, and manage Docker container-based clusters. It uses a pluggable system for defining new types of clusters using folders called topologies and is a swell project, if I may say so myself.

"I hate reading, make this quick."

Before doing anything else, install a recent version of Docker on your machine, then install clusterdock:

$ pip3 install clusterdock

Next, clone a clusterdock topology to your machine. This example uses the nodebase topology. Once it is cloned, start a 2-node cluster:

$ git clone https://github.com/clusterdock/topology_nodebase.git
$ clusterdock start topology_nodebase
2017-08-03 10:04:18 PM clusterdock.models   INFO     Starting cluster on network (cluster) ...
2017-08-03 10:04:18 PM clusterdock.models   INFO     Starting node node-1.cluster ...
2017-08-03 10:04:19 PM clusterdock.models   INFO     Starting node node-2.cluster ...
2017-08-03 10:04:20 PM clusterdock.models   INFO     Cluster started successfully (total time: 00:00:01.621).

To list cluster nodes:

$ clusterdock ps

For cluster `famous_hyades` on network cluster the node(s) are:
CONTAINER ID     HOST NAME            PORTS              STATUS        CONTAINER NAME          VERSION    IMAGE
a205d88beb       node-2.cluster                          running       nervous_sinoussi        1.3.3      clusterdock/topology_nodebase:centos6.6
6f2825c596       node-1.cluster       8080->80/tcp       running       priceless_franklin      1.3.3      clusterdock/topology_nodebase:centos6.6
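
If you want to use the node list from a script, the table can be split by column position. The following is a minimal sketch, not part of clusterdock: it assumes the fixed-width layout shown in the sample output above, that the header row starts with CONTAINER ID, and that no value overflows its column. The function name parse_ps_table is hypothetical.

```python
import re


def parse_ps_table(text):
    """Parse a fixed-width table like the one printed by `clusterdock ps`.

    Assumption: the layout matches the sample output in this README.
    Returns one dict per node, keyed by the column names in the header row.
    """
    lines = [line for line in text.splitlines() if line.strip()]

    # Skip anything before the header (e.g. the "For cluster ..." line).
    header_index = next(
        (i for i, line in enumerate(lines) if line.startswith("CONTAINER ID")),
        None,
    )
    if header_index is None:
        return []
    header = lines[header_index]

    # Column names may contain a single space ("HOST NAME"), so a column
    # name is a run of words separated by exactly one space.
    columns = [(m.group(0), m.start()) for m in re.finditer(r"\S+(?: \S+)*", header)]
    ends = [start for _, start in columns[1:]] + [None]

    rows = []
    for line in lines[header_index + 1:]:
        # Slice by column offsets so an empty PORTS cell stays empty
        # instead of shifting the remaining fields to the left.
        rows.append(
            {name: line[start:end].strip() for (name, start), end in zip(columns, ends)}
        )
    return rows
```

Slicing by header offsets rather than splitting on whitespace matters here because the PORTS cell is blank for nodes that expose no ports, as with node-2.cluster in the sample.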

To SSH into a node and look around:

$ clusterdock ssh node-1.cluster
[root@node-1 ~]# ls -l / | head
total 64
dr-xr-xr-x   1 root root 4096 May 19 20:48 bin
drwxr-xr-x   5 root root  360 Aug  4 05:04 dev
drwxr-xr-x   1 root root 4096 Aug  4 05:04 etc
drwxr-xr-x   2 root root 4096 Sep 23  2011 home
dr-xr-xr-x   7 root root 4096 Mar  4  2015 lib
dr-xr-xr-x   1 root root 4096 May 19 20:48 lib64
drwx------   2 root root 4096 Mar  4  2015 lost+found
drwxr-xr-x   2 root root 4096 Sep 23  2011 media
drwxr-xr-x   2 root root 4096 Sep 23  2011 mnt
[root@node-1 ~]# exit

To see full usage instructions for the start action, use -h/--help:

$ clusterdock start topology_nodebase -h
usage: clusterdock start [-h] [--node-disks map] [--always-pull]
                         [--namespace ns] [--network nw] [-o sys] [-r url]
                         [--nodes node [node ...]]
                         topology

Start a nodebase cluster

positional arguments:
  topology              A clusterdock topology directory

optional arguments:
  -h, --help            show this help message and exit
  --always-pull         Pull latest images, even if they're available locally
                        (default: False)
  --namespace ns        Namespace to use when looking for images (default:
                        clusterdock)
  --network nw          Docker network to use (default: cluster)
  -o sys, --operating-system sys
                        Operating system to use for cluster nodes (default:
                        centos6.6)
  -r url, --registry url
                        Docker Registry from which to pull images (default:
                        None)

nodebase arguments:
  --node-disks map      Map of node names to block devices (default: None)

Node groups:
  --nodes node [node ...]
                        Nodes of the nodes group (default: ['node-1',
                        'node-2'])
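
When driving clusterdock from a script, it can help to assemble the start command from the options listed above instead of concatenating strings. The helper below is a rough sketch and not part of clusterdock: the function name build_start_command and its keyword arguments are hypothetical, and it only maps them onto the flags shown in the help output (the nodebase-specific --node-disks option is left out).

```python
def build_start_command(topology, nodes=None, network=None,
                        operating_system=None, namespace=None,
                        registry=None, always_pull=False):
    """Return the argument list for a `clusterdock start` invocation.

    Only flags that appear in the help output above are used. Options left
    as None are omitted, so clusterdock falls back to its own defaults.
    """
    # The topology directory goes right after the action, as in the examples.
    command = ["clusterdock", "start", topology]
    if always_pull:
        command.append("--always-pull")
    if namespace is not None:
        command += ["--namespace", namespace]
    if network is not None:
        command += ["--network", network]
    if operating_system is not None:
        command += ["--operating-system", operating_system]
    if registry is not None:
        command += ["--registry", registry]
    if nodes:
        # --nodes accepts one or more names, so it is placed last to avoid
        # it consuming any argument that follows.
        command += ["--nodes", *nodes]
    return command
```

The returned list can be passed directly to subprocess.run, for example subprocess.run(build_start_command("topology_nodebase", nodes=["node-1", "node-2", "node-3"]), check=True), which avoids shell quoting issues.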

When you're done and want to clean up:

$ clusterdock manage nuke
2017-08-03 10:06:28 PM clusterdock.actions.manage INFO     Stopping and removing clusterdock containers ...
2017-08-03 10:06:30 PM clusterdock.actions.manage INFO     Removed user-defined networks ...
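
For test scripts, the start and clean-up steps can be paired so that the cluster is torn down even when something fails in between. This is a minimal sketch that shells out to the two commands shown in this README. The name running_cluster and the injectable run parameter are hypothetical, not clusterdock features. Judging by the log lines above, manage nuke seems to remove every clusterdock container and the user-defined networks, not just one cluster, so treat this as suitable only for machines where that is acceptable.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def running_cluster(topology, run=subprocess.run):
    """Start a cluster from `topology` and always clean up afterwards.

    `run` defaults to subprocess.run and can be replaced with a stub
    so the wrapper can be exercised without Docker installed.
    """
    try:
        # Equivalent to: clusterdock start <topology>
        run(["clusterdock", "start", topology], check=True)
        yield
    finally:
        # Equivalent to: clusterdock manage nuke
        # Runs on success, on errors in the with-body, and on a failed start.
        run(["clusterdock", "manage", "nuke"], check=True)
```

A typical use would be a with running_cluster("topology_nodebase"): block wrapped around whatever commands need the nodes to be up.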

To see full usage instructions for the build action, use -h/--help:

$ clusterdock build topology_nodebase -h
usage: clusterdock build [--network nw] [-o sys] [--repository repo] [-h]
                         topology

Build images for the nodebase topology

positional arguments:
  topology              A clusterdock topology directory

optional arguments:
  --network nw          Docker network to use (default: cluster)
  -o sys, --operating-system sys
                        Operating system to use for cluster nodes (default:
                        None)
  --repository repo     Docker repository to use for committing images
                        (default: docker.io/clusterdock)
  -h, --help            show this help message and exit
