Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → tobegit3hub → Openspark

tobegit3hub / Openspark

Licence: apache-2.0

The out-of-the-box environment to for Hadoop/Spark applications

Labels

dockerfile

OpenSpark

Introduction

The open environment to run any Spark applications.

Docker image to run in Linux/MacOS/Windows
Install Hadoop and Spark within one minute
Submit job to local and remote Yarn clusters
Repeatable examples of PySpark/Spark streaming/...

Usage

Run the container.

docker run -it --net=host -v /:/host tobegit3hub/openspark bash

Setup the cluster.

/scripts/prepare_sshd.sh
/scripts/prepare_hadoop.sh

Run the examples.

git clone https://github.com/tobegit3hub/spark_examples
cd ./spark_examples/dataframe_examples/
./create_dataframe_from_memory.sh

Examples

Run PySpark

/examples/read_json_dataframe.py

Submit Yarn Jobs

cd ./spark_examples/dataframe_examples/
./create_dataframe_from_memory.sh

If you are using yarn-client mode, make sure you have run container with --net=host and set spark.driver.host with host's IP for Spark session.


val conf = new SparkConf()
             .setMaster("yarn")
             .setAppName("foo")
             .set("spark.driver.host", "$HOST_IP")

Hadoop Cluster

We can use external Hadoop cluster by copying its configuration files in the container.

cp /host/my_hadoop_conf/* /usr/local/hadoop/etc/hadoop/

Spark Version

OpenSpark have supported multiple Spark versions and you can add more in the docker image.

spark-2.4.0-bin-hadoop2.7
spark-2.3.0-bin-hadoop2.7(default)
spark-1.6.3-bin-hadoop2.6

We can switch the Spark version easily.

echo $SPARK_HOME

mv /usr/local/spark/ /usr/local/spark_bak/

mv /usr/local/spark-1.6.3-bin-hadoop2.6/ /usr/local/spark/

export PATH=$PATH:$SPARK_HOME/bin

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 40

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (0) 🔗