
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.

Stars: ✭ 412 (+2323.53%)

Mutual labels: jupyter-notebook, spark

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+129594.12%)

Mutual labels: spark, mapreduce

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-17.65%)

Mutual labels: spark, hdfs

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+2329.41%)

Mutual labels: jupyter-notebook, spark

Bdp Dataplatform

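Big data ecosystem solution data platform: a big data solution built on big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platforms, and data platform collection, storage, computation, development, and application building.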

Stars: ✭ 456 (+2582.35%)

Mutual labels: spark, mapreduce

Spark Jupyter Aws

A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support

Stars: ✭ 259 (+1423.53%)

Mutual labels: jupyter-notebook, spark

bigkube

Minikube for big data with Scala and Spark

Stars: ✭ 16 (-5.88%)

Mutual labels: spark, hdfs

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+2288.24%)

Mutual labels: spark, hdfs

data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Stars: ✭ 34 (+100%)

Mutual labels: spark, mapreduce

God Of Bigdata

Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+35241.18%)

Mutual labels: spark, hdfs

Sparta

Real Time Analytics and Data Pipelines based on Spark Streaming

Stars: ✭ 513 (+2917.65%)

Mutual labels: spark, hdfs

Cdap

An open source framework for building data analytic applications.

Stars: ✭ 509 (+2894.12%)

Mutual labels: spark, mapreduce


Big Data for Data Engineers Specialization

Coursera Specialization

Build your data engineering skills. Learn how to tame the big data beast with the most popular tools, assisted by top-notch practitioners.

Courses

Big Data Essentials: HDFS, MapReduce and Spark

An introduction to HDFS, MapReduce, and Spark and their system internals. The course helps you understand the MapReduce framework and includes exercises on processing text.
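For readers new to the MapReduce model mentioned above, here is a minimal sketch of a word count written in plain Python. It is not taken from the course materials; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and everything runs in a single process rather than on a Hadoop or Spark cluster.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    for word in re.findall(r"[a-z']+", line.lower()):
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Spark and Hadoop", "Hadoop MapReduce and HDFS"]
    print(word_count(sample))
```

In a real cluster the framework performs the shuffle across machines; the sketch only mirrors the three logical phases so the data flow is easy to follow.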

Honors Assignments

Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames

Honors Assignments


Stars: ✭ 17
