onanypoint / Yandex Big Data Engineering
Stars: ✭ 17
Projects that are alternatives to or similar to Yandex Big Data Engineering
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+317.65%)
Mutual labels: jupyter-notebook, spark, mapreduce, hdfs
Repository
Personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+441.18%)
Mutual labels: spark, mapreduce, hdfs
Bigdata Interview
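🎯 🌟[Big data interview questions] A collection of big data interview questions gathered from around the web, together with the author's own summarized answers. Currently covers interview questions and knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.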
Stars: ✭ 857 (+4941.18%)
Mutual labels: spark, mapreduce, hdfs
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+782.35%)
Mutual labels: jupyter-notebook, spark, hdfs
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (+847.06%)
Mutual labels: jupyter-notebook, spark, hdfs
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+1682.35%)
Mutual labels: jupyter-notebook, spark
Enterprise gateway
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
Stars: ✭ 412 (+2323.53%)
Mutual labels: jupyter-notebook, spark
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+129594.12%)
Mutual labels: spark, mapreduce
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-17.65%)
Mutual labels: spark, hdfs
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+2329.41%)
Mutual labels: jupyter-notebook, spark
Bdp Dataplatform
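Big data ecosystem solution and data platform: a big data solution built around big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computing, development, and application building.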
Stars: ✭ 456 (+2582.35%)
Mutual labels: spark, mapreduce
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+1423.53%)
Mutual labels: jupyter-notebook, spark
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+2288.24%)
Mutual labels: spark, hdfs
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (+100%)
Mutual labels: spark, mapreduce
God Of Bigdata
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+35241.18%)
Mutual labels: spark, hdfs
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+2917.65%)
Mutual labels: spark, hdfs
Cdap
An open source framework for building data analytic applications.
Stars: ✭ 509 (+2894.12%)
Mutual labels: spark, mapreduce
Big Data for Data Engineers Specialization
Build your data engineering skills. Learn how to tame the big data beast with the most popular tools, assisted by top-notch practitioners.
Courses
Big Data Essentials: HDFS, MapReduce and Spark
Introduction to HDFS, MapReduce and Spark and their system internals. The course helps you understand the MapReduce framework and includes exercises in processing text.
- Week 1 Demo Assignment
- Hadoop Streaming assignment 0: Word Count
- Hadoop Streaming assignment 1: Words Rating
- Hadoop Streaming assignment 2: Stop Words
- Spark assignment 1: Pairs
- Reconstructing the path
- Real-World Applications: TF-IDF
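
To give a feel for the kind of text processing the Hadoop Streaming assignments involve, here is a minimal word count sketch in plain Python. It is not the course's solution: the function names `mapper`, `reducer`, and `word_count` are hypothetical, and `sorted()` merely stands in for the shuffle-and-sort phase that Hadoop runs between the map and reduce steps.

```python
import re
from itertools import groupby


def mapper(lines):
    """Emit (word, 1) for every word, as a streaming mapper would print 'word<TAB>1'."""
    for line in lines:
        for word in re.findall(r"\w+", line.lower()):
            yield word, 1


def reducer(pairs):
    """Sum the counts per word; the input must already be sorted by key."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    # sorted() is an assumption standing in for Hadoop's shuffle-and-sort phase
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    print(word_count(["big data big ideas", "Data pipelines"]))
```

In a real Hadoop Streaming job the mapper and reducer would be separate scripts that read from standard input and write tab-separated pairs to standard output; the sketch keeps them as generators so the flow can be run locally.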
Honors Assignments
- Hadoop Streaming assignment 3: Name Count
- Hadoop Streaming assignment 4: Word Groups
- Spark assignment 2: Collocations
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
- Week 2 Demo Assignment
- Hive Assignment 1. DDL: Create Tables
- Hive Assignment 2. DML: Find Most Popular Tags
- Week 4 Demo Assignment
- Counting the number of mutual friends
- Week 5 Demo Assignment
- Graph based Music Recommender. Task 1
- Graph based Music Recommender. Task 2
- Graph based Music Recommender. Task 3
- Graph based Music Recommender. Task 4
- Week 6 Demo Assignment
- Breadth-first search in Spark SQL
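
The breadth-first search assignment is written in Spark SQL; as a rough illustration of the underlying idea only, the sketch below expands a frontier level by level in plain Python. It is not the course's solution: the function name `bfs_distances` and the edge-list representation are assumptions, and the set comprehension plays the role that a join between the frontier and an edge table would play in SQL.

```python
def bfs_distances(edges, source):
    """Return {vertex: hop count from source} for a directed edge list."""
    distances = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        # "Join" the frontier with the edge list, then drop vertices that were
        # already reached (the part an anti-join would handle in SQL).
        next_frontier = {
            dst for src, dst in edges
            if src in frontier and dst not in distances
        }
        for vertex in next_frontier:
            distances[vertex] = depth
        frontier = next_frontier
    return distances


if __name__ == "__main__":
    # Hypothetical toy graph; vertices 5 and 6 are unreachable from vertex 1.
    edges = [(1, 2), (2, 3), (1, 3), (3, 4), (5, 6)]
    print(bfs_distances(edges, 1))
```

The loop stops as soon as an iteration adds no new vertices, which is the same termination condition an iterative join-based version would need.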
Honors Assignments