onanypoint / Yandex Big Data Engineering

Projects that are alternatives to or similar to Yandex Big Data Engineering

Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+317.65%)
Mutual labels:  jupyter-notebook, spark, mapreduce, hdfs
Repository
Personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+441.18%)
Mutual labels:  spark, mapreduce, hdfs
Bigdata Interview
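🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, together with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.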
Stars: ✭ 857 (+4941.18%)
Mutual labels:  spark, mapreduce, hdfs
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+782.35%)
Mutual labels:  jupyter-notebook, spark, hdfs
Bigdata Notes
A beginner's guide to big data ⭐
Stars: ✭ 10,991 (+64552.94%)
Mutual labels:  spark, mapreduce, hdfs
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (+847.06%)
Mutual labels:  jupyter-notebook, spark, hdfs
Helk
The Hunting ELK
Stars: ✭ 3,097 (+18117.65%)
Mutual labels:  jupyter-notebook, spark
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+1682.35%)
Mutual labels:  jupyter-notebook, spark
Enterprise gateway
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
Stars: ✭ 412 (+2323.53%)
Mutual labels:  jupyter-notebook, spark
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+129594.12%)
Mutual labels:  spark, mapreduce
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-17.65%)
Mutual labels:  spark, hdfs
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+2329.41%)
Mutual labels:  jupyter-notebook, spark
Bdp Dataplatform
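Big data ecosystem solution data platform: a big data solution built around big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computing, development, and application setup.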
Stars: ✭ 456 (+2582.35%)
Mutual labels:  spark, mapreduce
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+1423.53%)
Mutual labels:  jupyter-notebook, spark
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-5.88%)
Mutual labels:  spark, hdfs
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+2288.24%)
Mutual labels:  spark, hdfs
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (+100%)
Mutual labels:  spark, mapreduce
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+35241.18%)
Mutual labels:  spark, hdfs
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+2917.65%)
Mutual labels:  spark, hdfs
Cdap
An open source framework for building data analytic applications.
Stars: ✭ 509 (+2894.12%)
Mutual labels:  spark, mapreduce