Pushkr / Apache Spark Hands On
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74
Programming Languages
python
Projects that are alternatives of or similar to Apache Spark Hands On
Bigdataguide
Big data learning: learn big data from scratch, including study videos and interview materials for each stage of learning
Stars: ✭ 817 (+1004.05%)
Mutual labels: spark, hadoop, bigdata, hive
Hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (+70.27%)
Mutual labels: spark, hadoop, bigdata, hive
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+8018.92%)
Mutual labels: spark, hadoop, bigdata, hive
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+190.54%)
Mutual labels: spark, hadoop, bigdata
the-apache-ignite-book
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Stars: ✭ 65 (-12.16%)
Mutual labels: hive, hadoop, bigdata
Bigdata Interview
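🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, together with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks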
Stars: ✭ 857 (+1058.11%)
Mutual labels: spark, hadoop, bigdata
dockerfiles
Multi docker container images for main Big Data Tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (-60.81%)
Mutual labels: hive, hadoop, bigdata
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-52.7%)
Mutual labels: spark, hive, hadoop
Javaorbigdata Interview
A compilation of interview knowledge points for Java developers and big data developers
Stars: ✭ 203 (+174.32%)
Mutual labels: spark, hadoop, bigdata
leaflet heatmap
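A simple visualization of Huzhou call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap-drawing step is moved to offline computation and analysis. After the data is computed in parallel with Apache Spark, the heatmap is also drawn with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The drawing is currently implemented with Apache Spark; either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .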
Stars: ✭ 13 (-82.43%)
Mutual labels: spark, hadoop, bigdata
Spline
Data Lineage Tracking And Visualization Solution
Stars: ✭ 306 (+313.51%)
Mutual labels: spark, hadoop, bigdata
hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-24.32%)
Mutual labels: hive, hadoop, bigdata
Wedatasphere
WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+402.7%)
Mutual labels: spark, hadoop, hive
For the benefit of the community, please feel free to add or request anything that hasn't been covered. Please remember that this is a beginner's guide and not expert-level documentation.
Hadoop
- /Flume: contains notes and examples of Apache Flume
- /Hive: contains notes and examples of Apache Hive
- /MySQL: code sample containing pieces to create a database, create a table and load data in MySQL
- /Sqoop: contains notes and examples of import/export using Sqoop
- /spark: contains notes, documentation and sample example(s) of Spark APIs
Hands-on:
- /exam: sample CCA-175 exam questions and solutions (in the solution branch)
- /problem1: complex data structure handling using Hive (exposure to: Hive, create table, LOAD, named_struct, struct)
- /problem2: stock data analysis (exposure to: JSON file handling, SparkSQL, map, reduce, filter, join, groupByKey, keyBy, UDFs, etc.)
- /problem3: MovieLens database analysis
- /problem4: Lahman's baseball database analysis
- /problem5: Hortonworks certification sample, 10 tasks in total
- /Tweeter: Twitter data analysis
- /problem6: retail database sample exercises
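For readers new to the pair operations named under /problem2 (keyBy, groupByKey, join, together with filter and map), the sketch below models what each one computes. It is plain Python, not Spark and not code from the repository: the helper names key_by, group_by_key and join are hypothetical stand-ins that only mirror the semantics of the corresponding RDD methods, and the sample stock records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample records (symbol, closing price); not taken from the repository.
trades = [("AAPL", 10.0), ("MSFT", 20.0), ("AAPL", 14.0), ("MSFT", 22.0), ("IBM", 5.0)]
names = [("AAPL", "Apple"), ("MSFT", "Microsoft")]


def key_by(records, key_func):
    """Model of keyBy: pair every record with key_func(record)."""
    return [(key_func(r), r) for r in records]


def group_by_key(pairs):
    """Model of groupByKey: collect all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def join(left, right):
    """Model of join: inner join of two (key, value) collections."""
    right_groups = group_by_key(right)
    return [(k, (lv, rv)) for k, lv in left for rv in right_groups.get(k, [])]


# filter: keep trades priced above 8.0 (the IBM record drops out)
kept = [t for t in trades if t[1] > 8.0]

# keyBy: produces (symbol, (symbol, price)) pairs
keyed = key_by(kept, lambda t: t[0])

# map: keep only the price as the value
prices = [(symbol, record[1]) for symbol, record in keyed]

# groupByKey + map: average price per symbol
averages = [(symbol, sum(p) / len(p)) for symbol, p in group_by_key(prices).items()]

# join: attach the company name; symbols missing from `names` would be dropped
named_averages = join(averages, names)
print(named_averages)
```

In Spark the same steps would run on distributed RDDs, and for plain aggregations such as an average, reduceByKey or aggregateByKey is generally preferred over groupByKey because it combines values before the shuffle.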
Link
My answers to a few PySpark questions on Stack Overflow: