Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+1858.08%)

Mutual labels: spark, big-data

Metorikku

A simplified, lightweight ETL Framework based on Apache Spark

Stars: ✭ 361 (-67.94%)

Mutual labels: spark, big-data

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-98.76%)

Mutual labels: big-data, spark

Spark Movie Lens

An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Stars: ✭ 745 (-33.84%)

Mutual labels: spark, big-data

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-77.18%)

Mutual labels: spark, big-data

Listenbrainz Server

Server for the ListenBrainz project

Stars: ✭ 420 (-62.7%)

Mutual labels: spark, big-data

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-91.56%)

Mutual labels: big-data, spark

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+402.31%)

Mutual labels: spark, big-data

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+389.61%)

Mutual labels: spark, big-data

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-90.14%)

Mutual labels: big-data, spark

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+2707.99%)

Mutual labels: spark, big-data

leaflet heatmap

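A simple visualization of phone-call data for Huzhou. Assuming the data volume is too large to draw a heat map directly in the browser, the heat-map drawing step is moved to offline computation. The data is first processed in parallel with Apache Spark, the heat map is then drawn with Apache Spark as well, and leafletjs loads the OpenStreetMap layer and the heat-map layer to give good interactivity. Drawing is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than running on a single machine. The Apache Spark heat-map drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .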

Stars: ✭ 13 (-98.85%)

Mutual labels: big-data, spark

Delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Stars: ✭ 3,903 (+246.63%)

Mutual labels: spark, big-data

spark-acid

ACID Data Source for Apache Spark based on Hive ACID

Stars: ✭ 91 (-91.92%)

Mutual labels: big-data, spark

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+238.63%)

Mutual labels: spark, big-data

Pyspark Cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

Stars: ✭ 108 (-90.41%)

Mutual labels: documentation, spark

Storm Doc Zh

Chinese edition of the official Apache Storm documentation

Stars: ✭ 142 (-87.39%)

Mutual labels: documentation, big-data

Magellan

Geo Spatial Data Analytics on Spark

Stars: ✭ 507 (-54.97%)

Mutual labels: spark, big-data

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-99.02%)

Mutual labels: spark, big-data

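Apache Spark Official Documentation, Chinese Edition

Apache Spark™ is a fast, general-purpose engine for large-scale data processing.

"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." -- Martin Fowler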

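Previous versions

Apache Spark 2.0.2 Official Documentation, Chinese Edition
- Chinese documentation in EPUB format

Translation progress

Contribution guide

Translating Spark 2.4.4

Contributors: remember to leave a comment and update the translation progress
Link: https://github.com/apachecn/spark-doc-zh/issues/189

Project board

Spark 2.4.4 project board

Maintainers: remember to keep the board updated and tidy
Link: https://github.com/apachecn/spark-doc-zh/projects/1

Contribution guide

See here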

Project maintainers

Format: GitHub + QQ

Phase 1 (2016-10-31)

Contributor list: http://cwiki.apachecn.org/pages/viewpage.action?pageId=2887089

Phase 2 (2018-12-05)

@wangyangting（那伊抹微笑）
@jiangzhonglian（片刻）
@chenyyx（Joy yx）
@XiaoLiz（VoLi）
@ruilintian（ruilintian）
@huangtianan（huangtianan）
@kris37（kris37）
@sehriff（sehriff）
@windyqinchaofeng（qinchaofeng）
@stealthsMrs（stealthsMrs）

Phase 3 (2019-06-21)

@965: 1097828409
@loserman: 1015818189

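-- Maintainer requirements: (everyone is welcome to help with the Chinese edition of the Spark documentation)

Passionate about open source, and likes to show off
Long-term Spark user (at least half a year)
Has time to promptly fix page bugs and respond to user issues
Trial period: 2 months
Contact: @965: 1097828409

Contact

For suggestions or feedback, or to take part in the translation, please contact the QQ account below:

QQ: 1097828409

Suggestions and feedback

Open an issue on our apachecn/spark-doc-zh GitHub repository.
Send an email to: [email protected].
Contact the group owner or an administrator in our QQ group (search: contact methods).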

License

Download

Docker

docker pull apachecn0/spark-doc-zh
docker run -tid -p <port>:80 apachecn0/spark-doc-zh
# Open http://localhost:<port> to view the documentation
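
The `<port>` placeholder above has to be replaced with a host port that is not already in use. As a minimal sketch, assuming Python 3 is available, the following script asks the operating system for a free port and prints the matching `docker run` command and URL. The helper names `find_free_port`, `docker_run_command`, and `docs_url` are hypothetical and not part of this project; only the image name and the flags are taken from the commands above.

```python
import socket
from typing import List


def find_free_port() -> int:
    """Ask the OS for a TCP port on localhost that is unused right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def docker_run_command(port: int, image: str = "apachecn0/spark-doc-zh") -> List[str]:
    """Build the `docker run` argument list shown above for a given host port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Container port 80 is the one used in the original command.
    return ["docker", "run", "-tid", "-p", f"{port}:80", image]


def docs_url(port: int) -> str:
    """URL at which the documentation should be reachable once the container runs."""
    return f"http://localhost:{port}"


if __name__ == "__main__":
    port = find_free_port()
    print(" ".join(docker_run_command(port)))
    print(f"Then open {docs_url(port)}")
```

The script only prints the command and does not start Docker. The chosen port could in principle be taken by another process before the container starts, in which case rerunning the script gives a new one.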

PYPI

pip install spark-doc-zh
spark-doc-zh <port>
# Open http://localhost:<port> to view the documentation

NPM

npm install -g spark-doc-zh
spark-doc-zh <port>
# Open http://localhost:<port> to view the documentation

Sponsor us

Stars: ✭ 1,126
