Scriptis: Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+544.44%)
Mutual labels: spark, pyspark
Sparkmagic: Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+783.33%)
Mutual labels: spark, pyspark
Spark Tdd Example: A simple Spark TDD example
Stars: ✭ 23 (-78.7%)
Mutual labels: spark, pyspark
basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-76.85%)
Mutual labels: spark, pyspark
Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+275.93%)
Mutual labels: spark, pyspark
Live log analyzer spark: Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-87.04%)
Mutual labels: spark, pyspark
incubator-linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+2176.85%)
Mutual labels: spark, pyspark
W2v: Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-40.74%)
Mutual labels: spark, pyspark
Pysparkgeoanalysis: 🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-41.67%)
Mutual labels: spark, pyspark
data-algorithms-with-spark: O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (-68.52%)
Mutual labels: spark, pyspark
Spark Py Notebooks: Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1138.89%)
Mutual labels: spark, pyspark
spark-extension: A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-76.85%)
Mutual labels: spark, pyspark
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+486.11%)
Mutual labels: spark, pyspark
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+2.78%)
Mutual labels: spark, pyspark
Sparkling Titanic: Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-88.89%)
Mutual labels: spark, pyspark
data processing course: Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-53.7%)
Mutual labels: spark, pyspark
kafka-compose: 🎼 Docker Compose files for various Kafka stacks
Stars: ✭ 32 (-70.37%)
Mutual labels: spark, pyspark
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+812.96%)
Mutual labels: spark, pyspark
Repository: Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-14.81%)
Mutual labels: algorithm, spark