JustenoughscalaforsparkA tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+247.1%)
GimelBig Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+39.35%)
Spark PracticeApache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+29.03%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-3.23%)
Sparkling TitanicTraining models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-92.26%)
kafka-compose🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-79.35%)
HandysparkHandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+1.94%)
LinkisLinkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+1398.71%)
TutorialA summary of the full-stack Java knowledge architecture
Stars: ✭ 407 (+162.58%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-83.87%)
Cc PysparkProcess Common Crawl data with Python and Spark
Stars: ✭ 147 (-5.16%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+763.23%)
Spark NlpState of the Art Natural Language Processing
Stars: ✭ 2,518 (+1524.52%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+536.13%)
spark-extensionA library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-83.87%)
ScriptisScriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+349.03%)
Devops Python Tools80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+161.94%)
W2vWord2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-58.71%)
Pyspark Cheatsheet🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-30.32%)
MmlsparkSimple and Distributed Machine Learning
Stars: ✭ 2,899 (+1770.32%)
SparkmagicJupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+515.48%)
incubator-linkisLinkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+1486.45%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-28.39%)
Pyspark Example ProjectExample project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+308.39%)
Pysparkgeoanalysis🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-59.35%)
ODSC India 2018My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-83.23%)
Live log analyzer sparkSpark application that analyzes Apache access logs and detects anomalies. Accompanied by a Medium article.
Stars: ✭ 14 (-90.97%)
HnswlibJava library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-30.32%)
AztkAZTK powered by Azure Batch: On-demand, Dockerized, Spark Jobs on Azure
Stars: ✭ 152 (-1.94%)
Nasa Latex DocsAn easy and convenient package to create technical LaTeX documents.
Stars: ✭ 153 (-1.29%)
DatacompyPandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-5.16%)
Anomaly detection tutoAnomaly detection tutorial on univariate time series with an auto-encoder
Stars: ✭ 144 (-7.1%)
CobratoolboxThe COnstraint-Based Reconstruction and Analysis Toolbox. Documentation:
Stars: ✭ 149 (-3.87%)
Swiftui TutorialsA project of SwiftUI code examples and translated tutorials.
Stars: ✭ 1,992 (+1185.16%)
TonTelegram Open Network research group. Telegram: https://t.me/ton_research
Stars: ✭ 146 (-5.81%)
Angular2 AppThis repository is an example application for angular2 tutorial
Stars: ✭ 150 (-3.23%)
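Technology TalkA collection of knowledge on commonly used technology frameworks in the Java ecosystem, open-source middleware, system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, production troubleshooting, personal growth, reflections, and more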
Stars: ✭ 12,136 (+7729.68%)
Digital video introductionA hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding).
Stars: ✭ 12,184 (+7760.65%)
SciblogA blog made with django designed like a scientific paper written in Latex.
Stars: ✭ 145 (-6.45%)
Kubernetes DjangoScalable and resilient Django with Kubernetes.
Stars: ✭ 145 (-6.45%)
Latex Koma TemplateGeneric template for midsize and larger documents based on KOMA script classes.
Stars: ✭ 151 (-2.58%)
Zalo.github.ioA home for knowledge that is hard to find elsewhere
Stars: ✭ 143 (-7.74%)
Scipy con 2019Tutorial Sessions for SciPy Con 2019
Stars: ✭ 142 (-8.39%)
Trojan Tutor.github.iotrojan tutorial, build-your-own proxy tutorial, trojan-gfw, censorship circumvention, proxy tool, Ubuntu, Debian, beginner tutorial, HTTPS camouflage
Stars: ✭ 150 (-3.23%)
Google2csvGoogle2Csv is a simple Google scraper that saves the results to a CSV/XLSX/JSONL file
Stars: ✭ 145 (-6.45%)