gazelle_plugin: Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
StoreItemDemand: (117th place - Top 26%) Deep learning using Keras and Spark for the "Store Item Demand Forecasting" Kaggle competition.
jgit-spark-connector: A library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.
GenomicsDB: Highly performant data storage in C++ for importing, querying and transforming variant data with C/C++/Java/Spark bindings. Used in gatk4.
jwx: JSON/JWK/JWS/JWT/Base64 library in SPARK
mleap: R Interface to MLeap
isarn-sketches-spark: Routines and data structures for using isarn-sketches idiomatically in Apache Spark
pipeline: PipelineAI Kubeflow Distribution
pyspark-cassandra: A Python port of the awesome @datastax Spark Cassandra connector. Compatible with Spark 2.0, 2.1, 2.2, 2.3 and 2.4.
ExDeMon: A general-purpose metrics monitor implemented with Apache Spark. Kafka source, Elastic sink, aggregate metrics, different analysis, notifications, actions, live configuration update, missing metrics, ...
optimus: 🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
popmon: Monitor the stability of a Pandas or Spark dataframe ⚙︎
mizo: Super-fast Spark RDD for Titan Graph Database on HBase
workshop-spark: Code for Spark workshops with a Docker-based development environment
masc: Microsoft's contributions for Spark with Apache Accumulo
spark-kiosk-notify: Adds a notification panel to your Laravel Spark Kiosk, allowing you to send notifications to users.