Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+508.7%)
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+3065.22%)
Flintrock
A command-line tool for launching Apache Spark clusters.
Stars: ✭ 568 (+2369.57%)
Openscoring
REST web service for the true real-time scoring (<1 ms) of Scikit-Learn, R and Apache Spark models
Stars: ✭ 536 (+2230.43%)
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+1695.65%)
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (+456.52%)
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+7639.13%)
Coolplayspark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
Stars: ✭ 3,318 (+14326.09%)
kdtree
A pure Nim k-d tree implementation for efficient spatial querying of point data
Stars: ✭ 40 (+73.91%)
Docker Spark
Apache Spark Docker image
Stars: ✭ 1,396 (+5969.57%)
Sparktorch
Train and run PyTorch models on Apache Spark.
Stars: ✭ 195 (+747.83%)
spark-utils
Basic framework utilities to quickly start writing production-ready Apache Spark applications
Stars: ✭ 25 (+8.7%)
Sparkflow
Easy-to-use library to bring TensorFlow to Apache Spark
Stars: ✭ 282 (+1126.09%)
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (+273.91%)
spinmob
Rapid and flexible acquisition, analysis, fitting, and plotting in Python. Designed for scientific laboratories.
Stars: ✭ 34 (+47.83%)
Mlflow
Open source platform for the machine learning lifecycle
Stars: ✭ 10,898 (+47282.61%)
parquet-dotnet
🐬 Apache Parquet for modern .NET
Stars: ✭ 199 (+765.22%)
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (+904.35%)
Albedo
A recommender system for discovering GitHub repos, built with Apache Spark
Stars: ✭ 149 (+547.83%)
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (+4.35%)
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (+139.13%)
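leaflet heatmap
A simple visualization of Huzhou call data. Assuming the dataset is too large to draw the heatmap directly in the browser, the heatmap rendering step is moved offline for computation and analysis. After the data is computed in parallel with Apache Spark, Apache Spark is used again to render the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The rendering is currently implemented with Apache Spark, but either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .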
Stars: ✭ 13 (-43.48%)
spark-connector
A connector for Apache Spark to access Exasol
Stars: ✭ 13 (-43.48%)
spark-transformers
Spark-Transformers: Library for exporting Apache Spark MLlib models for use in any Java application with no other dependencies.
Stars: ✭ 39 (+69.57%)
Spark Atlas Connector
A Spark Atlas connector to track data lineage in Apache Atlas
Stars: ✭ 160 (+595.65%)
Spark Sklearn
(Deprecated) Scikit-learn integration package for Apache Spark
Stars: ✭ 1,055 (+4486.96%)
hyperdrive
Extensible streaming ingestion pipeline on top of Apache Spark
Stars: ✭ 31 (+34.78%)
spark
Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end is now on https://github.com/apache/spark/
Stars: ✭ 609 (+2547.83%)
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer
Stars: ✭ 32 (+39.13%)
Spark Tda
SparkTDA is a package for Apache Spark providing Topological Data Analysis functionality.
Stars: ✭ 45 (+95.65%)
SparkTwitterAnalysis
An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) to build the project.
Stars: ✭ 29 (+26.09%)
Dblink
Distributed Bayesian Entity Resolution in Apache Spark
Stars: ✭ 38 (+65.22%)
BigCLAM-ApacheSpark
Overlapping community detection in large-scale networks using the BigCLAM model, built on Apache Spark
Stars: ✭ 40 (+73.91%)
cloud-integration
Spark cloud integration: tests, cloud committers and more
Stars: ✭ 20 (-13.04%)
Parquetviewer
Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+530.43%)
spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (+191.3%)
Oryx
Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning
Stars: ✭ 1,785 (+7660.87%)
Spark Flamegraph
Easy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (+30.43%)
Datahacksummit 2017
Apache Zeppelin notebooks for Recommendation Engines using Keras and Machine Learning on Apache Spark
Stars: ✭ 30 (+30.43%)
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+526.09%)
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+5773.91%)