Pulsar Spark: When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (+223.53%)
flume: A blazing fast job processing system backed by GenStage & Redis.
Stars: ✭ 37 (+117.65%)
Analytics Zoo: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
Stars: ✭ 2,448 (+14300%)
Albedo: A recommender system for discovering GitHub repos, built with Apache Spark
Stars: ✭ 149 (+776.47%)
Spark Workshop: Apache Spark™ and Scala Workshops
Stars: ✭ 224 (+1217.65%)
fab-oidc: Flask-AppBuilder SecurityManager for OpenIDConnect
Stars: ✭ 28 (+64.71%)
Whylogs Java: Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (+864.71%)
streamsx.kafka: Repository for integration with Apache Kafka
Stars: ✭ 13 (-23.53%)
openmrs-fhir-analytics: A collection of tools for extracting FHIR resources and analytics services on top of that data.
Stars: ✭ 55 (+223.53%)
Scalable Data Science: Scalable Data Science, course sets in big data using Apache Spark over Databricks and their mathematical, statistical and computational foundations using SageMath.
Stars: ✭ 142 (+735.29%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+10023.53%)
Pysparkling: A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (+1258.82%)
Sparkrdma: RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+1164.71%)
corb2: MarkLogic tool for processing and reporting on content, enhanced from the original CoRB
Stars: ✭ 18 (+5.88%)
Bigdata Playground: A complete example of a big data application using Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+941.18%)
data-product-analytics: Template to deploy a Data Product for analytics and data science use cases into a Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to create insights and products for external users.
Stars: ✭ 62 (+264.71%)
generic-batch-processor: Building a concurrent and distributed system for batch processing that is fault tolerant and can scale up or scale out using Akka.NET (based on the actor model).
Stars: ✭ 18 (+5.88%)
Oryx: Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning
Stars: ✭ 1,785 (+10400%)
isarn-sketches-spark: Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (+64.71%)
spark-connector: A connector for Apache Spark to access Exasol
Stars: ✭ 13 (-23.53%)
Splash: Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (+517.65%)
data-landing-zone: Template to deploy a single Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Landing Zone is a logical construct and a unit of scale in the architecture that enables data retention and execution of data workloads for generating insights and value with data.
Stars: ✭ 136 (+700%)
Quinn: pyspark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (+1176.47%)
fink-broker: Astronomy Broker based on Apache Spark
Stars: ✭ 18 (+5.88%)
daily-home: dailyhome - open home automation platform powered by openfaas targeted easy adaptation
Stars: ✭ 28 (+64.71%)
Sparktorch: Train and run PyTorch models on Apache Spark.
Stars: ✭ 195 (+1047.06%)
micrOS: micrOS - mini automation OS for DIY projects requires reliable direct communication
Stars: ✭ 55 (+223.53%)
python-batch-runner: A tiny framework for building batch applications as a collection of tasks in a workflow.
Stars: ✭ 22 (+29.41%)
Spark Atlas Connector: A Spark Atlas connector to track data lineage in Apache Atlas
Stars: ✭ 160 (+841.18%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+782.35%)
ElasticBatch: Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames
Stars: ✭ 21 (+23.53%)
Parquetviewer: Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+752.94%)
data-management-zone: Template to deploy the Data Management Zone of Cloud Scale Analytics (formerly Enterprise-Scale Analytics). The Data Management Zone provides data governance and management capabilities for the data platform of an organization.
Stars: ✭ 142 (+735.29%)
Hydrograph: A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+747.06%)
Azure Event Hubs Spark: Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+723.53%)
spark3D: Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (+35.29%)
rack-cargo: 🚚 Batch requests for Rack apps (works with Rails, Sinatra, etc)
Stars: ✭ 17 (+0%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (+652.94%)
Neon.HomeControl: Home Automation System, similar to HomeAssistant but made with .NET Core and ❤️
Stars: ✭ 46 (+170.59%)
Spark On K8s Operator: Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+10370.59%)
mmtf-spark: Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (+17.65%)
Docker Spark: Apache Spark docker image
Stars: ✭ 1,396 (+8111.76%)
awesome-tools: curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+82.35%)
treenet: Recursive Neural Networks for PyTorch
Stars: ✭ 29 (+70.59%)
goroutines: An efficient, flexible, and lightweight goroutine pool. It provides an easy way to deal with concurrent tasks with limited resources.
Stars: ✭ 88 (+417.65%)
flink-deployer: A tool that helps automate deployment to an Apache Flink cluster
Stars: ✭ 143 (+741.18%)