Scalable Data Science: course sets in big data using Apache Spark on Databricks, and their mathematical, statistical, and computational foundations using SageMath.

Stars: ✭ 142 (+735.29%)

Mutual labels: apache-spark

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+10023.53%)

Mutual labels: apache-spark

Pysparkling

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

Stars: ✭ 231 (+1258.82%)

Mutual labels: apache-spark

aws-batch-example

Example use of AWS Batch

Stars: ✭ 96 (+464.71%)

Mutual labels: batch-processing

Sparkrdma

RDMA-accelerated, high-performance, scalable, and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (+1164.71%)

Mutual labels: apache-spark

corb2

MarkLogic tool for processing and reporting on content, enhanced from the original CoRB

Stars: ✭ 18 (+5.88%)

Mutual labels: batch-processing

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL

Stars: ✭ 177 (+941.18%)

Mutual labels: apache-spark

data-product-analytics

Template to deploy a Data Product for analytics and data science use cases into a Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to create insights and products for external users.

Stars: ✭ 62 (+264.71%)

Mutual labels: data-mesh

Cheatsheets.pdf

📚 Various cheatsheets in PDF

Stars: ✭ 159 (+835.29%)

Mutual labels: apache-spark

generic-batch-processor

Building a concurrent and distributed system for batch processing that is fault tolerant and can scale up or scale out using Akka.NET (based on the actor model).

Stars: ✭ 18 (+5.88%)

Mutual labels: batch-processing

Oryx

Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning

Stars: ✭ 1,785 (+10400%)

Mutual labels: apache-spark

isarn-sketches-spark

Routines and data structures for using isarn-sketches idiomatically in Apache Spark

Stars: ✭ 28 (+64.71%)

Mutual labels: apache-spark

Spark On Lambda

Apache Spark on AWS Lambda

Stars: ✭ 137 (+705.88%)

Mutual labels: apache-spark

Location-based-Restaurants-Recommendation-System

Big Data Management and Analysis Final Project

Stars: ✭ 44 (+158.82%)

Mutual labels: apache-spark

spark-connector

A connector for Apache Spark to access Exasol

Stars: ✭ 13 (-23.53%)

Mutual labels: apache-spark

Scala Spark Tutorial

Project for James' Apache Spark with Scala course

Stars: ✭ 121 (+611.76%)

Mutual labels: apache-spark

Splash

Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange

Stars: ✭ 105 (+517.65%)

Mutual labels: apache-spark

Mastering Spark Sql Book

The Internals of Spark SQL

Stars: ✭ 234 (+1276.47%)

Mutual labels: apache-spark

data-landing-zone

Template to deploy a single Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Landing Zone is a logical construct and a unit of scale in the architecture that enables data retention and execution of data workloads for generating insights and value with data.

Stars: ✭ 136 (+700%)

Mutual labels: data-mesh

Awesome Ai Infrastructures

Infrastructures™ for Machine Learning Training/Inference in Production.

Stars: ✭ 223 (+1211.76%)

Mutual labels: apache-spark

spark-twitter-sentiment-analysis

Sentiment Analysis of a Twitter Topic with Spark Structured Streaming

Stars: ✭ 55 (+223.53%)

Mutual labels: apache-spark

Quinn

PySpark methods to enhance developer productivity 📣 👯 🎉

Stars: ✭ 217 (+1176.47%)

Mutual labels: apache-spark

fink-broker

Astronomy Broker based on Apache Spark

Stars: ✭ 18 (+5.88%)

Mutual labels: apache-spark

Learning Apache Spark

Notes on Apache Spark (PySpark)

Stars: ✭ 211 (+1141.18%)

Mutual labels: apache-spark

daily-home

dailyhome: an open home automation platform powered by OpenFaaS, targeting easy adaptation

Stars: ✭ 28 (+64.71%)

Mutual labels: iot-platform

Sparktorch

Train and run PyTorch models on Apache Spark.

Stars: ✭ 195 (+1047.06%)

Mutual labels: apache-spark

micrOS

micrOS: a mini automation OS for DIY projects that require reliable direct communication

Stars: ✭ 55 (+223.53%)

Mutual labels: iot-platform

Azure Cosmosdb Spark

Apache Spark Connector for Azure Cosmos DB

Stars: ✭ 165 (+870.59%)

Mutual labels: apache-spark

python-batch-runner

A tiny framework for building batch applications as a collection of tasks in a workflow.

Stars: ✭ 22 (+29.41%)

Mutual labels: batch-processing

Spark Atlas Connector

A Spark Atlas connector to track data lineage in Apache Atlas

Stars: ✭ 160 (+841.18%)

Mutual labels: apache-spark

spring-batch-rest

REST API for Spring Batch using Spring Boot 2.2

Stars: ✭ 85 (+400%)

Mutual labels: batch-processing

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (+782.35%)

Mutual labels: apache-spark

ElasticBatch

Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames

Stars: ✭ 21 (+23.53%)

Mutual labels: batch-processing

Parquetviewer

Simple Windows desktop application for viewing and querying Apache Parquet files

Stars: ✭ 145 (+752.94%)

Mutual labels: apache-spark

data-management-zone

Template to deploy the Data Management Zone of Cloud Scale Analytics (formerly Enterprise-Scale Analytics). The Data Management Zone provides data governance and management capabilities for the data platform of an organization.

Stars: ✭ 142 (+735.29%)

Mutual labels: data-mesh

Hydrograph

A visual ETL development and debugging tool for big data

Stars: ✭ 144 (+747.06%)

Mutual labels: apache-spark

learn-by-examples

Real-world Spark pipeline examples

Stars: ✭ 84 (+394.12%)

Mutual labels: apache-spark

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs