Apache Spark Hands OnEducational notes,Hands on problems w/ solutions for hadoop ecosystem
Stars: ✭ 74 (-92.03%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-94.08%)
Repository个人学习知识库涉及到数据仓库建模、实时计算、大数据、Java、算法等。
Stars: ✭ 92 (-90.1%)
CuesheetA framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-90.74%)
Spark On K8s OperatorKubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+91.6%)
SparktutorialSource code for James Lee's Aparch Spark with Java course
Stars: ✭ 105 (-88.7%)
Lambda ArchApplying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-88.05%)
TeddySpark Streaming监控平台,支持任务部署与告警、自启动
Stars: ✭ 120 (-87.08%)
Kinesis SqlKinesis Connector for Structured Streaming
Stars: ✭ 120 (-87.08%)
Kafka Storm StarterCode examples that show to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (-21.64%)
MmlsparkSimple and Distributed Machine Learning
Stars: ✭ 2,899 (+212.06%)
Spark Jupyter AwsA guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-72.12%)
GimelBig Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-76.75%)
connected-componentMap Reduce Implementation of Connected Component on Apache Spark
Stars: ✭ 68 (-92.68%)
Docker Spark ClusterA simple spark standalone cluster for your testing environment purposses
Stars: ✭ 261 (-71.91%)
Spark NotebookInteractive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+231.65%)
TedsdsApache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-98.49%)
Every Single Day I TldrA daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (-73.2%)
BallistaDistributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+144.78%)
GeniA Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-83.64%)
SparkmeasureThis is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Stars: ✭ 368 (-60.39%)
pyspark-algorithmsPySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (-92.25%)
lectures-hse-sparkМасштабируемое машинное обучение и анализ больших данных с Apache Spark
Stars: ✭ 20 (-97.85%)
learning-hadoop-and-sparkCompanion to Learning Hadoop and Learning Spark courses on Linked In Learning
Stars: ✭ 146 (-84.28%)
Agile data code 2Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-55.54%)
spark-utilsBasic framework utilities to quickly start writing production ready Apache Spark applications
Stars: ✭ 25 (-97.31%)
big dataA collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-96.34%)
yuzhouwanCode Library for My Blog
Stars: ✭ 39 (-95.8%)
VaexOut-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀
Stars: ✭ 6,793 (+631.22%)
Big Data Rosetta CodeCode snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (-72.66%)
SparkflowEasy to use library to bring Tensorflow on Apache Spark
Stars: ✭ 282 (-69.64%)
isarn-sketches-sparkRoutines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-96.99%)
SidekickHigh Performance HTTP Sidecar Load Balancer
Stars: ✭ 366 (-60.6%)
WirbelsturmWirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (-64.26%)
SparkleHaskell on Apache Spark.
Stars: ✭ 419 (-54.9%)
God Of Bigdata专注大数据学习面试,大数据成神之路开启。Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+546.72%)
Bigdataguide大数据学习,从零开始学习大数据,包含大数据学习各阶段学习视频、面试资料
Stars: ✭ 817 (-12.06%)
IoDataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
Stars: ✭ 427 (-54.04%)
LearningsparkScala examples for learning to use Spark
Stars: ✭ 421 (-54.68%)
Bigdataie大数据博客、笔试题、教程、项目、面经的整理
Stars: ✭ 445 (-52.1%)
Learningsparkv2This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (-66.95%)
Bdp Dataplatform大数据生态解决方案数据平台:基于大数据、数据平台、微服务、机器学习、商城、自动化运维、DevOps、容器部署平台、数据平台采集、数据平台存储、数据平台计算、数据平台开发、数据平台应用搭建的大数据解决方案。
Stars: ✭ 456 (-50.91%)
BigsliceA serverless cluster computing system for the Go programming language
Stars: ✭ 469 (-49.52%)
Spark TdaSparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.
Stars: ✭ 45 (-95.16%)