50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (+1440%)

Mutual labels: spark

Learning Spark

Learn Spark from scratch; big data learning

Stars: ✭ 37 (-32.73%)

Mutual labels: spark

Spark Swagger

Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)

Stars: ✭ 25 (-54.55%)

Mutual labels: spark

Chronicler

Scala toolchain for InfluxDB

Stars: ✭ 24 (-56.36%)

Mutual labels: spark

Play Spark Scala

Stars: ✭ 51 (-7.27%)

Mutual labels: spark

Delta Architecture

Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline

Stars: ✭ 43 (-21.82%)

Mutual labels: spark

Vagrant Projects

Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR

Stars: ✭ 34 (-38.18%)

Mutual labels: spark

Digitrecognizer

Java Convolutional Neural Network example for Handwritten Digit Recognition

Stars: ✭ 23 (-58.18%)

Mutual labels: spark

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (+1565.45%)

Mutual labels: spark

Spark Flamegraph

Easy CPU Profiling for Apache Spark applications

Stars: ✭ 30 (-45.45%)

Mutual labels: spark

Foxcross

AsyncIO serving for data science models

Stars: ✭ 18 (-67.27%)

Mutual labels: dataframe

Flint

A Time Series Library for Apache Spark

Stars: ✭ 878 (+1496.36%)

Mutual labels: spark

Data Ingestion Platform

Stars: ✭ 39 (-29.09%)

Mutual labels: spark

Live log analyzer spark

Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article.

Stars: ✭ 14 (-74.55%)

Mutual labels: spark

Awesome Recommendation Engine

The purpose of this tiny project is to put together the know-how I learned from the Big Data Expert course at formacionhadoop.com. The idea is to show how to work with Apache Spark Streaming, Kafka, Mongo, and Spark machine learning algorithms.

Stars: ✭ 47 (-14.55%)

Mutual labels: spark

Sparkling Titanic

Training models with Apache Spark (PySpark) for the Titanic Kaggle competition

Stars: ✭ 12 (-78.18%)

Mutual labels: spark

Optimus

🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark

Stars: ✭ 986 (+1692.73%)

Mutual labels: spark

Mare

MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.

Stars: ✭ 11 (-80%)

Mutual labels: spark

Spark Submit Ui

A Play Framework based tool for submitting Spark applications

Stars: ✭ 53 (-3.64%)

Mutual labels: spark

Bigdata Interview

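🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.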

Stars: ✭ 857 (+1458.18%)

Mutual labels: spark

Weblogsanalysissystem

A big data platform for analyzing web access logs

Stars: ✭ 37 (-32.73%)

Mutual labels: spark

Tiledb Vcf

Efficient variant-call data storage and retrieval library using the TileDB storage library.

Stars: ✭ 26 (-52.73%)

Mutual labels: spark

Spark Tda

SparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.

Stars: ✭ 45 (-18.18%)

Mutual labels: spark

Pandas Ta

Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 130+ Indicators

Stars: ✭ 962 (+1649.09%)

Mutual labels: dataframe

Spark Tdd Example

A simple Spark TDD example

Stars: ✭ 23 (-58.18%)

Mutual labels: spark

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (-1.82%)

Mutual labels: spark

Boltzmannclean

Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines

Stars: ✭ 23 (-58.18%)

Mutual labels: dataframe

Spark Summit East 2017

Stars: ✭ 33 (-40%)

Mutual labels: spark

Spark Scala Tutorial

A free tutorial for Apache Spark.

Stars: ✭ 907 (+1549.09%)

Mutual labels: spark

Spark Examples

Spark examples

Stars: ✭ 41 (-25.45%)

Mutual labels: spark

Sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Stars: ✭ 954 (+1634.55%)

Mutual labels: spark

Yandex Big Data Engineering

Stars: ✭ 17 (-69.09%)

Mutual labels: spark

Parquet Generator

Parquet file generator

Stars: ✭ 16 (-70.91%)

Mutual labels: spark

Big Data Scala Spark

Coursera's big data course with Scala and Spark

Stars: ✭ 16 (-70.91%)

Mutual labels: spark

Spark Nkp

Natural Korean Processor for Apache Spark

Stars: ✭ 50 (-9.09%)

Mutual labels: spark

Gatk

Official code repository for GATK versions 4 and up

Stars: ✭ 1,002 (+1721.82%)

Mutual labels: spark

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-47.27%)

Mutual labels: spark

Sparkling Water

Sparkling Water provides H2O functionality inside Spark cluster

Stars: ✭ 887 (+1512.73%)

Mutual labels: spark

Dataframe

C++ DataFrame for statistical, financial, and ML analysis, written in modern C++ using native types and continuous memory storage, with no pointers involved

Stars: ✭ 828 (+1405.45%)

Mutual labels: dataframe

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (+1625.45%)

Mutual labels: spark

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (+1401.82%)

Mutual labels: spark

Bigdataguide

Big data learning: learn big data from scratch, including study videos for each stage of learning and interview materials