50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (+1288.52%)

Mutual labels: spark

Optimus

🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark

Stars: ✭ 986 (+1516.39%)

Mutual labels: spark

Silex

Silex is a static website builder in the cloud.

Stars: ✭ 958 (+1470.49%)

Mutual labels: silex

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (+1401.64%)

Mutual labels: spark

Awesome Recommendation Engine

This small project pulls together what I learned in the Big Data Expert course from formacionhadoop.com. The idea is to show how to work with Apache Spark Streaming, Kafka, MongoDB, and Spark machine learning algorithms.

Stars: ✭ 47 (-22.95%)

Mutual labels: spark

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (+1455.74%)

Mutual labels: spark

Net.jgp.labs.spark

Apache Spark examples exclusively in Java

Stars: ✭ 55 (-9.84%)

Mutual labels: spark

Flint

A Time Series Library for Apache Spark

Stars: ✭ 878 (+1339.34%)

Mutual labels: spark

Spark Examples

Spark examples

Stars: ✭ 41 (-32.79%)

Mutual labels: spark

Sparkling Titanic

Training models with Apache Spark, PySpark for Titanic Kaggle competition

Stars: ✭ 12 (-80.33%)

Mutual labels: spark

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-4.92%)

Mutual labels: spark

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-81.97%)

Mutual labels: spark

Data Ingestion Platform

Stars: ✭ 39 (-36.07%)

Mutual labels: spark

Spark Swagger

Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)

Stars: ✭ 25 (-59.02%)

Mutual labels: spark

Spark Submit Ui

A Play Framework-based application for submitting Spark apps

Stars: ✭ 53 (-13.11%)

Mutual labels: spark

Spark Tdd Example

A simple Spark TDD example

Stars: ✭ 23 (-62.3%)

Mutual labels: spark

Weblogsanalysissystem

A big data platform for analyzing web access logs

Stars: ✭ 37 (-39.34%)

Mutual labels: spark

Spark Summit East 2017

Stars: ✭ 33 (-45.9%)

Mutual labels: spark

Spark Scala Tutorial

A free tutorial for Apache Spark.

Stars: ✭ 907 (+1386.89%)

Mutual labels: spark

Apache Spark Internals

The Internals of Apache Spark

Stars: ✭ 1,045 (+1613.11%)

Mutual labels: spark

Spark Flamegraph

Easy CPU Profiling for Apache Spark applications

Stars: ✭ 30 (-50.82%)

Mutual labels: spark

Awesome Pulsar

A curated list of Pulsar tools, integrations and resources.

Stars: ✭ 57 (-6.56%)

Mutual labels: spark

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-52.46%)

Mutual labels: spark

Spark As Service Using Embedded Server

This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server

Stars: ✭ 46 (-24.59%)

Mutual labels: spark

Heracles

High performance HBase / Spark SQL engine

Stars: ✭ 27 (-55.74%)

Mutual labels: spark

Pyspark Examples

Code examples for Apache Spark using Python

Stars: ✭ 58 (-4.92%)

Mutual labels: spark

Interview Questions Collection

Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more

Stars: ✭ 21 (-65.57%)

Mutual labels: spark

Delta Architecture

Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline

Stars: ✭ 43 (-29.51%)

Mutual labels: spark

Tedsds

Turbofan Engine Degradation Simulation Data Set example in Apache Spark

Stars: ✭ 14 (-77.05%)

Mutual labels: spark

Pulsar Spark

When Apache Pulsar meets Apache Spark

Stars: ✭ 55 (-9.84%)

Mutual labels: spark

Urhox

Urho3D extension library

Stars: ✭ 13 (-78.69%)

Mutual labels: spark

Gatk

Official code repository for GATK versions 4 and up

Stars: ✭ 1,002 (+1542.62%)

Mutual labels: spark

Mlfeature

Feature engineering toolkit for Spark MLlib.

Stars: ✭ 12 (-80.33%)

Mutual labels: spark

Data Science Cookbook

🎓 Jupyter notebooks from UFC data science course

Stars: ✭ 60 (-1.64%)

Mutual labels: spark

Mare

MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.

Stars: ✭ 11 (-81.97%)

Mutual labels: spark

Pixiedust

Python Helper library for Jupyter Notebooks

Stars: ✭ 998 (+1536.07%)

Mutual labels: spark

Bigdata Interview

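🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, together with the author's own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.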

Stars: ✭ 857 (+1304.92%)

Mutual labels: spark

Utils4s

A collection of test cases and related reference material gathered while using Scala and Spark

Stars: ✭ 1,070 (+1654.1%)

Mutual labels: spark

Tiledb Vcf

Efficient variant-call data storage and retrieval library using the TileDB storage library.

Stars: ✭ 26 (-57.38%)

Mutual labels: spark

Snappydata

Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster

Stars: ✭ 995 (+1531.15%)

Mutual labels: spark

Mobius

C# and F# language binding and extensions to Apache Spark

Stars: ✭ 929 (+1422.95%)

Mutual labels: spark

Docker Spark Cluster

A Spark cluster setup running on Docker containers

Stars: ✭ 57 (-6.56%)

Mutual labels: spark

Cache Service Provider

A Cache Service Provider for Silex, using the doctrine/cache package

Stars: ✭ 23 (-62.3%)

Mutual labels: silex

Real Time Stream Processing Engine

This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.

Stars: ✭ 37 (-39.34%)

Mutual labels: spark

Digitrecognizer

Java Convolutional Neural Network example for Hand Writing Digit Recognition

Stars: ✭ 23 (-62.3%)

Mutual labels: spark

Play Spark Scala

Stars: ✭ 51 (-16.39%)

Mutual labels: spark

Learning Spark

Learning Spark from scratch; big data learning

Stars: ✭ 37 (-39.34%)

Mutual labels: spark

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-1.64%)

Mutual labels: spark

Zemberek Nlp Server

REST Docker server built on the Zemberek Turkish NLP Java library

Stars: ✭ 60 (-1.64%)

Mutual labels: spark

Model Serving Tutorial

Code and presentation for Strata Model Serving tutorial

Stars: ✭ 57 (-6.56%)

Mutual labels: spark

Spark Nkp

Natural Korean Processor for Apache Spark

Stars: ✭ 50 (-18.03%)

Mutual labels: spark

Vagrant Projects

Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR

Stars: ✭ 34 (-44.26%)

Mutual labels: spark

1-60 of 423 similar projects
