683 open source projects that are alternatives to or similar to Dpark

Scalable Data Science Platform

Content for architecting a data science platform for products using Luigi, Spark & Flask.

Stars: ✭ 158 (-94.08%)

Mutual labels: spark

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+1085.08%)

Mutual labels: spark

Cube.js

📊 Cube — Open-Source Analytics API for Building Data Apps

Stars: ✭ 11,983 (+349.14%)

Mutual labels: spark

Aws Auto Terminate Idle Emr

The AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and AWS Lambda, with a Python script built on Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.

Stars: ✭ 21 (-99.21%)

Mutual labels: bigdata

Flink Sql Cookbook

The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.

Stars: ✭ 189 (-92.92%)

Mutual labels: stream-processing

Spark Streaming Monitoring With Lightning

Plot live stats as a graph from an Apache Spark application using Lightning-viz

Stars: ✭ 15 (-99.44%)

Mutual labels: bigdata

Spark Lucenerdd

Spark RDD with Lucene's query and entity linkage capabilities

Stars: ✭ 114 (-95.73%)

Mutual labels: spark

Flint

A Time Series Library for Apache Spark

Stars: ✭ 878 (-67.09%)

Mutual labels: spark

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (-94.3%)

Mutual labels: spark

Live log analyzer spark

Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article.

Stars: ✭ 14 (-99.48%)

Mutual labels: spark

Spark Mllib Twitter Sentiment Analysis

🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib

Stars: ✭ 113 (-95.76%)

Mutual labels: spark

Sparkling Titanic

Training models with Apache Spark, PySpark for Titanic Kaggle competition

Stars: ✭ 12 (-99.55%)

Mutual labels: spark

Watermill

Building event-driven applications the easy way in Go.

Stars: ✭ 3,504 (+31.33%)

Mutual labels: stream-processing

Liteflow

Liteflow is a distributed task-flow scheduling system built on task versions

Stars: ✭ 112 (-95.8%)

Mutual labels: bigdata

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-99.59%)

Mutual labels: spark

Nmflibrary

MATLAB library for non-negative matrix factorization (NMF): Version 1.8.1

Stars: ✭ 153 (-94.27%)

Mutual labels: bigdata

Genie

Distributed Big Data Orchestration Service

Stars: ✭ 1,544 (-42.13%)

Mutual labels: bigdata

Hazelcast Jet

Distributed Stream and Batch Processing

Stars: ✭ 855 (-67.95%)

Mutual labels: stream-processing

Scanns

A scalable nearest neighbor search library in Apache Spark

Stars: ✭ 190 (-92.88%)

Mutual labels: spark

Dockerfiles

50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (-68.25%)

Mutual labels: spark

Elephas

Distributed Deep learning with Keras & Spark

Stars: ✭ 1,521 (-42.99%)

Mutual labels: spark

Javainterview

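The most complete collection of Java technical knowledge points, plus Java source code analysis. Doing my part to contribute to open source.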

Stars: ✭ 154 (-94.23%)

Mutual labels: bigdata

Horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Stars: ✭ 11,943 (+347.64%)

Mutual labels: spark

Ds Cheatsheets

List of Data Science Cheatsheets to rule the world

Stars: ✭ 9,452 (+254.27%)

Mutual labels: spark

Avro Hadoop Starter

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.

Stars: ✭ 110 (-95.88%)

Mutual labels: mapreduce

Redis Stream Demo

Demo for Redis Streams

Stars: ✭ 24 (-99.1%)

Mutual labels: stream-processing

Sagemaker Spark

A Spark library for Amazon SageMaker.

Stars: ✭ 219 (-91.79%)

Mutual labels: spark

Digitrecognizer

Java Convolutional Neural Network example for Hand Writing Digit Recognition

Stars: ✭ 23 (-99.14%)

Mutual labels: spark

Java learning practice

The road to advanced Java: frequently asked interview algorithms, Akka, multithreading, NIO, Netty, SpringBoot, Spark && Flink, and more

Stars: ✭ 110 (-95.88%)

Mutual labels: spark

10 Weeks

10 weeks of technology exploration

Stars: ✭ 22 (-99.18%)

Mutual labels: bigdata

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (-94.27%)

Mutual labels: spark

Spark.fish

▁▂▄▆▇█▇▆▄▂▁

Stars: ✭ 229 (-91.42%)

Mutual labels: spark

Shifu

An end-to-end machine learning and data mining framework on Hadoop

Stars: ✭ 207 (-92.24%)

Mutual labels: bigdata

Log Anomaly Detector

Log Anomaly Detection - Machine learning to detect abnormal events logs

Stars: ✭ 169 (-93.67%)

Mutual labels: stream-processing

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-95.05%)

Mutual labels: spark

Dataspherestudio

DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

Stars: ✭ 1,195 (-55.21%)

Mutual labels: spark

Books

Technical books and more

Stars: ✭ 110 (-95.88%)

Mutual labels: bigdata

Azuredatabricksbestpractices

Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs

Stars: ✭ 186 (-93.03%)

Mutual labels: spark

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (-69.04%)

Mutual labels: spark

Flinkstreamsql

Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax

Stars: ✭ 1,682 (-36.96%)

Mutual labels: bigdata

Powderkeg

Live-coding the cluster!

Stars: ✭ 152 (-94.3%)

Mutual labels: spark

Goodreads etl pipeline

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Stars: ✭ 793 (-70.28%)

Mutual labels: spark

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-95.91%)

Mutual labels: spark

Simple It English

Simple-IT-English: smart wordbook from community for community

Stars: ✭ 233 (-91.27%)

Mutual labels: bigdata

Go Kafka Example

Golang Kafka consumer and producer example

Stars: ✭ 108 (-95.95%)

Mutual labels: stream-processing

Lpa Detector

Optimize and improve the Label propagation algorithm

Stars: ✭ 75 (-97.19%)

Mutual labels: spark

Twitwork

Monitor twitter stream

Stars: ✭ 133 (-95.01%)

Mutual labels: bigdata

Siddhi

Stream Processing and Complex Event Processing Engine

Stars: ✭ 1,185 (-55.58%)

Mutual labels: stream-processing

Sparkctr

CTR prediction models based on Spark (LR, GBDT, DNN)

Stars: ✭ 740 (-72.26%)

Mutual labels: spark

Pyspark Cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

Stars: ✭ 108 (-95.95%)

Mutual labels: spark

Kafka Storm Starter

Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.

Stars: ✭ 728 (-72.71%)

Mutual labels: spark

Roaringbitmap

A better compressed bitset in Java

Stars: ✭ 2,460 (-7.8%)

Mutual labels: spark

Spark Twitter Stream Example

"Sentiment analysis" on a live Twitter feed with Apache Spark and Apache Bahir

Stars: ✭ 73 (-97.26%)

Mutual labels: spark

Spark Structured Streaming Examples

Spark Structured Streaming / Kafka / Cassandra / Elastic