LabsResearch on distributed systems
Stars: ✭ 73 (-37.07%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready-to-use processors, data sources and sinks is available.
Stars: ✭ 97 (-16.38%)
Luigi WarehouseA Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (-37.93%)
Spring Shiro SparkSpring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...
Stars: ✭ 114 (-1.72%)
Rumble⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-50%)
Parquet IndexSpark SQL index for Parquet tables
Stars: ✭ 109 (-6.03%)
Ammonite SparkRun spark calculations from Ammonite
Stars: ✭ 88 (-24.14%)
RsparklingRSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-43.97%)
RepositoryA personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-20.69%)
ElassandraElassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+1287.93%)
Big Data🔧 Use dplyr to analyze Big Data 🐘
Stars: ✭ 93 (-19.83%)
RoffildlibraryLibrary for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-45.69%)
IbisA pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+1305.17%)
Python BigdataData science and Big Data with Python
Stars: ✭ 112 (-3.45%)
Seldon ServerMachine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+1137.07%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-52.59%)
Docker HadoopA Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-53.45%)
WaimakWaimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-48.28%)
Xlearning Xdmlextremely distributed machine learning
Stars: ✭ 113 (-2.59%)
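Flink Learningflink learning blog. http://www.54tianzhisheng.cn/ Covers Flink introduction, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and other topics, along with case studies of large projects that put Flink into production (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》) is welcome.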
Stars: ✭ 11,378 (+9708.62%)
Awesome PulsarA curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-50.86%)
Spark Nlp ModelsModels and Pipelines for the Spark NLP library
Stars: ✭ 88 (-24.14%)
TeddySpark Streaming monitoring platform, supporting job deployment, alerting, and automatic startup
Stars: ✭ 120 (+3.45%)
Utils4sA collection of test cases and related materials gathered while using Scala and Spark
Stars: ✭ 1,070 (+822.41%)
Spark Submit UiA Play Framework-based tool for submitting Spark apps
Stars: ✭ 53 (-54.31%)
Awesome SparkA curated list of awesome Apache Spark packages and resources.
Stars: ✭ 1,061 (+814.66%)
LogigskA Linux-based software package to control LEDs on Logitech G910, G810, G610 and G410.
Stars: ✭ 107 (-7.76%)
CuesheetA framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-25.86%)
Spark NkpNatural Korean Processor for Apache Spark
Stars: ✭ 50 (-56.9%)
FlintWebex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-26.72%)
Awesome Recommendation EngineThe purpose of this tiny project is to put together the know-how I learned from the Big Data Expert course at formacionhadoop.com. The idea is to show how to work with Apache Spark Streaming, Kafka, Mongo, and Spark machine learning algorithms.
Stars: ✭ 47 (-59.48%)
ArchivesparkAn Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
Stars: ✭ 111 (-4.31%)
Spark On K8s OperatorKubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+1434.48%)
Hops ExamplesExamples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-27.59%)
Spark TdaSparkTDA is a package for Apache Spark providing Topological Data Analysis functionality.
Stars: ✭ 45 (-61.21%)
Spark StatesCustom state store providers for Apache Spark
Stars: ✭ 83 (-28.45%)
Delta ArchitectureStreaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Stars: ✭ 43 (-62.93%)
SparktutorialSource code for James Lee's Apache Spark with Java course
Stars: ✭ 105 (-9.48%)
Hadoop cookbookCookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-29.31%)
GatkOfficial code repository for GATK versions 4 and up
Stars: ✭ 1,002 (+763.79%)
PixiedustPython Helper library for Jupyter Notebooks
Stars: ✭ 998 (+760.34%)
Cube.js📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+10230.17%)
ElephasDistributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+1211.21%)
SplashSplash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-9.48%)