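Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and other topics, plus case studies of large-scale production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization".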

Stars: ✭ 11,378 (+7640.14%)

Mutual labels: spark

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-59.18%)

Mutual labels: spark

Wedatasphere

WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (+153.06%)

Mutual labels: spark

Ammonite Spark

Run Spark calculations from Ammonite

Stars: ✭ 88 (-40.14%)

Mutual labels: spark

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-35.37%)

Mutual labels: spark

Mare

MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.

Stars: ✭ 11 (-92.52%)

Mutual labels: spark

Kinesis Sql

Kinesis Connector for Structured Streaming

Stars: ✭ 120 (-18.37%)

Mutual labels: spark

swordfish

Open-source distributed workflow scheduling tool that also supports streaming tasks.

Stars: ✭ 35 (-76.19%)

Mutual labels: spark

Bigdata Interview

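🎯 🌟 [Big data interview questions] A collection of big-data interview questions I gathered online, together with my own summarized answers. Currently covers interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.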

Stars: ✭ 857 (+482.99%)

Mutual labels: spark

Spark-Ar

Resources for Spark AR

Stars: ✭ 43 (-70.75%)

Mutual labels: spark

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-86.39%)

Mutual labels: spark

Tiledb Vcf

Efficient variant-call data storage and retrieval library using the TileDB storage library.

Stars: ✭ 26 (-82.31%)

Mutual labels: spark

spark-stringmetric

Spark functions to run popular phonetic and string matching algorithms

Stars: ✭ 51 (-65.31%)

Mutual labels: spark

Ecommercerecommendsystem

Real-time big-data product recommendation system. Front end: Vue + TypeScript + ElementUI; back end: Spring + Spark

Stars: ✭ 139 (-5.44%)

Mutual labels: spark

pyspark-cheatsheet

PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster

Stars: ✭ 115 (-21.77%)

Mutual labels: pyspark

Mobius

C# and F# language binding and extensions to Apache Spark

Stars: ✭ 929 (+531.97%)

Mutual labels: spark

dlsa

Distributed least squares approximation (dlsa) implemented with Apache Spark

Stars: ✭ 25 (-82.99%)

Mutual labels: pyspark

Cuesheet

A framework for writing Spark 2.x applications in a pretty way

Stars: ✭ 86 (-41.5%)

Mutual labels: spark

lineage

Generate beautiful documentation for your data pipelines in markdown format

Stars: ✭ 16 (-89.12%)

Mutual labels: pyspark

Pyspark Setup Demo

Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks

Stars: ✭ 24 (-83.67%)

Mutual labels: pyspark

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (+1008.84%)

Mutual labels: spark

Spark Structured Streaming Book

The Internals of Spark Structured Streaming

Stars: ✭ 371 (+152.38%)

Mutual labels: spark

Data Science Cookbook

🎓 Jupyter notebooks from UFC data science course

Stars: ✭ 60 (-59.18%)

Mutual labels: spark

Sparkmeasure

This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.

Stars: ✭ 368 (+150.34%)

Mutual labels: spark

Sidekick

High Performance HTTP Sidecar Load Balancer

Stars: ✭ 366 (+148.98%)

Mutual labels: spark

Seldon Server

Machine Learning Platform and Recommendation Engine built on Kubernetes

Stars: ✭ 1,435 (+876.19%)

Mutual labels: spark

Zemberek Nlp Server

REST Docker server built on the Zemberek Turkish NLP Java library

Stars: ✭ 60 (-59.18%)

Mutual labels: spark

Kyuubi

Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark

Stars: ✭ 363 (+146.94%)

Mutual labels: spark

Metorikku

A simplified, lightweight ETL Framework based on Apache Spark

Stars: ✭ 361 (+145.58%)

Mutual labels: spark

Petastorm

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.

Stars: ✭ 1,108 (+653.74%)

Mutual labels: pyspark
