Quicksql: A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+2539.13%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (+82.61%)
Sqlcell: SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Python values as parameters and assign output data to Python variables while concurrently running Python code. And *much* more.
Stars: ✭ 145 (+110.14%)
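Alchemy: A web system built for Flink. Supports defining UDFs in the web UI and submitting SQL and JAR jobs; supports managing sources, sinks, and jobs; can manage Flink clusters on OpenShift.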
Stars: ✭ 264 (+282.61%)
Catena: Catena is a distributed database based on a blockchain, accessible using SQL.
Stars: ✭ 302 (+337.68%)
Pulsar Flink: Elastic data processing with Apache Pulsar and Apache Flink
Stars: ✭ 126 (+82.61%)
Chainquery: Chainquery parses and syncs the LBRY blockchain data into structured SQL
Stars: ✭ 2,497 (+3518.84%)
whyqd: data wrangling simplicity, complete audit transparency, and at speed
Stars: ✭ 16 (-76.81%)
Justenoughscalaforspark: A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+679.71%)
Sparkmonitor: Monitor Apache Spark from Jupyter Notebook
Stars: ✭ 154 (+123.19%)
Big Whale: Scheduling of offline jobs (Spark, Flink, etc.) and monitoring of real-time jobs
Stars: ✭ 163 (+136.23%)
Covenantsql: A decentralized, trusted, high performance, SQL database with blockchain features
Stars: ✭ 1,148 (+1563.77%)
Fiflow: flink-sql, a platform for running SQL and building data streams on Flink, based on Apache Flink 1.10.0
Stars: ✭ 100 (+44.93%)
Beakerx: Beaker Extensions for Jupyter Notebook
Stars: ✭ 2,594 (+3659.42%)
Blockapi: A general framework for blockchain analytics
Stars: ✭ 111 (+60.87%)
Spark Jupyter Aws: A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+275.36%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+117.39%)
God Of Bigdata: Focused on big data learning and interview preparation; the road to becoming a big data god starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+8607.25%)
Presto Ethereum: Presto Ethereum Connector -- SQL on Ethereum
Stars: ✭ 450 (+552.17%)
Scriptis: Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+908.7%)
Zeppelin: Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+7889.86%)
Sparkmagic: Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+1282.61%)
Vagrant Projects: Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-50.72%)
Ether sql: A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-40.58%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+2589.86%)
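Sparkstreaming: 💥 🚀 Wraps Spark Streaming to adjust the batch time dynamically (computation runs whenever data arrives); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.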
Stars: ✭ 179 (+159.42%)
Flinkstreamsql: Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
Stars: ✭ 1,682 (+2337.68%)
Parquet Index: Spark SQL index for Parquet tables
Stars: ✭ 109 (+57.97%)
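Flink Learning: flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink introduction, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale Flink production projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column 《大数据实时计算引擎 Flink 实战与性能优化》 (Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization) is welcome.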
Stars: ✭ 11,378 (+16389.86%)
Flink Sql Cookbook: The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.
Stars: ✭ 189 (+173.91%)
Xsql: Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (+155.07%)
Linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+3266.67%)
fastdata-cluster: Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-71.01%)
kuwala: Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+586.96%)
Cloudflow: Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+302.9%)
Featran: A Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (+508.7%)
Enterprise gateway: A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
Stars: ✭ 412 (+497.1%)
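Bdp Dataplatform: A data platform for big data ecosystem solutions: a big data solution built on big data, data platform, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platform, data platform collection, data platform storage, data platform computing, data platform development, and data platform application building.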
Stars: ✭ 456 (+560.87%)
Kyuubi: Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+426.09%)
Curriculum: 👩🏫 👨🏫 The open-source curriculum of Enki!
Stars: ✭ 624 (+804.35%)
Datafusion: DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+785.51%)
Bigdataguide: Big data learning from scratch, including learning videos for each stage of big data study and interview materials
Stars: ✭ 817 (+1084.06%)
Metorikku: A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+423.19%)
Spark: Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+45723.19%)
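Bigdata Interview: 🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered online, together with the author's own answer summaries. Currently covers interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks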
Stars: ✭ 857 (+1142.03%)
Repository: A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+33.33%)
Almond: A Scala kernel for Jupyter
Stars: ✭ 1,354 (+1862.32%)
Sylph: Stream computing platform for bigdata
Stars: ✭ 362 (+424.64%)
Pulsar Spark: When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-20.29%)
Athenax: SQL-based streaming analytics platform at scale
Stars: ✭ 1,178 (+1607.25%)