Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience to help with the creation, editing, and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (-41.33%)

Mutual labels: spark, spark-streaming

Clickhouse Native Jdbc

ClickHouse Native Protocol JDBC implementation

Stars: ✭ 310 (-26.37%)

Mutual labels: spark

Sparkmeasure

This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.

Stars: ✭ 368 (-12.59%)

Mutual labels: spark

Crayon

Simple framework agnostic UI router for SPAs

Stars: ✭ 310 (-26.37%)

Mutual labels: spark

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-27.32%)

Mutual labels: spark

Big data architect skills

Skills a big data architect should master

Stars: ✭ 400 (-4.99%)

Mutual labels: spark

Kyuubi

Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark

Stars: ✭ 363 (-13.78%)

Mutual labels: spark

Awesome Ada

A curated list of awesome resources related to the Ada and SPARK programming languages

Stars: ✭ 299 (-28.98%)

Mutual labels: spark

Spark Structured Streaming Book

The Internals of Spark Structured Streaming

Stars: ✭ 371 (-11.88%)

Mutual labels: spark

Learningsparkv2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Stars: ✭ 307 (-27.08%)

Mutual labels: spark

Tutorial

A summary of the Java full-stack knowledge system

Stars: ✭ 407 (-3.33%)

Mutual labels: spark

Delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Stars: ✭ 3,903 (+827.08%)

Mutual labels: spark

Sidekick

High Performance HTTP Sidecar Load Balancer

Stars: ✭ 366 (-13.06%)

Mutual labels: spark

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (-28.03%)

Mutual labels: spark

Enterprise gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.

Stars: ✭ 412 (-2.14%)

Mutual labels: spark

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-29.22%)

Mutual labels: spark

Metorikku

A simplified, lightweight ETL Framework based on Apache Spark

Stars: ✭ 361 (-14.25%)

Mutual labels: spark

Spark Hbase Connector

Connect Spark to HBase for reading and writing data with ease

Stars: ✭ 299 (-28.98%)

Mutual labels: spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.

Stars: ✭ 3,081 (+631.83%)

Mutual labels: spark

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (-6.65%)

Mutual labels: spark

Sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Stars: ✭ 362 (-14.01%)

Mutual labels: spark

Spark Druid Olap

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Stars: ✭ 282 (-33.02%)

Mutual labels: spark

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (-33.97%)

Mutual labels: spark

Sylph

Stream computing platform for big data

Stars: ✭ 362 (-14.01%)

Mutual labels: spark-streaming

Hbase Rdd

Spark RDD to read, write and delete from HBase