80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+121.86%)

Mutual labels: spark

Spark Tdd Example

A simple Spark TDD example

Stars: ✭ 23 (-87.43%)

Mutual labels: spark

Digitrecognizer

Java Convolutional Neural Network example for Hand Writing Digit Recognition

Stars: ✭ 23 (-87.43%)

Mutual labels: spark

Uproot4

ROOT I/O in pure Python and NumPy.

Stars: ✭ 80 (-56.28%)

Mutual labels: bigdata

Redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Stars: ✭ 20,147 (+10909.29%)

Mutual labels: spark

Xsql

Unified SQL Analytics Engine Based on SparkSQL

Stars: ✭ 176 (-3.83%)

Mutual labels: spark

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+1983.61%)

Mutual labels: spark

Docker Spark

🚢 Docker image for Apache Spark

Stars: ✭ 78 (-57.38%)

Mutual labels: spark

Wedatasphere

WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (+103.28%)

Mutual labels: spark

Scala Samples

Pieces of Scala code that explain Scala syntax and related topics, such as what you can do with it

Stars: ✭ 125 (-31.69%)

Mutual labels: spark

Spark Website

Apache Spark Website

Stars: ✭ 75 (-59.02%)

Mutual labels: spark

Athenacli

AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.

Stars: ✭ 151 (-17.49%)

Mutual labels: bigdata

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (+400.55%)

Mutual labels: spark

Sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Stars: ✭ 362 (+97.81%)

Mutual labels: spark

Ds Cheatsheets

List of Data Science Cheatsheets to rule the world

Stars: ✭ 9,452 (+5065.03%)

Mutual labels: spark

Sparkstreaming

Real-time log analysis and statistics with Spark Streaming + Flume + Kafka + HBase + Hadoop + Zookeeper; data visualization with SpringBoot + Echarts

Stars: ✭ 349 (+90.71%)

Mutual labels: spark

Spark Alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive

Stars: ✭ 122 (-33.33%)

Mutual labels: spark

Oap

Optimized Analytics Package for Spark* Platform

Stars: ✭ 343 (+87.43%)

Mutual labels: spark

Javainterview

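The most complete collection of Java technical knowledge points, plus Java source code analysis. Doing my part to contribute to open source.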

Stars: ✭ 154 (-15.85%)

Mutual labels: bigdata

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-27.87%)

Mutual labels: spark

Spark Terasort

Stars: ✭ 101 (-44.81%)

Mutual labels: spark

10 Weeks

10 weeks of technology exploration

Stars: ✭ 22 (-87.98%)

Mutual labels: bigdata

Scalnet

A Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs

Stars: ✭ 342 (+86.89%)

Mutual labels: spark

Linkis

Linkis helps you easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.

Stars: ✭ 2,323 (+1169.4%)

Mutual labels: spark

Api.rss

RSS as RESTful. This service allows you to transform an RSS feed into an awesome API.

Stars: ✭ 340 (+85.79%)

Mutual labels: bigdata

Spark Twitter Stream Example

"Sentiment analysis" on a live Twitter feed with Apache Spark and Apache Bahir

Stars: ✭ 73 (-60.11%)

Mutual labels: spark

Wirbelsturm

Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.

Stars: ✭ 332 (+81.42%)

Mutual labels: spark

Zparkio

Boilerplate framework to use Spark and ZIO together.

Stars: ✭ 121 (-33.88%)

Mutual labels: spark

Sparklint

A tool for monitoring and tuning Spark jobs for efficiency.

Stars: ✭ 316 (+72.68%)

Mutual labels: spark

Kamu Cli

Next generation tool for decentralized exchange and transformation of semi-structured data

Stars: ✭ 69 (-62.3%)

Mutual labels: spark

Clickhouse Native Jdbc

ClickHouse Native Protocol JDBC implementation

Stars: ✭ 310 (+69.4%)

Mutual labels: spark

Avro

Apache Avro is a data serialization system.

Stars: ✭ 2,005 (+995.63%)

Mutual labels: bigdata

Uproot3

ROOT I/O in pure Python and NumPy.

Stars: ✭ 312 (+70.49%)

Mutual labels: bigdata

Crayon

Simple framework agnostic UI router for SPAs

Stars: ✭ 310 (+69.4%)

Mutual labels: spark

Example Spark Kafka

Apache Spark and Apache Kafka integration example

Stars: ✭ 120 (-34.43%)

Mutual labels: spark

Usersessionbehaviorofflineanalysis

Sichuan University and 拓思爱诺 offline analysis project for user session behavior data

Stars: ✭ 69 (-62.3%)

Mutual labels: spark

Awesome Ada

A curated list of awesome resources related to the Ada and SPARK programming languages

Stars: ✭ 299 (+63.39%)

Mutual labels: spark

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+1038.8%)

Mutual labels: spark

Spark Hbase Connector

Connect Spark to HBase for reading and writing data with ease

Stars: ✭ 299 (+63.39%)

Mutual labels: spark

Kontextfrei

Writing application logic for Spark jobs that can be unit-tested without a SparkContext

Stars: ✭ 67 (-63.39%)

Mutual labels: spark

Spark Druid Olap

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Stars: ✭ 282 (+54.1%)

Mutual labels: spark

Kinesis Sql

Kinesis Connector for Structured Streaming

Stars: ✭ 120 (-34.43%)

Mutual labels: spark

Janusgraph.cn

Chinese community for the distributed graph database JanusGraph; everything about JanusGraph

Stars: ✭ 273 (+49.18%)

Mutual labels: bigdata

Rsparkling

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)