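Programming e-books and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories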

Stars: ✭ 12,009 (+59945%)

Mutual labels: spark, hadoop

Bdp Dataplatform

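Data platform for big data ecosystem solutions: a big data solution built on big data, data platforms, microservices, machine learning, an online store, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.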

Stars: ✭ 456 (+2180%)

Mutual labels: spark, flink

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+27465%)

Mutual labels: spark, flink

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+110140%)

Mutual labels: spark, hadoop

Freestyle

A cohesive & pragmatic framework of FP-centric Scala libraries

Stars: ✭ 627 (+3035%)

Mutual labels: spark, cassandra

Spark Cassandra Connector

DataStax Spark Cassandra Connector

Stars: ✭ 1,816 (+8980%)

Mutual labels: spark, cassandra

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (+465%)

Mutual labels: spark, hadoop

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (+560%)

Mutual labels: spark, hadoop

Quill

Compile-time Language Integrated Queries for Scala

Stars: ✭ 1,998 (+9890%)

Mutual labels: spark, cassandra

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (+4645%)

Mutual labels: spark, hadoop

Interview Questions Collection

Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more

Stars: ✭ 21 (+5%)

Mutual labels: spark, hadoop

Pulsar Spark

When Apache Pulsar meets Apache Spark

Stars: ✭ 55 (+175%)

Mutual labels: spark, flink

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (+170%)

Mutual labels: spark, hadoop

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (+190%)

Mutual labels: spark, hdfs

Javaorbigdata Interview

A collection of interview knowledge points for Java developers or big data developers

Stars: ✭ 203 (+915%)

Mutual labels: spark, hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (+975%)

Mutual labels: spark, hadoop

docker-swarm-vagrant

Getting started with Docker swarm

Stars: ✭ 20 (+0%)

Mutual labels: vagrant, cluster

Apache Spark Hands On

Educational notes and hands-on problems with solutions for the Hadoop ecosystem

Stars: ✭ 74 (+270%)

Mutual labels: spark, hadoop

Hops Examples

Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops

Stars: ✭ 84 (+320%)

Mutual labels: spark, flink

Kamu Cli

Next generation tool for decentralized exchange and transformation of semi-structured data

Stars: ✭ 69 (+245%)

Mutual labels: spark, flink

Kubernetes Vagrant Coreos Cluster

Kubernetes cluster (for testing purposes) made easy with Vagrant and CoreOS.

Stars: ✭ 598 (+2890%)

Mutual labels: vagrant, cluster

Wirbelsturm

Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.

Stars: ✭ 332 (+1560%)

Mutual labels: vagrant, spark

K8s Vagrant Multi Node

A Kubernetes Vagrant Multi node environment using kubeadm.

Stars: ✭ 141 (+605%)

Mutual labels: vagrant, cluster

Ansible Vagrant Examples

Ansible examples using Vagrant to deploy to local VMs.

Stars: ✭ 1,913 (+9465%)

Mutual labels: vagrant, vms

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (+1290%)

Mutual labels: spark, flink

Teddy

Spark Streaming monitoring platform with support for job deployment, alerting, and automatic restart

Stars: ✭ 120 (+500%)

Mutual labels: spark, yarn

Iot Traffic Monitor

Stars: ✭ 131 (+555%)

Mutual labels: spark, cassandra

Elassandra

Elassandra = Elasticsearch + Apache Cassandra

Stars: ✭ 1,610 (+7950%)

Mutual labels: spark, cassandra

Ecommercerecommendsystem

Real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark

Stars: ✭ 139 (+595%)

Mutual labels: spark, flink

Quicksql

A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

Stars: ✭ 1,821 (+9005%)

Mutual labels: spark, flink

nodejs-dev-vm

DEPRECATED Simple Node.js Development VM using Vagrant + VirtualBox + Ansible

Stars: ✭ 25 (+25%)

Mutual labels: vagrant, vms

Sparkstreaming

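💥 🚀 Wraps sparkstreaming to dynamically adjust batch time (computation runs whenever data is available); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.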

Stars: ✭ 179 (+795%)

Mutual labels: spark, flink

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…

Stars: ✭ 12,277 (+61285%)

Mutual labels: spark, hadoop

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (+980%)

Mutual labels: spark, cassandra

Spark Structured Streaming Examples

Spark Structured Streaming / Kafka / Cassandra / Elastic

Stars: ✭ 168 (+740%)

Mutual labels: spark, cassandra

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+1965%)

Mutual labels: vagrant, spark

dpkb

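A roundup of big data topics, including distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse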

Stars: ✭ 123 (+515%)

Mutual labels: hadoop, flink

HDFS-Netdisc

A distributed cloud storage system based on Hadoop 🌴

Stars: ✭ 56 (+180%)

Mutual labels: hadoop, hdfs

ansible-role-test-vms

DEPRECATED - A Vagrant configuration to test Ansible roles against a variety of Linux distributions.

Stars: ✭ 42 (+110%)

Mutual labels: vagrant, vms

yarn-prometheus-exporter

Export Hadoop YARN (resource-manager) metrics in prometheus format

Stars: ✭ 44 (+120%)

Mutual labels: yarn, hadoop

teraslice

Scalable data processing pipelines in JavaScript

Stars: ✭ 48 (+140%)

Mutual labels: hadoop, hdfs

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (+5%)

Mutual labels: hadoop, hdfs

kubernetes-cluster

Vagrant As Automation Script

Stars: ✭ 34 (+70%)

Mutual labels: vagrant, vms

usergrid-docker

Build and run Usergrid 2.1 using Docker

Stars: ✭ 41 (+105%)

Mutual labels: vagrant, cassandra

beanszoo

Distributed Java micro-services using ZooKeeper

Stars: ✭ 12 (-40%)

Mutual labels: yarn, hadoop

kubernetes-basico

Demonstration of the Kubernetes components

Stars: ✭ 26 (+30%)

Mutual labels: vagrant, cluster

flink-spark-submiter

Submit Flink/Spark jobs from a local IDEA to a Yarn/k8s cluster

Stars: ✭ 157 (+685%)

Mutual labels: yarn, flink

hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

Stars: ✭ 56 (+180%)

Mutual labels: hadoop, flink

datasqueeze

Hadoop utility to compact small files

Stars: ✭ 18 (-10%)

Mutual labels: hadoop, hdfs

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (+25%)

Mutual labels: spark, hadoop

cassandra.realtime

Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink

Stars: ✭ 25 (+25%)

Mutual labels: cassandra, flink

Java learning practice

The road to advanced Java: frequently asked interview algorithms, akka, multithreading, NIO, Netty, SpringBoot, Spark&&Flink, and more

Stars: ✭ 110 (+450%)

Mutual labels: spark, flink

kafka-connect-fs

Kafka Connect FileSystem Connector