
1052 Open source projects that are alternatives of or similar to Hadoop_cookbook

Hadoop Pot
A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-90.24%)
Mutual labels:  hadoop
Books Recommendation
Advanced books (and videos) for programmers, continuously updated (Programmer Books)
Stars: ✭ 558 (+580.49%)
Mutual labels:  zookeeper
shamash
Autoscaling for Google Cloud Dataproc
Stars: ✭ 31 (-62.2%)
Mutual labels:  spark
hiveql-parser
HiveQL Parser. Parses HiveQL code and prints the AST in JSON format on success; otherwise prints a well-formed syntax error message.
Stars: ✭ 25 (-69.51%)
Mutual labels:  hive
Wirbelsturm
Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+304.88%)
Mutual labels:  spark
Meme Generator
MemeGen is a web application where the user gives an image as input and our tool generates a meme at one click for the user.
Stars: ✭ 57 (-30.49%)
Mutual labels:  chef-cookbook
fluent-plugin-webhdfs
Hadoop WebHDFS output plugin for Fluentd
Stars: ✭ 57 (-30.49%)
Mutual labels:  hadoop
Gather Deployment
Gathers scalable tensorflow and infrastructure deployment
Stars: ✭ 326 (+297.56%)
Mutual labels:  hadoop
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-68.29%)
Mutual labels:  spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-21.95%)
Mutual labels:  spark
House
Proof of Concept and Research repository.
Stars: ✭ 37 (-54.88%)
Mutual labels:  chef
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+574.39%)
Mutual labels:  spark
Search Ads Web Service
Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]
Stars: ✭ 30 (-63.41%)
Mutual labels:  spark
chef-chrome
Chef cookbook to install Google Chrome browser
Stars: ✭ 16 (-80.49%)
Mutual labels:  chef
Kazoo
Kazoo is a high-level Python library that makes it easier to use Apache Zookeeper.
Stars: ✭ 1,161 (+1315.85%)
Mutual labels:  zookeeper
Dis Seckill
👊 A distributed, high-concurrency flash-sale (seckill) system built with SpringBoot + Zookeeper + Dubbo
Stars: ✭ 315 (+284.15%)
Mutual labels:  zookeeper
Cloud2020
SpringCloud
Stars: ✭ 550 (+570.73%)
Mutual labels:  zookeeper
np-flink
Detailed Flink learning and practice
Stars: ✭ 26 (-68.29%)
Mutual labels:  hbase
Spark Swagger
Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-69.51%)
Mutual labels:  spark
Sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+285.37%)
Mutual labels:  spark
big-data-exploration
[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (-47.56%)
Mutual labels:  hadoop
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-32.93%)
Mutual labels:  spark
Cook
Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark
Stars: ✭ 314 (+282.93%)
Mutual labels:  spark
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-52.44%)
Mutual labels:  spark
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-52.44%)
Mutual labels:  hadoop
Choregraphie
Choregraphie offers primitives to coordinate the convergence of Chef resources.
Stars: ✭ 24 (-70.73%)
Mutual labels:  chef
Mleap
MLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+1402.44%)
Mutual labels:  spark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1102.44%)
Mutual labels:  spark
Javafamily
[Java interview + Java study guide] Covers the core knowledge that most Java programmers need to master.
Stars: ✭ 28,668 (+34860.98%)
Mutual labels:  zookeeper
hadoop-data-ingestion-tool
OLAP and ETL of Big Data
Stars: ✭ 17 (-79.27%)
Mutual labels:  hadoop
Cp Helm Charts
The Confluent Platform Helm charts enable you to deploy Confluent Platform services on Kubernetes for development, test, and proof of concept environments.
Stars: ✭ 539 (+557.32%)
Mutual labels:  zookeeper
ansible-cloudera-hadoop
ansible playbook to deploy cloudera hadoop components to the cluster
Stars: ✭ 51 (-37.8%)
Mutual labels:  hbase
go-solr
solr go client from sendgrid, zookeeper aware, incorporates retries
Stars: ✭ 39 (-52.44%)
Mutual labels:  zookeeper
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-71.95%)
Mutual labels:  spark
openverse-catalog
Identifies and collects data on cc-licensed content across web crawl data and public apis.
Stars: ✭ 27 (-67.07%)
Mutual labels:  spark
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+4659.76%)
Mutual labels:  spark
last fm
A simple app to demonstrate a testable, maintainable, and scalable architecture for flutter. flutter_bloc, get_it, hive, and REST API are some of the tech stacks used in this project.
Stars: ✭ 134 (+63.41%)
Mutual labels:  hive
Express Microservice Starter
An express-based Node.js API bootstrapping module for building microservices.
Stars: ✭ 53 (-35.37%)
Mutual labels:  zookeeper
Spark Gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-1.22%)
Mutual labels:  spark
Easyrpc
EasyRpc is a simple, high-performance, easy-to-use RPC framework based on Netty, ZooKeeper and ProtoStuff.
Stars: ✭ 79 (-3.66%)
Mutual labels:  zookeeper
Lpa Detector
Optimize and improve the Label propagation algorithm
Stars: ✭ 75 (-8.54%)
Mutual labels:  spark
Pyspark Twitter Stream Mining
Real-time Machine Learning with Apache Spark on Twitter Public Stream
Stars: ✭ 64 (-21.95%)
Mutual labels:  spark
Real Time Stream Processing Engine
This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.
Stars: ✭ 37 (-54.88%)
Mutual labels:  spark
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+556.1%)
Mutual labels:  spark
aix
Resources for AIX hosts
Stars: ✭ 22 (-73.17%)
Mutual labels:  chef
codes-scratch-zookeeper-netty
A file synchronization service for cluster nodes, implemented with zk + netty
Stars: ✭ 29 (-64.63%)
Mutual labels:  zookeeper
Awesome Ada
A curated list of awesome resources related to the Ada and SPARK programming languages
Stars: ✭ 299 (+264.63%)
Mutual labels:  spark
hbase-prometheus-monitoring
No description or website provided.
Stars: ✭ 19 (-76.83%)
Mutual labels:  hbase
Usersessionbehaviorofflineanalysis
Offline analysis project for user session behavior data (Sichuan University, 拓思爱诺)
Stars: ✭ 69 (-15.85%)
Mutual labels:  spark
Activemq
Development repository for activemq Chef Cookbook
Stars: ✭ 19 (-76.83%)
Mutual labels:  chef
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-39.02%)
Mutual labels:  spark
Lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+546.34%)
Mutual labels:  spark
sitecore-packer
Packer templates for Sitecore development with IIS, SOLR and SQL Server on Windows
Stars: ✭ 19 (-76.83%)
Mutual labels:  chef
simple-rpc-plus
A remote procedure call framework implemented with netty and zookeeper
Stars: ✭ 16 (-80.49%)
Mutual labels:  zookeeper
docker-repo
A repository that stores Dockerfiles and docker-compose files for quickly starting a service or a service cluster.
Stars: ✭ 26 (-68.29%)
Mutual labels:  zookeeper
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (+15.85%)
Mutual labels:  spark
Chef Vault
chef-vault cookbook
Stars: ✭ 63 (-23.17%)
Mutual labels:  chef
kzmonitor
kafka zookeeper monitor
Stars: ✭ 34 (-58.54%)
Mutual labels:  zookeeper
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+525.61%)
Mutual labels:  spark
spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit.ly/2oBJSpP) an Integrated BI platform on Apache Spark.
Stars: ✭ 286 (+248.78%)
Mutual labels:  spark
601-660 of 1052 similar projects