231 open source projects that are alternatives to or similar to Docker Hadoop

Bigdata

💎🔥 Big data study notes

Stars: ✭ 488 (-58.99%)

Mutual labels: hadoop

Kafka Connect Hdfs

Kafka Connect HDFS connector

Stars: ✭ 400 (-66.39%)

Mutual labels: hadoop

Dockerfiles

50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (-28.82%)

Mutual labels: hadoop

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+375.29%)

Mutual labels: hadoop

Ytk Learn

Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-71.68%)

Mutual labels: hadoop

Storm Camel Example

Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.

Stars: ✭ 28 (-97.65%)

Mutual labels: hadoop

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+1752.77%)

Mutual labels: hadoop

Base

https://www.researchgate.net/profile/Rajah_Iyer

Stars: ✭ 48 (-95.97%)

Mutual labels: hadoop

Ignite

Apache Ignite

Stars: ✭ 4,027 (+238.4%)

Mutual labels: hadoop

Floating Elephants

Docker containers for Hadoop.

Stars: ✭ 19 (-98.4%)

Mutual labels: hadoop

Tony

TonY is a framework to natively run deep learning frameworks on Apache Hadoop.

Stars: ✭ 626 (-47.39%)

Mutual labels: hadoop

Tez

Apache Tez

Stars: ✭ 313 (-73.7%)

Mutual labels: hadoop

Akkeeper

An easy way to deploy your Akka services to a distributed environment.

Stars: ✭ 30 (-97.48%)

Mutual labels: hadoop

Alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Stars: ✭ 5,379 (+352.02%)

Mutual labels: hadoop

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (-95.46%)

Mutual labels: hadoop

School Of Sre

At LinkedIn, we use this curriculum to onboard our entry-level talent into the SRE role.

Stars: ✭ 5,141 (+332.02%)

Mutual labels: hadoop

Cdc Kafka Hadoop

MySQL to NoSQL real time dataflow

Stars: ✭ 13 (-98.91%)

Mutual labels: hadoop

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (-65.21%)

Mutual labels: hadoop

Jumbune

Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,

Stars: ✭ 64 (-94.62%)

Mutual labels: hadoop

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (-66.97%)

Mutual labels: hadoop

Stormtweetssentimentd3viz

Computes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.

Stars: ✭ 25 (-97.9%)

Mutual labels: hadoop

Hive

Apache Hive

Stars: ✭ 4,031 (+238.74%)

Mutual labels: hadoop

Nagios Plugins

450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...

Stars: ✭ 1,000 (-15.97%)

Mutual labels: hadoop

Gather Deployment

Gathers scalable TensorFlow and infrastructure deployments

Stars: ✭ 326 (-72.61%)

Mutual labels: hadoop

Hadoop For Geoevent

ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.

Stars: ✭ 5 (-99.58%)

Mutual labels: hadoop

Useractionanalyzeplatform

Big data platform for e-commerce user behavior analysis

Stars: ✭ 645 (-45.8%)

Mutual labels: hadoop

Hadoop Book

Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

Stars: ✭ 3,317 (+178.74%)

Mutual labels: hadoop

Jsr203 Hadoop

A Java NIO file system provider for HDFS

Stars: ✭ 35 (-97.06%)

Mutual labels: hadoop

Javapdf

🍣 100 Java e-books and technical books in PDF (take pride in downloading and reading; take shame in merely liking and bookmarking)

Stars: ✭ 609 (-48.82%)

Mutual labels: hadoop

Docker Spark Cluster

A Spark cluster setup running on Docker containers

Stars: ✭ 57 (-95.21%)

Mutual labels: hadoop

Dist Keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

Stars: ✭ 613 (-48.49%)

Mutual labels: hadoop

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (-20.25%)

Mutual labels: hadoop

Hadoop study

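Regularly updated documentation for commonly used big data components in the Hadoop ecosystem, with the focus, in order, on: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos (the project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code; continuously updated!!!)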

Stars: ✭ 567 (-52.35%)

Mutual labels: hadoop

Src

A light-weight distributed stream computing framework for Golang

Stars: ✭ 67 (-94.37%)

Mutual labels: hadoop

Gis Tools For Hadoop

The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.

Stars: ✭ 485 (-59.24%)

Mutual labels: hadoop

Interview Questions Collection

Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more

Stars: ✭ 21 (-98.24%)

Mutual labels: hadoop

Pdf

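Programming e-books, e-books, and programming books, including C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithm series, computer science, design patterns, software testing, refactoring and optimization, and more categories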

Stars: ✭ 12,009 (+909.16%)

Mutual labels: hadoop

Hadoop Solr

Code to index HDFS to Solr using MapReduce

Stars: ✭ 51 (-95.71%)

Mutual labels: hadoop

God Of Bigdata

Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+404.87%)

Mutual labels: hadoop

Bigdata Interview

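🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.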

Stars: ✭ 857 (-27.98%)

Mutual labels: hadoop

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (-65.88%)

Mutual labels: hadoop

Hive Funnel Udf

Hive UDFs for funnel analysis

Stars: ✭ 72 (-93.95%)

Mutual labels: hadoop

Big data architect skills

Skills a big data architect should master

Stars: ✭ 400 (-66.39%)

Mutual labels: hadoop

Hadoop Pot

A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.

Stars: ✭ 8 (-99.33%)

Mutual labels: hadoop

Orc

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

Stars: ✭ 389 (-67.31%)

Mutual labels: hadoop

Moosefs

MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)

Stars: ✭ 1,025 (-13.87%)

Mutual labels: hadoop

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+220.42%)

Mutual labels: hadoop

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (-23.03%)

Mutual labels: hadoop

Wedatasphere

WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (-68.74%)

Mutual labels: hadoop

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-94.96%)

Mutual labels: hadoop

Ozone

Scalable, redundant, and distributed object store for Apache Hadoop

Stars: ✭ 330 (-72.27%)

Mutual labels: hadoop

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (-30.59%)

Mutual labels: hadoop

Cascading

Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster. See https://github.com/Cascading/cascading for the release repository.

Stars: ✭ 318 (-73.28%)

Mutual labels: hadoop

Weblogsanalysissystem

A big data platform for analyzing web access logs

Stars: ✭ 37 (-96.89%)

Mutual labels: hadoop

Bigdataguide

Big data learning: learn big data from scratch, including study videos for each stage of learning and interview materials

Stars: ✭ 817 (-31.34%)

Mutual labels: hadoop

Apache Spark Hands On

Educational notes and hands-on problems with solutions for the Hadoop ecosystem