flokkr / docker-hadoop

Licence: Apache-2.0 license

Docker image for main Apache Hadoop components (Yarn/Hdfs)

Programming Languages

shell

Dockerfile

Projects that are alternatives of or similar to docker-hadoop

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-66.1%)

Mutual labels: yarn, hadoop, hdfs

Bigdata Interview

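🎯 🌟[Big data interview questions] A collection of big data interview questions gathered from the web, together with the author's own answer summaries. Currently covers the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.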

Stars: ✭ 857 (+1352.54%)

Mutual labels: yarn, hadoop, hdfs

wasp

WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amount of heterogeneous data and analyze them through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-67.8%)

Mutual labels: yarn, hadoop, hdfs

Bigdata Notes

Getting started guide to big data ⭐

Stars: ✭ 10,991 (+18528.81%)

Mutual labels: yarn, hadoop, hdfs

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (+172.88%)

Mutual labels: hadoop, hdfs

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (+154.24%)

Mutual labels: hadoop, hdfs

beanszoo

Distributed Java micro-services using ZooKeeper

Stars: ✭ 12 (-79.66%)

Mutual labels: yarn, hadoop

Repository

Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (+55.93%)

Mutual labels: hadoop, hdfs

yarn-prometheus-exporter

Export Hadoop YARN (resource-manager) metrics in prometheus format

Stars: ✭ 44 (-25.42%)

Mutual labels: yarn, hadoop

knit

Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead

Stars: ✭ 53 (-10.17%)

Mutual labels: yarn, hadoop

Tf Yarn

Train TensorFlow models on YARN in just a few lines of code!

Stars: ✭ 76 (+28.81%)

Mutual labels: yarn, hadoop

Dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.

Stars: ✭ 122 (+106.78%)

Mutual labels: hadoop, hdfs

Hdfs Shell

HDFS Shell is a HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (+98.31%)

Mutual labels: hadoop, hdfs

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (+2662.71%)

Mutual labels: hadoop, hdfs

Jumbune

Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,

Stars: ✭ 64 (+8.47%)

Mutual labels: yarn, hadoop

Hadoop Yarn Api Python Client

Python client for Hadoop® YARN API

Stars: ✭ 91 (+54.24%)

Mutual labels: yarn, hadoop

Camus

Mirror of Linkedin's Camus

Stars: ✭ 81 (+37.29%)

Mutual labels: hadoop, hdfs

Wifi

Big data query and analysis system based on information captured over wifi

Stars: ✭ 93 (+57.63%)

Mutual labels: hadoop, hdfs

Akkeeper

An easy way to deploy your Akka services to a distributed environment.

Stars: ✭ 30 (-49.15%)

Mutual labels: yarn, hadoop

Tensorflowonyarn

Support TensorFlow on YARN

Stars: ✭ 114 (+93.22%)

Mutual labels: yarn, hadoop

Apache Hadoop docker images

These images are part of the Bigdata docker image series. All of the images share the same base docker image, which contains plugin scripts to launch different projects in containerized environments.

For more detailed instructions about the available environment variables, see the README in the flokkr/docker-baseimage repository.

The Docker images are tested with Kubernetes.

Getting started with Kubernetes

The easiest way to start is to run kubectl apply -f . from one of the ./examples directories (note that these use ephemeral storage!)

For more specific use cases, flekszible is recommended. The resource definitions can be found in this repository (./hadoop, ./hdfs, ./yarn, ...)

Getting started with Flekszible

Install Flekszible (download the binary and put it on your PATH)

Create a working dir

cd /tmp
mkdir cluster
cd cluster

Add this repository as a source

flekszible source add github.com/flokkr/docker-hadoop

Choose and add required services:

flekszible app add hdfs

Generate Kubernetes resource files

flekszible generate

Launch the rockets:

kubectl apply -f .
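
The steps above can be collected into one shell function. This is only a sketch: `require_cmd`, `run`, `deploy_hdfs`, and the `DRY_RUN` variable are illustrative names and not part of flekszible or this repository. The commands themselves are the ones listed above. With `DRY_RUN=1` the function only prints what it would run.

```shell
# Fail early if a required binary is not on the PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing required command: $1" >&2
    return 1
  }
}

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Quick-start steps from this README, stopping at the first failure.
deploy_hdfs() {
  local workdir="${1:-/tmp/cluster}"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    require_cmd flekszible && require_cmd kubectl || return 1
  fi
  run mkdir -p "$workdir" &&
  run cd "$workdir" &&
  run flekszible source add github.com/flokkr/docker-hadoop &&
  run flekszible app add hdfs &&
  run flekszible generate &&
  run kubectl apply -f .
}
```

For example, `DRY_RUN=1 deploy_hdfs /tmp/cluster` prints the six commands without touching the cluster, and `deploy_hdfs /tmp/cluster` executes them.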

Additional Flekszible options

You can list available apps (after source import):

flekszible app search
+---------+-------------------------------+
| path    | description                   |
+---------+-------------------------------+
| hdfs    | Apache Hadoop HDFS base setup |
| hdfs-ha | Apache Hadoop HDFS, HA setup  |
...

The base setup can be modified with additional transformations:

flekszible definitions search | grep hdfs
...
| hdfs/persistence    | Add real PVC based persistence                                                             |
| hdfs/onenode        | remove scheduling rules to make it possible to run multiple datanode on the same k8s node. |
...

You can apply transformations by modifying the Flekszible descriptor file:

Original version:

source:
- url: github.com/flokkr/docker-hadoop
import:
- path: hdfs

Modified:

source:
- url: github.com/flokkr/docker-hadoop
import:
- path: hdfs
  transformations:
  - type: hdfs/onenode
  - type: image
    image: flokkr/hadoop:3.2.0
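
As a sketch, the modified descriptor can also be written from a script. `write_descriptor` is an illustrative helper, not a flekszible command. The descriptor's file name is not stated above, so the target path is a required argument rather than a guessed default.

```shell
# Write the modified descriptor shown above to the given path.
# The path is supplied by the caller; no file name is assumed here.
write_descriptor() {
  local target="$1"
  cat > "$target" <<'EOF'
source:
- url: github.com/flokkr/docker-hadoop
import:
- path: hdfs
  transformations:
  - type: hdfs/onenode
  - type: image
    image: flokkr/hadoop:3.2.0
EOF
}
```

After writing the file into the working directory, regenerate the resources with `flekszible generate` and apply them again with `kubectl apply -f .`.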
