bbonnin / docker-hadoop-3

Licence: MIT license
Docker file for Hadoop 3

Docker file for Hadoop 3

Most of the work comes from: http://bigdatums.net/2017/11/04/creating-hadoop-docker-image/

This repository only adds a few adaptations for Hadoop 3.

For some details about Hadoop 3 (such as new ports), see: https://fr.slideshare.net/HadoopSummit/hadoop-3-in-a-nutshell

Please read the Dockerfile before building, because you may have to update it. See its comments about the Hadoop 3 tgz archive.
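To find the lines to review quickly, one option is to search the Dockerfile for archive references. This is only a sketch: `tgz_refs` is a hypothetical helper name, and it assumes the relevant lines mention `tgz` or `tar.gz`.

```shell
# List every line of a Dockerfile that mentions a tgz/tar.gz archive,
# prefixed with its line number.
# Usage: tgz_refs [path-to-Dockerfile]   (defaults to ./Dockerfile)
# Assumption: the lines of interest contain "tgz" or "tar.gz".
tgz_refs() {
  grep -n -i -E 'tgz|tar\.gz' "${1:-Dockerfile}"
}
```

Running `tgz_refs` from the repository root prints the matching lines; no output means the pattern did not match and the file should be read by hand.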

After starting the container, you can access the web UIs on the ports published by the run command in the How-to section.

Warning: Hue is not fully functional. Its integration is a work in progress (file browsing works).
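As a rough sketch, the web UIs can be listed and probed from the host with the helpers below. The function names `ui_urls` and `check_uis` are hypothetical. The port-to-service mapping assumes the Hadoop 3 and Hue default ports and that the container was started with the port mappings from the How-to section; adjust it if the image is configured differently.

```shell
# Print one "name url" pair per web UI for a given host (default: localhost).
# Assumption: default ports for Hadoop 3 (8088, 9870, 9864, 19888, 8042)
# and Hue (8888), all published on the Docker host.
ui_urls() {
  for entry in ResourceManager:8088 NameNode:9870 DataNode:9864 \
               JobHistory:19888 NodeManager:8042 Hue:8888; do
    echo "${entry%%:*} http://${1:-localhost}:${entry##*:}/"
  done
}

# Probe each UI with curl and report whether it answered.
# -f makes curl fail on HTTP errors (status 400 and above).
check_uis() {
  ui_urls "$1" | while read -r name url; do
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      echo "OK   $name $url"
    else
      echo "DOWN $name $url"
    fi
  done
}
```

Calling `check_uis` with no argument probes `localhost`; pass another host name if Docker runs on a remote machine.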

How-to

  • Build the image
    sudo docker build -t hadoop3 .
  • Run the container
    sudo docker run --hostname=hadoop3 -p 8088:8088 -p 9870:9870 -p 9864:9864 -p 19888:19888 \
      -p 8042:8042 -p 8888:8888 --name hadoop3 -d hadoop3
  • Access the container
    sudo docker exec -it hadoop3 bash
  • Test a job
    yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar pi 10 100
  • Clean
    sudo docker stop hadoop3
    sudo docker rm hadoop3
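Because the container is started in detached mode, the Hadoop daemons may need some time before they accept jobs. A minimal sketch of a readiness check, run on the host between the "Run" and "Test a job" steps, could look like this. `wait_for_url` is a hypothetical helper, and using the NameNode UI on port 9870 as the readiness signal is an assumption, not something the image guarantees.

```shell
# Poll a URL until it answers or the retry budget runs out.
# Usage: wait_for_url URL [RETRIES] [DELAY_SECONDS]
# Returns 0 as soon as curl succeeds, 1 if every attempt failed.
wait_for_url() {
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    curl -fs -o /dev/null --max-time 5 "$1" && return 0
    tries=$((tries - 1))
    sleep "${3:-2}"
  done
  return 1
}
```

For example, `wait_for_url http://localhost:9870/ 30 2` waits up to roughly a minute for the NameNode UI before the test job is submitted. If the Hadoop version in the Dockerfile was changed, the examples jar name changes with it; inside the container, the glob `hadoop-mapreduce-examples-*.jar` can replace the versioned file name.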

Next steps

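| Product/Framework/Env. | Version | (R)equired/(O)ptional |
| --- | --- | --- |
| Hue | 4.1 | R |
| Hive | 2.3.2 | R |
| Minifi | ? | O |
| Druid | ? | O |
| Kafka | ? | O |
| Storm | ? | O |
| Spark | 2.2.0 | O |
| Ambari | 2.6.1 | O |
| Ambari-metrics | 2.6.1 | O |
| HBase | ? | O |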
