trifacta / Floating Elephants

Licence: apache-2.0
Docker containers for Hadoop.

Floating Elephants

Docker containers for Hadoop.

[Logo: an elephant held by coloured balloons]

An easy way to reproduce a multi-node Hadoop cluster on a local machine.

Requirements

Getting Started

Pick one of the available Hadoop distributions. For example,

cd cloudera/cdh5

Build the images for the distribution:

docker-compose build

Create a Docker network:

docker network create -d bridge \
  --subnet=172.20.0.0/16 --gateway 172.20.0.1 --ip-range=172.20.0.0/16 \
  cdh5-lagoon

Start the containers:

docker-compose up -d --no-recreate

Networking

Hadoop services typically locate each other by DNS name, so the containers rely on Docker's built-in networking to resolve one another. Each distribution uses its own network. For example, to create the hdp2-lagoon network, run:

docker network create -d bridge \
  --subnet=172.21.0.0/16 --gateway 172.21.0.1 --ip-range=172.21.0.0/16 \
  hdp2-lagoon
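
The two network commands shown in this README differ only in the network name and the second octet of the subnet. As a rough sketch, a small helper can print the right command for either network. The function name lagoon_network_cmd is hypothetical and not part of the project, and the block only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: lagoon_network_cmd is a hypothetical helper, not part of the
# project. It prints the `docker network create` command for a lagoon network.
lagoon_network_cmd() {
  lagoon_name="$1"
  case "$lagoon_name" in
    cdh5-lagoon) lagoon_prefix="172.20" ;;  # subnet used in Getting Started
    hdp2-lagoon) lagoon_prefix="172.21" ;;  # subnet used in Networking
    *) echo "unknown network: $lagoon_name" >&2; return 1 ;;
  esac
  printf 'docker network create -d bridge --subnet=%s.0.0/16 --gateway %s.0.1 --ip-range=%s.0.0/16 %s\n' \
    "$lagoon_prefix" "$lagoon_prefix" "$lagoon_prefix" "$lagoon_name"
}

# Print the command for review; nothing is executed against Docker here.
lagoon_network_cmd hdp2-lagoon
```

Printing instead of executing keeps the helper safe to run anywhere. To actually create the network, run the printed command yourself.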

In the future, we could let docker-compose create the networks automatically. Currently, docker-compose generates domain names that contain an underscore. Underscores are not valid in hostnames, so the resulting URIs are invalid.
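
Hostname labels are limited to letters, digits, and hyphens (RFC 952 and RFC 1123), which is why the underscore matters. The following is a minimal sketch of that rule as a shell check. The function name is_valid_hostname is hypothetical, and cdh5_hdfsnamenode_1 is only an illustrative example of an underscore-separated name, not a name taken from the project.

```shell
#!/bin/sh
# Sketch only: is_valid_hostname is a hypothetical helper. It succeeds when
# every dot-separated label contains only letters, digits, and hyphens, and
# does not start or end with a hyphen (at most 63 characters per label).
is_valid_hostname() {
  printf '%s\n' "$1" | grep -Eq \
    '^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*$'
}

# Compare a hostname from this README with an illustrative underscore name.
for candidate in hdfsnamenode.cdh5-lagoon cdh5_hdfsnamenode_1; do
  if is_valid_hostname "$candidate"; then
    echo "$candidate: valid"
  else
    echo "$candidate: invalid"
  fi
done
```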

The hostnames are pre-configured in the Hadoop XML configuration files in conf.docker_cluster/*.xml and docker-compose.yml. All of these hostnames end with .cdh5-lagoon or .hdp2-lagoon.

A separate small container runs dnsmasq with port 53 forwarded, and acts as the DNS server for the host machine.

To connect to the containers from the host machine using these hostnames, you must add DNS and routing table entries to your host.

OS X

We use the resolver(5) mechanism built into OS X to resolve these hostnames. It reads per-domain configuration from the /etc/resolver directory, which you may need to create.

The following instructions assume that you are using the Cloudera distro. Replace cdh5-lagoon with hdp2-lagoon if you are using the Hortonworks distro.

First, set DOCKER_HOST_IP. If you're using docker-machine:

export DOCKER_HOST_IP=$(docker-machine ip "$DOCKER_MACHINE_NAME")

If you're using boot2docker:

export DOCKER_HOST_IP=$(boot2docker ip)

Then, in either case, create the resolver entry and add a route to the container subnet:

sudo mkdir -p /etc/resolver
echo "nameserver $DOCKER_HOST_IP" | sudo tee /etc/resolver/cdh5-lagoon
sudo route -n add -net 172.20.0.0 $DOCKER_HOST_IP

To remove these settings at a later point, run the following:

sudo rm /etc/resolver/cdh5-lagoon
sudo route -n delete 172.20.0.0

Verify your cluster is running

Visit the Web UIs for the services:

Service                    Web UI URL
HDFS Namenode              http://hdfsnamenode.cdh5-lagoon:50070/
YARN Resource Manager      http://yarnresourcemanager.cdh5-lagoon:8088/
MapReduce History Server   http://mapreducehistory.cdh5-lagoon:19888/
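
If you prefer the command line to a browser, the same check can be sketched with curl. The helper names web_ui_urls and check_web_uis are hypothetical. The hostnames and ports are the ones listed above for cdh5-lagoon; passing another domain such as hdp2-lagoon assumes that distribution uses the same hostnames and ports, which this README does not confirm.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not part of the project.

# Print the three Web UI URLs for a domain (default: cdh5-lagoon).
web_ui_urls() {
  ui_domain="${1:-cdh5-lagoon}"
  printf 'http://hdfsnamenode.%s:50070/\n' "$ui_domain"
  printf 'http://yarnresourcemanager.%s:8088/\n' "$ui_domain"
  printf 'http://mapreducehistory.%s:19888/\n' "$ui_domain"
}

# Request each URL and report whether it answered without an HTTP error.
check_web_uis() {
  web_ui_urls "$1" | while read -r ui_url; do
    if curl -fsS -o /dev/null --max-time 5 "$ui_url"; then
      echo "OK   $ui_url"
    else
      echo "FAIL $ui_url"
    fi
  done
}

# List the URLs; call check_web_uis once DNS and routing are set up.
web_ui_urls
```

Running check_web_uis only makes sense after the DNS and routing entries described under Networking are in place, because the hostnames will not resolve from the host otherwise.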

Multiple worker nodes

You can scale the number of "clusternodes", which are nodes that run an HDFS Datanode and a YARN Node Manager. For example, to run 5 clusternodes:

docker-compose scale clusternode=5

Supported Hadoop Distributions

Vendor        Distro   Directory
Cloudera      CDH 5    cloudera/cdh5
Hortonworks   HDP 2    hortonworks/hdp2

Roadmap

In no particular order:

  • Kerberos
  • High Availability
  • Hive + HCatalog
  • Spark

Contributing

We welcome pull requests! Borrowing from the docker project's guide:

Your signature certifies that you wrote the patch or otherwise have the right to pass it on as an open-source patch. If you can certify the below (from http://developercertificate.org):

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

Then you just add a line to every git commit message:

Signed-off-by: Joe Smith <[email protected]>

If you set your user.name and user.email git configs, you can sign your commit automatically with git commit -s.
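
To catch a forgotten sign-off before pushing, a commit message can be checked for the Signed-off-by line. This is only a sketch: has_signoff is a hypothetical helper, not something the project ships, and the pattern is a loose approximation of the "Name <email>" form.

```shell
#!/bin/sh
# Sketch only: has_signoff is a hypothetical helper. It reads a commit message
# on stdin and succeeds if some line has the form
# "Signed-off-by: Name <user@host>".
has_signoff() {
  grep -Eq '^Signed-off-by: .+ <[^<>@ ]+@[^<>@ ]+>$'
}

# Example with an illustrative message and a placeholder address.
if printf 'Fix typo in README\n\nSigned-off-by: Joe Smith <joe@example.com>\n' | has_signoff; then
  echo "sign-off present"
else
  echo "sign-off missing"
fi
```

One possible use is in a commit-msg hook, where git passes the path of the message file as the first argument, so the check becomes has_signoff < "$1".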

People

  • Seshadri Mahalingam, primary contributor
  • Jeremy Mailen, architectural & ideological support
  • Alexander Vaughn, the awesome project logo
  • Vihang Mehta, various contributions
  • Many more Trifactans who tried it out and contributed feedback & moral support

Developed by Trifacta
