
teamclairvoyant / hadoop-deployment-bash

License: Apache-2.0
Code for the deployment of Hadoop clusters, written in Bourne or Bourne Again shell.

hadoop-deployment-bash

Cloudera

These are shell scripts to deploy Cloudera Manager and related Cloudera encryption products to a cluster. The goal of these scripts is to be idempotent and to serve as a template for translation into other Configuration Management frameworks/languages.

  • Works with RHEL/CentOS 6 or 7 x86_64.
  • Works with Ubuntu Trusty 14.04 x86_64.
  • Allows for installation of Oracle JDK 7 from Cloudera, Oracle JDK 8 from Oracle, or OpenJDK 7 or 8.

The walkthrough below demonstrates only some of the functionality; not everything is documented. Some scripts accept arguments that change their internal operation. Read the source to learn more.
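To illustrate what "idempotent" means for scripts like these, here is a minimal sketch of one common pattern: change a file only when the desired line is missing, so that a second run leaves the system untouched. The helper name ensure_line and the sample setting are hypothetical and are not taken from this repository.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE only if it is not already there.
# Hypothetical helper for illustration; not part of this repository.
ensure_line() {
  touch "$1"
  # -q quiet, -x match the whole line, -F treat LINE as a fixed string
  if ! grep -qxF -- "$2" "$1"; then
    printf '%s\n' "$2" >>"$1"
  fi
}

conf=$(mktemp)
ensure_line "$conf" "example.key = value"
ensure_line "$conf" "example.key = value"   # second run changes nothing
cat "$conf"                                 # the line appears once
rm -f "$conf"
```

Running the same call twice produces the same file as running it once, which is the property that makes a script safe to re-run on a partially configured node.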

Prep

This is needed for both the Evaluation and Example sections below.

Set the GITREPO variable to the local directory where you have cloned this repository, and create a file named HOSTLIST that contains one SSH login and hostname (login@hostname) per line.

GITREPO=~/git/teamclairvoyant/bash

cat <<EOF >HOSTLIST
<login>@<hostname-1>
<login>@<hostname-2>
<login>@<hostname-3>
EOF
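Because every later loop feeds HOSTLIST entries straight to scp and ssh, it may be worth checking the file first. The following is a minimal sketch, assuming each entry should have the form login@hostname with no embedded spaces; the function name check_hostlist is hypothetical.

```shell
#!/bin/sh
# check_hostlist FILE: report entries that are not of the form login@hostname.
# Returns 0 when every non-blank line is well formed, 1 otherwise.
# Hypothetical helper for illustration; not part of this repository.
check_hostlist() {
  bad=0
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      '')     continue ;;   # blank lines are harmless
      *' '*)  ;;            # a space inside an entry: malformed
      ?*@?*)  continue ;;   # login@hostname: accepted
    esac
    echo "malformed entry: $line" >&2
    bad=1
  done <"$1"
  return "$bad"
}
```

A typical use would be to run check_hostlist HOSTLIST before the first loop and stop if it returns a non-zero status.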

Evaluation

Run the evaluation script to gather the configuration of all the nodes of the cluster. Save the output in the directory "evaluate-pre".

mkdir evaluate-pre
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p ${GITREPO}/evaluate.sh ${HOST}:
  ssh -qt $HOST './evaluate.sh' >evaluate-pre/${HOST}.out 2>evaluate-pre/${HOST}.err
done
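The loop above writes one .out and one .err file per host. A quick way to see whether anything went wrong is to list the .err files that are not empty. This is a minimal sketch; the function name list_eval_errors is hypothetical.

```shell
#!/bin/sh
# list_eval_errors DIR: print the name of each non-empty *.err file in DIR.
# Hypothetical helper for illustration; not part of this repository.
list_eval_errors() {
  for f in "$1"/*.err; do
    # -s is true only when the file exists and is larger than zero bytes
    if [ -s "$f" ]; then
      echo "$f"
    fi
  done
  return 0
}
```

For example, list_eval_errors evaluate-pre prints nothing when every host ran cleanly.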

Example

Copy several of the scripts to the nodes.

for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p \
  ${GITREPO}/install_tools.sh \
  ${GITREPO}/change_swappiness.sh \
  ${GITREPO}/disable_iptables.sh \
  ${GITREPO}/disable_ipv6.sh \
  ${GITREPO}/disable_selinux.sh \
  ${GITREPO}/disable_thp.sh \
  ${GITREPO}/install_chrony.sh \
  ${GITREPO}/install_nscd.sh \
  ${GITREPO}/install_jdk.sh \
  ${GITREPO}/configure_javahome.sh \
  ${GITREPO}/install_jce.sh \
  ${GITREPO}/configure_jdk_krbref.sh \
  ${GITREPO}/install_krb5.sh \
  ${GITREPO}/configure_tuned.sh \
  ${GITREPO}/install_entropy.sh \
  ${GITREPO}/install_jdbc.sh \
  ${GITREPO}/install_jdbc_sqoop.sh \
  ${GITREPO}/install_clouderamanageragent.sh \
  $HOST:
done

Run the scripts to prep the system for Cloudera Manager installation. Pin the version of Cloudera Manager to the value in $CMVER. Also deploy OpenJDK 8.

#BOPT="-x"    # Turn on bash debugging.
CMVER=6.3.2   # Set specific Cloudera Manager version.
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  ssh -t $HOST " \
  sudo bash $BOPT ./install_tools.sh; \
  sudo bash $BOPT ./change_swappiness.sh; \
  sudo bash $BOPT ./disable_iptables.sh; \
  sudo bash $BOPT ./disable_ipv6.sh; \
  sudo bash $BOPT ./disable_selinux.sh; \
  sudo bash $BOPT ./disable_thp.sh; \
  sudo bash $BOPT ./install_chrony.sh; \
  sudo bash $BOPT ./install_nscd.sh; \
  sudo bash $BOPT ./install_jdk.sh --jdktype openjdk --jdkversion 8; \
  sudo bash $BOPT ./configure_javahome.sh; \
  sudo bash $BOPT ./install_jce.sh; \
  sudo bash $BOPT ./configure_jdk_krbref.sh; \
  sudo bash $BOPT ./install_krb5.sh; \
  sudo bash $BOPT ./configure_tuned.sh; \
  sudo bash $BOPT ./install_entropy.sh"
done
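The loop above joins the remote commands with ";", so each script runs even if an earlier one failed. If you would rather stop at the first failure, one option is to join the commands with "&&" instead. The following is a minimal sketch of that alternative; the function name build_chain is hypothetical, and the original ";" behavior may be intentional.

```shell
#!/bin/sh
# build_chain BOPT SCRIPT...: print a single remote command line that runs
# each script under "sudo bash" and stops at the first failure (joined by &&).
# A SCRIPT may carry its own arguments, for example
# "install_jdk.sh --jdktype openjdk --jdkversion 8".
# Hypothetical helper for illustration; not part of this repository.
build_chain() {
  bopt="$1"
  shift
  cmd=""
  for s in "$@"; do
    cmd="${cmd:+$cmd && }sudo bash ${bopt:+$bopt }./$s"
  done
  printf '%s\n' "$cmd"
}

# Possible use inside the host loop (not executed here):
#   ssh -t "$HOST" "$(build_chain "$BOPT" install_tools.sh change_swappiness.sh)"
```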

Install the Cloudera Manager agent.

CMSERVER=ip-10-2-5-22.ec2.internal
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  ssh -t $HOST "sudo bash $BOPT ./install_clouderamanageragent.sh -H $CMSERVER -V $CMVER"
done

Install the Cloudera Manager server with the embedded PostgreSQL database.

scp -p ${GITREPO}/install_clouderamanagerserver.sh ${CMSERVER}:
ssh -t ${CMSERVER} "sudo bash $BOPT ./install_clouderamanagerserver.sh -d embedded -V $CMVER"

The -d argument accepts embedded, postgresql, mysql, or oracle.
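If the database type comes from a variable, a local check can catch a typo before anything is sent to the server. This is a minimal sketch; the function name valid_db_type and the variable DBTYPE are hypothetical, and the accepted values are only the four listed above.

```shell
#!/bin/sh
# valid_db_type TYPE: succeed only for the four database types named above.
# Hypothetical helper for illustration; not part of this repository.
valid_db_type() {
  case "$1" in
    embedded|postgresql|mysql|oracle) return 0 ;;
    *) return 1 ;;
  esac
}

DBTYPE=embedded   # hypothetical variable holding the chosen database type
if valid_db_type "$DBTYPE"; then
  echo "database type OK: $DBTYPE"
else
  echo "unsupported database type: $DBTYPE" >&2
fi
```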

Post Evaluation

Run the evaluation script again to gather the new configuration of all the nodes of the cluster. Save the output in the directory "evaluate-post".

mkdir evaluate-post
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p ${GITREPO}/evaluate.sh ${HOST}:
  ssh -qt $HOST './evaluate.sh' >evaluate-post/${HOST}.out 2>evaluate-post/${HOST}.err
done
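With both evaluation runs saved, the per-host output files can be compared to see what the deployment changed. The following is a minimal sketch using diff; the function name compare_evaluations is hypothetical.

```shell
#!/bin/sh
# compare_evaluations PRE POST: for each *.out file in PRE, report whether the
# file of the same name in POST is missing, unchanged, or changed, and show a
# unified diff for changed files.
# Hypothetical helper for illustration; not part of this repository.
compare_evaluations() {
  for f in "$1"/*.out; do
    [ -e "$f" ] || continue          # no *.out files at all
    name=${f##*/}                    # strip the directory part
    if [ ! -e "$2/$name" ]; then
      echo "*** $name: missing from $2"
    elif diff -u "$f" "$2/$name" >/dev/null; then
      echo "*** $name: unchanged"
    else
      echo "*** $name: changed"
      diff -u "$f" "$2/$name" || true   # diff exits 1 when files differ
    fi
  done
  return 0
}
```

For example, compare_evaluations evaluate-pre evaluate-post prints one status line per host followed by the differences, if any.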

Hortonworks

These are shell scripts to deploy Hortonworks Ambari to a cluster. The goal of these scripts is to be idempotent and to serve as a template for translation into other Configuration Management frameworks/languages.

  • Works with RHEL/CentOS 6 or 7 x86_64.
  • Works with Ubuntu Trusty 14.04 x86_64.
  • Allows for installation of Oracle JDK 8 from Oracle or OpenJDK 7 or 8.

The walkthrough below demonstrates only some of the functionality; not everything is documented. Some scripts accept arguments that change their internal operation. Read the source to learn more.

Prep

This is needed for both the Evaluation and Example sections below.

Set the GITREPO variable to the local directory where you have cloned this repository, and create a file named HOSTLIST that contains one SSH login and hostname (login@hostname) per line.

GITREPO=~/git/teamclairvoyant/bash

cat <<EOF >HOSTLIST
<login>@<hostname-1>
<login>@<hostname-2>
<login>@<hostname-3>
EOF

Evaluation

Run the evaluation script to gather the configuration of all the nodes of the cluster. Save the output in the directory "evaluate-pre".

mkdir evaluate-pre
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p ${GITREPO}/evaluate.sh ${HOST}:
  ssh -qt $HOST './evaluate.sh' >evaluate-pre/${HOST}.out 2>evaluate-pre/${HOST}.err
done

Example

Copy several of the scripts to the nodes.

for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p \
  ${GITREPO}/install_tools.sh \
  ${GITREPO}/change_swappiness.sh \
  ${GITREPO}/disable_iptables.sh \
  ${GITREPO}/disable_ipv6.sh \
  ${GITREPO}/disable_selinux.sh \
  ${GITREPO}/disable_thp.sh \
  ${GITREPO}/install_chrony.sh \
  ${GITREPO}/install_nscd.sh \
  ${GITREPO}/install_jdk.sh \
  ${GITREPO}/configure_javahome.sh \
  ${GITREPO}/install_jce.sh \
  ${GITREPO}/configure_jdk_krbref.sh \
  ${GITREPO}/install_krb5.sh \
  ${GITREPO}/configure_tuned.sh \
  ${GITREPO}/install_entropy.sh \
  ${GITREPO}/install_jdbc.sh \
  ${GITREPO}/install_jdbc_sqoop.sh \
  ${GITREPO}/install_hortonworksambariagent.sh \
  $HOST:
done

Run the scripts to prep the system for Hortonworks Ambari installation. Pin the version of Hortonworks Ambari to the value in $HAVER. Also deploy OpenJDK 8.

#BOPT="-x"    # Turn on bash debugging.
HAVER=2.5.2.0 # Set specific Hortonworks Ambari version
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  ssh -t $HOST " \
  sudo bash $BOPT ./install_tools.sh; \
  sudo bash $BOPT ./change_swappiness.sh; \
  sudo bash $BOPT ./disable_iptables.sh; \
  sudo bash $BOPT ./disable_ipv6.sh; \
  sudo bash $BOPT ./disable_selinux.sh; \
  sudo bash $BOPT ./disable_thp.sh; \
  sudo bash $BOPT ./install_chrony.sh; \
  sudo bash $BOPT ./install_nscd.sh; \
  sudo bash $BOPT ./install_jdk.sh --jdktype openjdk --jdkversion 8; \
  sudo bash $BOPT ./configure_javahome.sh; \
  sudo bash $BOPT ./install_jce.sh; \
  sudo bash $BOPT ./configure_jdk_krbref.sh; \
  sudo bash $BOPT ./install_krb5.sh; \
  sudo bash $BOPT ./configure_tuned.sh; \
  sudo bash $BOPT ./install_entropy.sh"
done

Install the Hortonworks Ambari agent.

HASERVER=ip-10-2-5-22.ec2.internal
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  ssh -t $HOST "sudo bash $BOPT ./install_hortonworksambariagent.sh $HASERVER $HAVER"
done
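The per-host loops in this walkthrough print a banner for each host but do not summarize which hosts failed. A small wrapper can collect that information. This is a minimal sketch; the function names for_each_host and install_agent are hypothetical, and the wrapper simply passes each HOSTLIST entry as the last argument to whatever command it is given.

```shell
#!/bin/sh
# for_each_host FILE CMD...: run CMD once per entry in FILE, passing the entry
# as the last argument, then report the hosts for which CMD failed.
# Hypothetical helper for illustration; not part of this repository.
for_each_host() {
  file="$1"
  shift
  failed=""
  for host in $(cat "$file"); do
    echo "*** $host"
    "$@" "$host" || failed="$failed $host"
  done
  if [ -n "$failed" ]; then
    echo "FAILED:$failed" >&2
    return 1
  fi
  return 0
}

# Possible use for the agent installation step (not executed here):
#   install_agent() {
#     ssh -t "$1" "sudo bash $BOPT ./install_hortonworksambariagent.sh $HASERVER $HAVER"
#   }
#   for_each_host HOSTLIST install_agent
```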

Install the Hortonworks Ambari server with the embedded PostgreSQL database.

scp -p ${GITREPO}/install_hortonworksambariserver.sh ${HASERVER}:
ssh -t ${HASERVER} "sudo bash $BOPT ./install_hortonworksambariserver.sh embedded $HAVER"

The first argument can be embedded, postgresql, mysql, or oracle.

Post Evaluation

Run the evaluation script again to gather the new configuration of all the nodes of the cluster. Save the output in the directory "evaluate-post".

mkdir evaluate-post
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p ${GITREPO}/evaluate.sh ${HOST}:
  ssh -qt $HOST './evaluate.sh' >evaluate-post/${HOST}.out 2>evaluate-post/${HOST}.err
done

Contributing to this project

Everyone is welcome to contribute. Please take a moment to review the guidelines for contributing.

License

Copyright (C) 2015 Clairvoyant, LLC.

Licensed under the Apache License, Version 2.0.
