oracle-quickstart / oci-cloudera

Licence: Apache-2.0 license
Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)


Cloudera on Oracle Cloud Infrastructure


cloudera-stack

This is a Terraform module that deploys Cloudera Data Platform (CDP) Data Center on Oracle Cloud Infrastructure (OCI). It is developed jointly by Oracle and Cloudera.

Deployment Information

The following table shows Recommended and Minimum supported OCI shapes for each cluster role:

              Worker Nodes      Bastion Instance   Utility and Master Instances
Recommended   BM.DenseIO2.52    VM.Standard2.4     VM.Standard2.16
Minimum       VM.Standard2.8    VM.Standard2.1     VM.Standard2.8

Resource Manager Deployment

This Quick Start uses OCI Resource Manager (ORM) to deploy the stack.

Click the button below to deploy to OCI.

Deploy to Oracle Cloud

This template uses Terraform v0.12 and can target an existing VCN and subnets for cluster deployment. To use this option, select an existing VCN in the Schema menu, then select the appropriate subnet for each cluster host type.

If you deploy Cloudera Manager to a private subnet, you will require a VPN or SSH Tunnel through an edge node to access cluster management.
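
As a rough illustration of the SSH tunnel option, the sketch below builds an `ssh` local port-forward command that maps local port 7180 to Cloudera Manager through the edge (bastion) node. The helper name `build_tunnel_command`, the `opc` login user, and the example IP addresses are assumptions for illustration and are not part of this module; adjust them to your deployment.

```python
import shlex


def build_tunnel_command(bastion_ip, cm_private_ip, user="opc",
                         local_port=7180, cm_port=7180):
    """Return the argv list for an SSH local port forward via the edge node.

    All arguments are placeholders supplied by the caller; "opc" is assumed
    to be the login user on the edge node.
    """
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:{cm_private_ip}:{cm_port}",
        f"{user}@{bastion_ip}",
    ]


if __name__ == "__main__":
    # Example addresses are made up (documentation and private ranges).
    command = build_tunnel_command("203.0.113.10", "10.0.1.5")
    print(shlex.join(command))
```

With the tunnel running, Cloudera Manager would be reachable at http://localhost:7180/cmf/login on the machine that opened the tunnel.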

Once the deployment is complete, you can access Cloudera Manager at http://<some IP address>:7180/cmf/login.

Cluster Provisioning is executed on the Utility host using CloudInit. That activity is logged in /var/log/cloudera-OCI-initialize.log. This log file can be used to triage cluster setup issues.
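
A minimal sketch of triaging that log is shown here. It scans the file for lines containing a few common failure keywords. The keyword list and the helper name `find_problem_lines` are assumptions; the actual log format is not documented here, so treat the matches as starting points rather than a definitive error report.

```python
from pathlib import Path

LOG_PATH = "/var/log/cloudera-OCI-initialize.log"
# Assumed keywords; the real log may phrase failures differently.
KEYWORDS = ("error", "fail", "traceback")


def find_problem_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    log = Path(LOG_PATH)
    if log.exists():
        with log.open(errors="replace") as handle:
            for number, text in find_problem_lines(handle):
                print(f"{number}: {text}")
    else:
        print(f"{LOG_PATH} not found; run this on the Utility host")
```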

The default username is cm_admin and the default password is changeme. After logging in, you should see a cluster up and running.

If you are presented with a licensing prompt upon login, do not interact with it. Wait and allow additional time for the automated cluster provisioning process to complete, then refresh the page after a few minutes to check on the deployment.

Python Deployment using cm_client

The deployment script deploy_on_oci.py uses cm_client against Cloudera Manager API v31. The script can be customized before execution; see its header section. The following parameters are passed at deployment time, either manually or via the ORM schema:

	admin_user_name
	admin_password

When using the ORM schema, these values are placed in the Utility instance metadata. You are strongly encouraged to change the admin password in Cloudera Manager after deployment is complete.
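
To check that a given set of admin credentials works against the same API version, one option is a plain HTTP request to the Cloudera Manager REST API. The sketch below uses only the Python standard library rather than cm_client, and it only builds the request. The helper name `build_clusters_request` and the example host are assumptions; the credentials shown are the defaults mentioned above.

```python
import base64
import urllib.request


def build_clusters_request(cm_host, user, password, port=7180,
                           api_version="v31"):
    """Build (but do not send) a GET request that lists clusters.

    Cloudera Manager's REST API accepts HTTP basic authentication.
    """
    url = f"http://{cm_host}:{port}/api/{api_version}/clusters"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token.decode('ascii')}")
    return request


if __name__ == "__main__":
    # "10.0.1.5" is a made-up address for illustration.
    req = build_clusters_request("10.0.1.5", "cm_admin", "changeme")
    print(req.get_method(), req.full_url)
    # To send it from a host that can reach Cloudera Manager:
    # with urllib.request.urlopen(req, timeout=10) as response:
    #     print(response.read().decode("utf-8"))
```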

In addition, advanced customization of the cluster deployment is possible by modifying the following functions:

	setup_mgmt_rcg
	update_cluster_rcg_configuration

This requires some knowledge of Python and Cloudera configuration; modify at your own risk. These functions contain Cloudera-specific tuning parameters as well as the host mapping for roles.

Kerberos Secure Cluster Option

This automation supports using a local KDC deployed on the Cloudera Manager instance for secure cluster operation. If you want to use it, read the scripts README for information on how to set these parameters before deployment. This option can be toggled during ORM stack setup using the schema.

For cluster management using Kerberos, you must also manually create, at a minimum, the HDFS Superuser Principal after deployment, as detailed here.

High Availability

High Availability for HDFS services is also offered as part of the deployment process. This can be toggled during ORM stack setup using the Schema.

Metadata and MySQL

You can customize the default root password for MySQL by editing the source script cms_mysql.sh. For the various Cloudera databases, random passwords are generated and used. These are stored in a flat file on the Utility host for use at deployment time. The file is located on the Utility node at /etc/mysql/mysql.pw; remove it after you have recorded or changed the pre-generated passwords.

Object Storage Integration

Object Storage can also be used by setting S3 compatibility parameters in the Python deployment script. Details can be found in the script's header section. For this to work, you will need to set up the appropriate S3 compatibility prerequisites as detailed here.
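
For orientation, the sketch below shows what S3 compatibility settings typically look like: the OCI S3-compatible endpoint for a namespace and region, and the standard Hadoop S3A properties that consume it. The helper names and example values are assumptions, and the parameter names in deploy_on_oci.py may differ, so check the script header for the names it actually uses.

```python
def s3_compat_endpoint(namespace, region):
    """Return the OCI S3-compatible endpoint for a namespace and region."""
    return f"https://{namespace}.compat.objectstorage.{region}.oraclecloud.com"


def s3a_properties(namespace, region, access_key, secret_key):
    """Return Hadoop S3A properties pointing at OCI Object Storage.

    The access and secret keys correspond to an OCI Customer Secret Key.
    """
    return {
        "fs.s3a.endpoint": s3_compat_endpoint(namespace, region),
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        # Address buckets by path rather than by virtual host name.
        "fs.s3a.path.style.access": "true",
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own namespace, region, and keys.
    props = s3a_properties("mytenancy", "us-ashburn-1", "ACCESS", "SECRET")
    for name, value in props.items():
        print(f"{name}={value}")
```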

Architecture Diagram

Here is a diagram showing what is typically deployed using this template. Note that resources are automatically distributed among Fault Domains in an Availability Domain to ensure fault tolerance. Additional workers are striped across the 3 fault domains in sequence, starting with Fault Domain 1.

Deployment Architecture Diagram
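
The striping of workers across fault domains described above can be pictured as a simple round-robin. The sketch below is illustrative only: the helper name `fault_domain_for_worker` and the zero-based worker index are assumptions, and the module's Terraform code may compute the placement differently.

```python
# OCI names the three fault domains in each availability domain like this.
FAULT_DOMAINS = ("FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3")


def fault_domain_for_worker(worker_index):
    """Return the fault domain for a zero-based worker index (round-robin)."""
    return FAULT_DOMAINS[worker_index % len(FAULT_DOMAINS)]


if __name__ == "__main__":
    for index in range(5):
        print(f"worker {index + 1}: {fault_domain_for_worker(index)}")
```

Under this scheme the fourth worker lands back in Fault Domain 1, so losing one fault domain affects roughly a third of the workers.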
