All Projects → IBM → Scalable Cassandra Deployment On Kubernetes

IBM / Scalable Cassandra Deployment On Kubernetes

Licence: apache-2.0
In this code we provide a full roadmap the deployment of a multi-node scalable Cassandra cluster on Kubernetes. Cassandra understands that it is running within a cluster manager, and uses this cluster management infrastructure to help implement the application. Kubernetes concepts like Replication Controller, StatefulSets etc. are leveraged to deploy either non-persistent or persistent Cassandra clusters on Kubernetes cluster.

Programming Languages

shell
77523 projects

Projects that are alternatives of or similar to Scalable Cassandra Deployment On Kubernetes

Kubernetes Container Service Gitlab Sample
This code shows how a common multi-component GitLab can be deployed on Kubernetes cluster. Each component (NGINX, Ruby on Rails, Redis, PostgreSQL, and more) runs in a separate container or group of containers.
Stars: ✭ 240 (+30.43%)
Mutual labels:  bluemix, ibmcode, containers
Gameon Java Microservices On Kubernetes
This code demonstrates deployment of a Microservices based application Game On! on to Kubernetes cluster. Game On! is a throwback text-based adventure built to help you explore microservice architectures and related concepts.
Stars: ✭ 88 (-52.17%)
Mutual labels:  bluemix, ibmcode, containers
Scalable Wordpress Deployment On Kubernetes
This code showcases the full power of Kubernetes clusters and shows how can we deploy the world's most popular website framework on top of world's most popular container orchestration platform.
Stars: ✭ 173 (-5.98%)
Mutual labels:  bluemix, ibmcode, containers
node-red-dsx-workflow
This journey helps to build a complete end-to-end analytics solution using IBM Watson Studio. This repository contains instructions to create a custom web interface to trigger the execution of Python code in Jupyter Notebook and visualise the response from Jupyter Notebook on IBM Watson Studio.
Stars: ✭ 26 (-85.87%)
Mutual labels:  bluemix, ibmcode
banking-digitalization-using-hybrid-cloud-with-mainframes
The following journey will introduce the available Banking APIs published on IBM Cloud with logical business programs running on the IBM Z Mainframe through a simulated retail bank called MPLbank.
Stars: ✭ 21 (-88.59%)
Mutual labels:  bluemix, ibmcode
watson-multimedia-analyzer
WARNING: This repository is no longer maintained ⚠️ This repository will not be updated. The repository will be kept available in read-only mode. A Node app that use Watson Visual Recognition, Speech to Text, Natural Language Understanding, and Tone Analyzer to enrich media files.
Stars: ✭ 23 (-87.5%)
Mutual labels:  bluemix, ibmcode
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-67.39%)
Mutual labels:  bluemix, ibmcode
detect-timeseriesdata-change
WARNING: This repository is no longer maintained ⚠️ This repository will not be updated. The repository will be kept available in read-only mode.
Stars: ✭ 21 (-88.59%)
Mutual labels:  bluemix, ibmcode
ibm-cloud-functions-serverless-ocr-openchecks
Serverless bank check deposit processing with object storage and optical character recognition using Apache OpenWhisk powered by IBM Cloud Functions. See the Tech Talk replay for a demo.
Stars: ✭ 40 (-78.26%)
Mutual labels:  bluemix, ibmcode
Drupal Nginx Php Kubernetes
Demonstration of a set of NGINX and PHP-FPM containers running Drupal deployed to Kubernetes on the IBM Container Service. This is a work in progress.
Stars: ✭ 43 (-76.63%)
Mutual labels:  bluemix, containers
Serverless Home Automation
WARNING: This repository is no longer maintained ⚠️ This repository will not be updated. The repository will be kept available in read-only mode.
Stars: ✭ 38 (-79.35%)
Mutual labels:  bluemix, ibmcode
Cognitive Social Crm
An application that monitors a Twitter feed and determines customer sentiment using IBM Watson Assistant, Tone Analyzer, Natural Language Understanding, as well as CloudantDB
Stars: ✭ 71 (-61.41%)
Mutual labels:  bluemix, ibmcode
Spring Boot Microservices On Kubernetes
In this code we demonstrate how a simple Spring Boot application can be deployed on top of Kubernetes. This application, Office Space, mimicks the fictitious app idea from Michael Bolton in the movie "Office Space".
Stars: ✭ 504 (+173.91%)
Mutual labels:  ibmcode, containers
Cognitiveconcierge
WARNING: This repository is no longer maintained ⚠️ This repository will not be updated. The repository will be kept available in read-only mode.
Stars: ✭ 100 (-45.65%)
Mutual labels:  bluemix, ibmcode
Choerodon
Open Source Multi-Cloud Integrated Platform
Stars: ✭ 2,149 (+1067.93%)
Mutual labels:  containers
Runtime
Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
Stars: ✭ 2,103 (+1042.93%)
Mutual labels:  containers
Eshoponazure
Azure version of the eShopOnContainers, with implementations based on Azure services.
Stars: ✭ 171 (-7.07%)
Mutual labels:  containers
Bitnami Docker Redmine
Bitnami Docker Image for Redmine
Stars: ✭ 172 (-6.52%)
Mutual labels:  containers
Dlib
Allocators, I/O streams, math, geometry, image and audio processing for D
Stars: ✭ 182 (-1.09%)
Mutual labels:  containers
Dockercon19
DockerCon "Docker for Node.js" examples
Stars: ✭ 176 (-4.35%)
Mutual labels:  containers

Build Status

Scalable multi-node Cassandra deployment on Kubernetes Cluster

Read this in other languages: 한국어中国.

This project demonstrates the deployment of a multi-node scalable Cassandra cluster on Kubernetes. Apache Cassandra is a massively scalable open source NoSQL database. Cassandra is perfect for managing large amounts of structured, semi-structured, and unstructured data across multiple datacenters and the cloud.

Leveraging Kubernetes concepts such as PersistentVolume and StatefulSets, we can provide a resilient installation of Cassandra and be confident that its data (state) are safe.

We also utilize a "headless" service for Cassandra. This way we can provide a way for applications to access it via KubeDNS and not expose it to the outside world. To access it from your developer workstation you can use kubectl exec commands against any of the cassandra pods. If you do wish to connect an application to it you can use the KubeDNS value of cassandra.default.svc.cluster.local when configuring your application.

kube-cassandra

Kubernetes Concepts Used

Included Components

Getting Started

In order to follow this guide you'll need a Kubernetes cluster. If you do not have access to an existing Kubernetes cluster then follow the instructions (in the link) for one of the following:

The code here is regularly tested against Kubernetes Cluster from Bluemix Container Service using Travis CI.

After installing (or setting up your access to) Kubernetes ensure that you can access it by running the following and confirming you get version responses for both the Client and the Server:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-09-18T20:30:29Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Create a Cassandra Service for Cassandra cluster formation and discovery

1. Create a Cassandra Headless Service

To allow us to do simple discovery of the cassandra seed node (which we will deploy shortly) we can create a "headless" service. We do this by specifying none for the clusterIP in the cassandra-service.yaml. This headless service allows us to use KubeDNS for the Pods to discover the IP address of the Cassandra seed.

You can create the headless service using the cassandra-service.yaml file:

$ kubectl create -f cassandra-service.yaml
service "cassandra" created
$ kubectl get svc cassandra
NAME        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
cassandra   None         <none>        9042/TCP   10s

Most applications deployed to Kubernetes should be cloud native and rely on external resources for their data (or state). However since Cassandra is a database we can use Stateful sets and Persistent Volumes to ensure resiliency in our database.

2. Create Local Volumes

To create persistent Cassandra nodes, we need to provision Persistent Volumes. There are two ways to provision PV's: dynamically and statically.

For the sake of simplicity and compatibility we will use Static provisioning where we will create volumes manually using the provided yaml files.

note: You'll need to have the same number of Persistent Volumes as the number of your Cassandra nodes. If you are expecting to have 3 Cassandra nodes, you'll need to create 3 Persistent Volumes.

The provided local-volumes.yaml file already has 3 Persistent Volumes defined. Update the file to add more if you expect to have greater than 3 Cassandra nodes. Create the volumes:

$ kubectl create -f local-volumes.yaml
$ kubectl get pv
NAME               CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM     STORAGECLASS   REASON    AGE
cassandra-data-1   1Gi        RWO           Recycle         Available                                      7s
cassandra-data-2   1Gi        RWO           Recycle         Available                                      7s
cassandra-data-3   1Gi        RWO           Recycle         Available                                      7s

3. Create a StatefulSet

The StatefulSet is responsible for creating the Pods. It provides ordered deployment, ordered termination and unique network names. Run the following command to start a single Cassandra server:

$ kubectl create -f cassandra-statefulset.yaml

4. Validate the StatefulSet

You can check if your StatefulSet has deployed using the command below.

$ kubectl get statefulsets
NAME        DESIRED   CURRENT   AGE
cassandra   1         1         2h

If you view the list of the Pods, you should see 1 Pod running. Your Pod name should be cassandra-0 and the next pods would follow the ordinal number (cassandra-1, cassandra-2,..) Use this command to view the Pods created by the StatefulSet:

$ kubectl get pods -o wide
NAME          READY     STATUS    RESTARTS   AGE       IP              NODE
cassandra-0   1/1       Running   0          1m        172.xxx.xxx.xxx   169.xxx.xxx.xxx

To check if the Cassandra node is up, perform a nodetool status:

$ kubectl exec -ti cassandra-0 -- nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens       Owns (effective)   Host ID                               Rack
UN  172.xxx.xxx.xxx  109.28 KB  256          100.0%             6402e90d-7995-4ee1-bb9c-36097eb2c9ec  Rack1

5. Scale the StatefulSet

To increase or decrease the size of your StatefulSet you can use the scale command:

$ kubectl scale --replicas=3 statefulset/cassandra

Wait a minute or two and check if it worked:

$ kubectl get statefulsets
NAME        DESIRED   CURRENT   AGE
cassandra   3         3         2h

If you watch the Cassandra pods deploy, they should be created sequentially.

You can view the list of the Pods again to confirm that your Pods are up and running.

$ kubectl get pods -o wide
NAME          READY     STATUS    RESTARTS   AGE       IP                NODE
cassandra-0   1/1       Running   0          13m       172.xxx.xxx.xxx   169.xxx.xxx.xxx
cassandra-1   1/1       Running   0          38m       172.xxx.xxx.xxx   169.xxx.xxx.xxx
cassandra-2   1/1       Running   0          38m       172.xxx.xxx.xxx   169.xxx.xxx.xxx

You can perform a nodetool status to check if the other cassandra nodes have joined and formed a Cassandra cluster.

Note: It can take around 5 minutes for the Cassandra database to finish its setup.

$ kubectl exec -ti cassandra-0 -- nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
UN  172.xxx.xxx.xxx  103.25 KiB  256          68.7%             633ae787-3080-40e8-83cc-d31b62f53582  Rack1
UN  172.xxx.xxx.xxx  108.62 KiB  256          63.5%             e95fc385-826e-47f5-a46b-f375532607a3  Rack1
UN  172.xxx.xxx.xxx  177.38 KiB  256          67.8%             66bd8253-3c58-4be4-83ad-3e1c3b334dfd  Rack1

You will need to wait for the status of the nodes to be Up and Normal (UN) to execute the commands in the next steps.

6. Using CQL

You can access the cassandra container using the following command:

kubectl exec -it cassandra-0 cqlsh
Connected to Cassandra at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> describe tables

Keyspace system_traces
----------------------
events  sessions

Keyspace system_schema
----------------------
tables     triggers    views    keyspaces  dropped_columns
functions  aggregates  indexes  types      columns

Keyspace system_auth
--------------------
resource_role_permissons_index  role_permissions  role_members  roles

Keyspace system
---------------
available_ranges          peers               batchlog        transferred_ranges
batches                   compaction_history  size_estimates  hints
prepared_statements       sstable_activity    built_views
"IndexInfo"               peer_events         range_xfers
views_builds_in_progress  paxos               local

Keyspace system_distributed
---------------------------
repair_history  view_build_status  parent_repair_history

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ

Troubleshooting

  • If your Cassandra instance is not running properly, you may check the logs using
    • kubectl logs <your-pod-name>
  • To clean/delete your data on your Persistent Volumes, delete your PVCs using
    • kubectl delete pvc -l app=cassandra
  • If your Cassandra nodes are not joining, delete your controller/statefulset then delete your Cassandra service.
    • kubectl delete statefulset cassandra if you created the Cassandra StatefulSet
    • kubectl delete svc cassandra
  • To delete everything:
    • kubectl delete statefulset,pvc,pv,svc -l app=cassandra

References

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].