sunilsoni / Cassandra-Data-Modeling

Basic Rules of Cassandra Data Modeling

Cassandra Data Modeling

Introduction:

Cassandra is a partitioned row store, where rows are organized into tables with a required primary key.

The first component of a table’s primary key is the partition key; within a partition, rows are clustered by the remaining columns of the primary key. Other columns may be indexed independently of the primary key.

This allows pervasive denormalization to "pre-build" result sets at update time, rather than doing expensive joins across the cluster.
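To make the partition key / clustering column split concrete, here is a minimal in-memory sketch. It is a toy model, not how Cassandra stores data: `ToyTable`, `insert`, and `range_read` are hypothetical names, and the sensor values are made up for illustration.

```python
from bisect import insort
from collections import defaultdict


class ToyTable:
    """Toy model: partition key -> rows kept sorted by clustering key."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        # Rows inside one partition stay ordered by the clustering key.
        insort(self.partitions[partition_key], (clustering_key, value))

    def range_read(self, partition_key, low, high):
        # A range read touches a single partition and returns rows
        # in clustering order, with no join across partitions.
        rows = self.partitions.get(partition_key, [])
        return [(c, v) for c, v in rows if low <= c <= high]


table = ToyTable()
table.insert("sensor-a", 30, 21.5)
table.insert("sensor-a", 10, 20.1)
table.insert("sensor-b", 20, 18.0)
table.insert("sensor-a", 20, 20.9)

rows = table.range_read("sensor-a", 10, 20)
print(rows)
```

The read for `sensor-a` never looks at the `sensor-b` partition, which is the property that makes a denormalized, query-shaped table cheap to read.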

Cassandra Data Modeling - Practical Considerations @ Netflix

Sample Tables:

CREATE TABLE sensor_readings (
    sensorID uuid,
    time_bucket int,
    timestamp bigint,
    reading decimal,
    PRIMARY KEY ((sensorID, time_bucket), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

SELECT * FROM sensor_readings
WHERE sensorID = 53755080-4676-11e4-916c-0800200c9a66
  AND time_bucket IN (1411840800, 1411844400)
  AND timestamp >= 1411841700
  AND timestamp <= 1411845300;
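The query above has to name every time bucket that overlaps the requested time range. A small sketch of how a client might compute that list follows. It assumes buckets are hour-aligned epoch seconds, which is consistent with the values in the query but not stated in the source; `buckets_for_range` is a hypothetical helper.

```python
def buckets_for_range(start_ts, end_ts, bucket_seconds=3600):
    """Return the bucket start times (epoch seconds) covering [start_ts, end_ts].

    Assumption: a bucket is identified by its start time, rounded down
    to a multiple of bucket_seconds.
    """
    if end_ts < start_ts:
        raise ValueError("end_ts must not be earlier than start_ts")
    first = start_ts - start_ts % bucket_seconds
    last = end_ts - end_ts % bucket_seconds
    return list(range(first, last + 1, bucket_seconds))


buckets = buckets_for_range(1411841700, 1411845300)
print(buckets)  # [1411840800, 1411844400]
```

The result matches the two values passed to `time_bucket IN (...)` in the query, so each bucket maps to one partition read.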

CREATE TABLE IF NOT EXISTS ${keyspace}.traces (
    trace_id blob,
    span_id bigint,
    span_hash bigint,
    parent_id bigint,
    operation_name text,
    flags int,
    start_time bigint,
    duration bigint,
    tags list<frozen<keyvalue>>,
    logs list<frozen<log>>,
    refs list<frozen<span_ref>>,
    process frozen<process>,
    PRIMARY KEY (trace_id, span_id, span_hash)
) WITH compaction = {
        'compaction_window_size': '1',
        'compaction_window_unit': 'HOURS',
        'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'
    }
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = ${trace_ttl}
    AND speculative_retry = 'NONE'
    AND gc_grace_seconds = 10800; -- 3 hours of downtime acceptable on nodes

CREATE TABLE IF NOT EXISTS ${keyspace}.duration_index (
    service_name text,      // service name
    operation_name text,    // operation name, or blank for queries without span name
    bucket timestamp,       // time bucket, the start_time of the given span rounded to an hour
    duration bigint,        // span duration, in microseconds
    start_time bigint,
    trace_id blob,
    PRIMARY KEY ((service_name, operation_name, bucket), duration, start_time, trace_id)
) WITH CLUSTERING ORDER BY (duration DESC, start_time DESC)
    AND compaction = {
        'compaction_window_size': '1',
        'compaction_window_unit': 'HOURS',
        'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'
    }
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = ${trace_ttl}
    AND speculative_retry = 'NONE'
    AND gc_grace_seconds = 10800; -- 3 hours of downtime acceptable on nodes

Sequential writes can cause hot spots: if the application tends to write or update a sequential block of rows at a time, and those rows share a partition key (for example, a key derived only from the current time), the writes are not distributed across the cluster; they all go to one node. This is frequently a problem for applications dealing with timestamped data.
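The effect can be illustrated with a rough simulation. This is only a sketch: `NODES` and `node_for` are hypothetical, and hashing with MD5 modulo the node count is a stand-in for Cassandra's real partitioner (Murmur3 with token ranges by default). It compares a partition key made of the time bucket alone with a composite key of sensor and time bucket, as in the sensor_readings table.

```python
import hashlib

NODES = 4  # hypothetical cluster size


def node_for(partition_key):
    # Placeholder for the partitioner: hash the key, map it to a node.
    # Real Cassandra uses token ranges, not a simple modulo.
    digest = hashlib.md5(repr(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODES


bucket = 1411840800  # all writes fall in the same hour
sensors = [f"sensor-{i}" for i in range(100)]

# Partition key = time bucket only: every write hashes to the same node.
time_only_nodes = {node_for(bucket) for _ in sensors}

# Partition key = (sensor, time bucket): writes hash to different nodes.
composite_nodes = {node_for((sensor, bucket)) for sensor in sensors}

print(len(time_only_nodes), len(composite_nodes))
```

With the time-only key, `time_only_nodes` holds a single node because every write carries an identical key. With the composite key, the same hundred writes spread over several nodes.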
