Licence: other
Janus + Elastic Search + Cassandra docker container with SSL Client Certificates implemented.

Jelass (jĕl′əs), a Linearly Scalable, Searchable, NoSQL and Graph Database

With a database like this, all your friends will be jealous.

Jelass = JanusGraph + Elassandra (Elasticsearch + Cassandra)

Elassandra stores Elasticsearch data in Cassandra, so nothing is stored twice. Cassandra is the boss. Elasticsearch runs on top of it and makes the data useful (searchable, queryable, etc.). JanusGraph comes to town and adds all the graph functionality LinkedIn could ever need. All under one roof.

How is this different from straight JanusGraph? In a standard JanusGraph setup, the Elasticsearch index data isn't stored in Cassandra. Here all three live together. Bullet-proof.

Download Docker Image

https://hub.docker.com/repository/docker/sfproductlabs/jelass

Running in docker-compose

See the example at https://github.com/sfproductlabs/tracker/blob/master/docker-compose.yml

Ensure the Docker host has enough memory: the container runs Elassandra and JanusGraph, both of which are JVM-based.

Connecting

  • Cassandra: cqlsh --ssl
  • Gremlin, remotely: ./bin/gremlin.sh then :remote connect tinkerpop.server conf/remote.yaml
  • Gremlin, locally: ./bin/gremlin.sh then graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
  • etc.

Starting out

Starting with JanusGraph

Then try the basic demo:

On the machine hosting Docker, run:

docker ps
# then replace [container_number] with your container ID
docker exec -it [container_number] bash
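
If you would rather not copy the container ID by hand, the lookup can be scripted. The following is only a sketch: the function name jelass_container_id is made up for illustration, and it assumes the container was started from an image named sfproductlabs/jelass (pass a different image name, including the tag if you use one, as the first argument).

```shell
# jelass_container_id [IMAGE]
# Print the ID of the first running container started from IMAGE.
# The default image name is an assumption; adjust it to the image you actually run.
jelass_container_id() {
    image=${1:-sfproductlabs/jelass}
    # -q prints only container IDs; the ancestor filter matches on the image
    docker ps -q --filter "ancestor=${image}" | head -n 1
}

# Usage (commented out so that sourcing this file has no side effects):
# docker exec -it "$(jelass_container_id)" bash
```

If nothing is printed, no running container matches the image name, and plain docker ps remains the fallback.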

Then inside the docker container:

cd /app/ela/janusgraph-full-0.5.2
./bin/gremlin.sh

Then, inside the gremlin> console (this also works remotely, though you may need to change the IP):

cluster = Cluster.open('conf/remote-objects.yaml')
graph = EmptyGraph.instance()
g = graph.traversal().withRemote(DriverRemoteConnection.using(cluster, "g"))
// graph = EmptyGraph.instance()
// g = graph.traversal().withRemote('conf/remote-graph.properties')
// TinkerPop Predicates
g.V().has('age',within(5000))
g.V().has('age',without(5000))
g.V().has('age',within(5000,45))
g.V().has('age',inside(45,5000)).valueMap(true)
g.V().and(has('age',between(45,5000)),has('name',within('pluto'))).valueMap(true)
g.V().or(has('age',between(45,5000)),has('name',within('pluto','neptune'))).valueMap(true)

// Janus Graph Geo Predicates
g.E().has('place', geoIntersect(Geoshape.circle(37.97, 23.72, 50)))
g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50)))
g.E().has('place', geoDisjoint(Geoshape.circle(37.97, 23.72, 50)))

// master branch only
g.addV().property('place', Geoshape.circle(37.97, 23.72, 50))
g.V().has('place', geoContains(Geoshape.point(37.97, 23.72)))

// Janus Graph Text Predicates
g.V().has('name',textContains('neptune')).valueMap(true)
g.V().has('name',textContainsPrefix('nep')).valueMap(true)
g.V().has('name',textContainsRegex('nep.*')).valueMap(true)
g.V().has('name',textPrefix('n')).valueMap(true)
g.V().has('name',textRegex('.*n.*')).valueMap(true)

// master branch only
g.V().has('name',textContainsFuzzy('neptun')).valueMap(true)
g.V().has('name',textFuzzy('nepitne')).valueMap(true)

You can also run the examples locally:

graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
GraphOfTheGodsFactory.load(graph)
g = graph.traversal()
saturn = g.V().has('name', 'saturn').next()
g.V(saturn).valueMap()
g.V(saturn).in('father').in('father').values('name')

//Add a fulltext index on a new property alias
mgmt = graph.openManagement()
summary = mgmt.makePropertyKey('alias').dataType(String.class).make()
mgmt.buildIndex('alias', Vertex.class).addKey(summary, Mapping.TEXTSTRING.asParameter()).buildMixedIndex("search")
mgmt.commit()
g.addV('person').property('alias','bob')
g.V().has('alias', textContains('bob')).hasNext()
graph.tx().commit()

Dropping a graph

./bin/gremlin.sh

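graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
g = graph.traversal()
// remove every vertex (and its edges) but keep the schema
g.V().drop().iterate()
graph.tx().commit()

or, to delete the graph entirely, including its schema and index data:

JanusGraphFactory.drop(graph);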

Checking Schema

mgmt = graph.openManagement()
mgmt.printSchema()

Using Elassandra (Cassandra + Elasticsearch)

https://elassandra.readthedocs.io/

Post-install

In a production environment, we recommend modifying some system settings, such as disabling swap. This guide shows you how to do it. On Linux, you should install jemalloc.

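Optimizing for JanusGraph Batch Writes

Set up batch loading for the service. Note that schema.default=none disables automatic schema creation, so property keys and labels must be defined through the management API before loading: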

echo "storage.batch-loading=true" >> ./conf/gremlin-server/janusgraph-cql-es-server.properties
echo "schema.default=none" >> ./conf/gremlin-server/janusgraph-cql-es-server.properties
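
The echo >> commands above append a new line every time they run, so running them twice leaves duplicate entries in the properties file. A minimal sketch of an idempotent alternative follows; the helper name set_prop is hypothetical and not part of this image.

```shell
# set_prop FILE KEY VALUE
# Replace any existing "KEY=..." line in FILE, or append one if it is absent.
# Hypothetical helper, not shipped with the container.
set_prop() {
    file=$1
    key=$2
    value=$3
    [ -f "$file" ] || return 1
    tmp=$(mktemp) || return 1
    # Keep every line that does not start with the literal text "KEY="
    awk -v k="${key}=" 'index($0, k) != 1' "$file" > "$tmp"
    # Append the new assignment at the end
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (commented out so that sourcing this file has no side effects):
# set_prop ./conf/gremlin-server/janusgraph-cql-es-server.properties storage.batch-loading true
# set_prop ./conf/gremlin-server/janusgraph-cql-es-server.properties schema.default none
```

The updated key always moves to the end of the file, which should not matter for a properties file because each key now appears only once.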

Visualization of Janus

docker run -p 8889:8888 -d --name graph-explorer sfproductlabs/graph-explorer:latest

Open the URL: http://localhost:8889

Then connect to: ws://localhost:8182/gremlin

The creator wrote a great little CRUD intro.

After you have created your first few nodes and edges, try this in the query editor:

nodes=g.V().toList();edges=g.E().toList();[nodes,edges]

Cassandra Tools

https://cassandra.apache.org/third-party/

Backup & Restore

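Back up a single instance (the example uses the keyspace scrp; replace it with your keyspace name):

# save the schema
cqlsh --ssl -e "desc scrp" > /tmp/scrp.cql
# take a snapshot; nodetool prints the snapshot name (1603309754293 in this example)
nodetool snapshot scrp
cd /var/lib/cassandra
# archive only the files that belong to that snapshot
tar -czvf /tmp/scrp.tgz $(find . -type f | grep 1603309754293)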
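
The tar command above hard-codes the snapshot name 1603309754293, which changes on every run. One way around this, sketched below, is to give the snapshot an explicit name with nodetool snapshot -t and select files by that name. The helper snapshot_files and the tag scrp-backup are hypothetical, and the sketch assumes the standard layout in which Cassandra keeps snapshots under <table>/snapshots/<name>/.

```shell
# snapshot_files DATA_DIR TAG
# List, relative to DATA_DIR, every file that belongs to the snapshot named TAG.
# Hypothetical helper; assumes snapshots live under <table>/snapshots/<TAG>/.
snapshot_files() {
    data_dir=$1
    tag=$2
    # Run in a subshell so the caller's working directory is untouched
    (cd "$data_dir" && find . -type f -path "*/snapshots/${tag}/*")
}

# Usage (commented out; requires a running node and GNU tar for "-T -"):
# TAG=scrp-backup
# nodetool snapshot -t "$TAG" scrp
# snapshot_files /var/lib/cassandra "$TAG" | tar -czvf /tmp/scrp.tgz -C /var/lib/cassandra -T -
```

Because the paths are relative to /var/lib/cassandra, the archive unpacks into the same data/scrp/... structure that the restore steps expect.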

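Restore the instance by unpacking the archive into a scratch directory and streaming the SSTables back into the cluster:

cd /tmp/
# recreate the schema first
cqlsh --ssl -f /tmp/scrp.cql
tar -xzvf /tmp/scrp.tgz
cd /tmp/data/
# move each snapshot file up two levels, from <table>/snapshots/<name>/ into <table>/
find . -type f -execdir mv {} ../.. \;
cd scrp
# stream each table; replace 172.19.0.3 with the address of one of your nodes
for x in *;do sstableloader -v --conf-path /etc/cassandra/cassandra.yaml -d 172.19.0.3 /tmp/data/scrp/$x;done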

Consolidating and Managing Backups across Datacenters

I personally use a grandfather-father-son rotation for backups, using a tool called Borg:

https://www.borgbackup.org/demo.html

Diagnostics

curl -XGET "http://$CASSANDRA_HOST:9200/_cluster/state?pretty"
nodetool repair -full
nodetool cleanup
nodetool flush
#nodetool rebuild_index sfpla events_recent events_recent_idx
nodetool gossipinfo
nodetool tpstats
nodetool describecluster
nodetool statusthrift
nodetool statusgossip
nodetool ring
nodetool status
nodetool status elastic_admin
nodetool cfstats | grep read | grep latency
#less /var/log/cassandra/system.log
# ...
#cqlsh --ssl
#cqlsh>select * from elastic_admin.Metadata_log;

Using Python

https://docs.janusgraph.org/connecting/python/

  • TODO: Connecting to spark/superset

TODO

  • TODO: Visualization in Elassandra. Superset. Spark.

Running & Ready for Production

  • Docker with SSL by default
  • Nginx SSL for Elasticsearch (available on ports 443 and 9343 via an Nginx reverse proxy)
  • Cassandra client and server keystores by default
  • TODO: add nginx streaming SSL for tinkerpop on 8182

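Make sure to update the replication factor of the "elastic_admin" keyspace to suit your cluster. The datacenter name ('DC1' here) must match the one reported by nodetool status, and after raising the replication factor you should run nodetool repair -full elastic_admin on each node.

Ex. alter keyspace elastic_admin WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 2};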
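
For clusters with more than one datacenter, the statement can be generated instead of typed by hand. This is a rough sketch only: the helper replication_cql and the datacenter name DC2 are made up for illustration, and the statement it prints follows the same form as the example above.

```shell
# replication_cql KEYSPACE DC=RF [DC=RF ...]
# Print an ALTER KEYSPACE statement using NetworkTopologyStrategy.
# Hypothetical helper; datacenter names must match your cluster.
replication_cql() {
    keyspace=$1
    shift
    opts="'class': 'NetworkTopologyStrategy'"
    for pair in "$@"; do
        dc=${pair%%=*}   # text before the first "="
        rf=${pair#*=}    # text after the first "="
        opts="${opts}, '${dc}': ${rf}"
    done
    printf 'ALTER KEYSPACE %s WITH replication = {%s};\n' "$keyspace" "$opts"
}

# Usage (commented out; requires a running node):
# cqlsh --ssl -e "$(replication_cql elastic_admin DC1=2 DC2=2)"
```

Printing the statement first, rather than running it directly, makes it easy to review before it touches the cluster.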
