sunsided / janusgraph-docker

Licence: other
Yet another JanusGraph, Cassandra/Scylla and Elasticsearch in Docker Compose setup

JanusGraph (in Docker) - Lessons learned

Docker deployment of JanusGraph. To run it, use

docker-compose up --build

Note that a version of Docker Compose with support for version 3 schemas is required, e.g. 1.15.0 or newer.
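
The version requirement above can be checked programmatically. The following is a minimal sketch, assuming the 1.15.0 threshold mentioned in the note and a version string such as the one printed by `docker-compose version --short`; the function names are made up for illustration.

```python
import re

# Minimum Docker Compose version, taken from the note above
# (an example threshold from this README, not an official statement).
MIN_COMPOSE_VERSION = (1, 15, 0)


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def compose_is_new_enough(version_text, minimum=MIN_COMPOSE_VERSION):
    """Return True if the reported version is at least `minimum`."""
    # Tuples compare element by element, so (1, 29, 2) >= (1, 15, 0).
    return parse_version(version_text) >= minimum
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where "1.9.0" would wrongly sort after "1.15.0".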

Afterwards, you can connect to the local Gremlin shell using

docker-compose exec janus ./bin/gremlin.sh

The python-test subdirectory contains some simple Python scripts to test communication with JanusGraph.

The sources for the Dockerfile and its surroundings were taken more or less directly from Titan setups:

For multiple graphs in Titan (and likely also JanusGraph), follow these links:

Scylla/Cassandra and Elasticsearch

According to the JanusGraph compatibility matrix, the supported Cassandra version is 3.11 and the supported Elasticsearch version is 6.6. This repository uses Scylla instead of Cassandra; according to the Scylla Cassandra compatibility matrix, Scylla 3.0 is a drop-in replacement for Cassandra 3.11.

The last commit in this repo that used Cassandra is 39c537de03a1bb7a65138b535df1ff003e8c4ec6, in case you are interested.

Gremlin Shell

This Docker example loads an airline graph and exposes it as graph g; see scripts/airlines-sample.groovy.

Open the Gremlin shell in Docker by running e.g.

docker-compose exec janus ./bin/gremlin.sh

You should be greeted by the Gremlin REPL shell:

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.utilities
plugin activated: aurelius.titan
plugin activated: tinkerpop.tinkergraph
gremlin>

From here, connect to JanusGraph with a session, then forward all commands to the remote server using :remote console (this allows skipping the :> syntax that's required otherwise):

:remote connect tinkerpop.server conf/remote.yaml session
:remote console

You will find the airlines data exposed as graph g. We can inspect the vertex count by running e.g.

g.V().count()

This should return a value of 47. Note that after restarting, the graph is imported again, resulting in data duplication. To drop all vertices and edges - and then re-import from scratch - we can run

g.V().drop().iterate()
airlines.io(graphml()).readGraph('data/air-routes-small.graphml')
g.tx().commit()

To build an index over the code property, run

mgmt = airlines.openManagement()
code = mgmt.getPropertyKey('code')
mgmt.buildIndex('byCodeUnique', Vertex.class).addKey(code).unique().buildCompositeIndex()
mgmt.commit()
airlines.tx().commit()

We can then - for example - get all properties of the vertex with code JFK:

g.V().has('code', 'JFK').valueMap()

This should return:

==>{code=[JFK], type=[airport], desc=[New York John F. Kennedy International Airport], country=[US], longest=[14511], city=[New York], elev=[12], icao=[KJFK], lon=[-73.77890015], region=[US-NY], runways=[4], lat=[40.63980103]}

We could now run path queries, e.g. find a path between Honolulu International and Houston Hobby and return the airport codes and city names:

g.V().has('code', 'HNL').repeat(out().simplePath()).until(has('code', 'HOU')).path().by(valueMap('code', 'city')).limit(1)

This should return:

==>path[{code=[HNL], city=[Honolulu]}, {code=[DFW], city=[Dallas]}, {code=[HOU], city=[Houston]}]
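
To illustrate what the repeat/simplePath/until combination does, here is a rough pure-Python analogue on a hypothetical three-airport route map. It does not talk to JanusGraph, the route data is made up to mirror the result above, and it uses a breadth-first search, which is not necessarily the order in which the server explores the graph.

```python
from collections import deque

# Hypothetical miniature route map; the real data lives in the imported graph.
ROUTES = {
    "HNL": ["DFW"],
    "DFW": ["HNL", "HOU"],
    "HOU": ["DFW"],
}


def first_simple_path(routes, start, goal):
    """Return the first path from start to goal that never revisits an airport."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for neighbour in routes.get(path[-1], []):  # out()
            if neighbour in path:                   # simplePath(): no repeated vertices
                continue
            extended = path + [neighbour]
            if neighbour == goal:                   # until(has('code', goal))
                return extended                     # limit(1): stop at the first hit
            queue.append(extended)
    return None


print(first_simple_path(ROUTES, "HNL", "HOU"))  # ['HNL', 'DFW', 'HOU']
```

Without the simplePath() check, the search would bounce between HNL and DFW indefinitely on graphs where the goal is unreachable, which is why the filter matters in the Gremlin query as well.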

To leave the shell, type :quit.

Channelizers

You have to choose the Channelizer to work with, e.g. HttpChannelizer, WebSocketChannelizer or WsAndHttpChannelizer/JanusGraphWsAndHttpChannelizer.

Using the JanusGraphWsAndHttpChannelizer

channelizer: org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer

allows HTTP access to JanusGraph, e.g. to determine 100 - 1 (hint: it's 99):

curl "http://localhost:8182/?gremlin=100-1"

or running complete queries (URL encoded):

curl "http://localhost:8182/?gremlin=g.V().has(%27code%27,%20%27JFK%27).valueMap()"

... which is a bit clearer when using a JSON POST:

curl -X POST http://localhost:8182/ \
  -H 'Content-Length: 52' \
  -H 'Content-Type: application/json' \
  -H 'Host: localhost:8182' \
  -d '{ "gremlin": "g.V().has('\''code'\'', '\''JFK'\'').valueMap()" }'
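
The same GET and POST requests can be built from Python using only the standard library. This is a minimal sketch, assuming the endpoint from the curl examples above; the helper names are made up for illustration, and run() only works while the Docker stack is up.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Endpoint taken from the curl examples above.
GREMLIN_HTTP_URL = "http://localhost:8182/"


def build_get_request(gremlin, base_url=GREMLIN_HTTP_URL):
    """Build a GET request with the query percent-encoded into ?gremlin=..."""
    # safe='' makes quote() encode every reserved character,
    # including quotes, commas and parentheses.
    return Request(f"{base_url}?gremlin={quote(gremlin, safe='')}")


def build_post_request(gremlin, base_url=GREMLIN_HTTP_URL):
    """Build a JSON POST request; urllib adds Content-Length and Host itself."""
    body = json.dumps({"gremlin": gremlin}).encode("utf-8")
    return Request(base_url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


def run(request):
    """Send the request and decode the JSON response (needs the running stack)."""
    with urlopen(request) as response:
        return json.load(response)
```

For example, run(build_post_request("g.V().has('code', 'JFK').valueMap()")) should send the same query as the curl POST above. Letting json.dumps produce the body sidesteps both the shell quoting and the hand-computed Content-Length.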

Python connectivity

To test connectivity with Python, try out the scripts in the python-test/ directory. A conda environment is provided in environment.yaml:

conda env create -f environment.yaml
conda activate janusgraph

Try running the gremlinpython example:

python test_gremlin_python.py

This should output:

Hop 1: HNL - Honolulu
Hop 2: DFW - Dallas
Hop 3: HOU - Houston
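
For orientation, a script producing this kind of output could look roughly like the sketch below. It is not the repository's actual test_gremlin_python.py; the WebSocket URL, the traversal source name g and the camelCase gremlinpython API (as found in TinkerPop 3.4-era releases) are assumptions.

```python
def format_hops(path_objects):
    """Turn valueMap-style dicts into 'Hop N: CODE - City' lines."""
    lines = []
    for number, props in enumerate(path_objects, start=1):
        # valueMap() wraps every property value in a list.
        lines.append(f"Hop {number}: {props['code'][0]} - {props['city'][0]}")
    return lines


def print_first_path(url="ws://localhost:8182/gremlin", source="g"):
    """Run the HNL -> HOU path query remotely (needs gremlinpython and the stack)."""
    # Imported lazily so that format_hops stays usable without gremlinpython.
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.process.graph_traversal import __

    connection = DriverRemoteConnection(url, source)
    try:
        g = traversal().withRemote(connection)
        paths = (g.V().has("code", "HNL")
                 .repeat(__.out().simplePath())
                 .until(__.has("code", "HOU"))
                 .path().by(__.valueMap("code", "city"))
                 .limit(1).toList())
        for path in paths:
            print("\n".join(format_hops(path.objects)))
    finally:
        connection.close()
```

The traversal mirrors the path query from the Gremlin shell section; only the anonymous steps need the `__` prefix in Python.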

Note that the aiogremlin example is notoriously broken; that's presumably because the package lags behind the TinkerPop version quite a bit.

Cypher support

With Cypher for Gremlin (openCypher), you can query JanusGraph using the Cypher query language originating from Neo4j. This repo provides a configuration that installs the required plugins.

Cypher for Gremlin

Note that while the examples in this section work out of the box, some Java drivers will fail with serialization issues such as Encountered unregistered class ID: 65536. This happens especially in Gremlin- or Cypher-enabled applications that do not register JanusGraph's serializers, e.g. in the IntelliJ Graph Database support plugin (see this ticket). To get Cypher support working in those situations, you will need to "undo" the JanusGraph specifics by making the following changes.

In gremlin-server.yaml, replace

  • org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer with org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer, and
  • org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry with org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0.

Then, use the GraphSON v3 serializer, since both Gryo and GraphBinary will require you to register the exact types. After this, you should be good to go.
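
The two class-name replacements listed above can also be applied with a few lines of Python. This is a minimal sketch that works on the file contents as plain text; the function name is made up for illustration, and switching the serializer still has to be done by hand.

```python
# Class names exactly as listed in the two bullets above: (old, new).
REPLACEMENTS = [
    ("org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer",
     "org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer"),
    ("org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry",
     "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0"),
]


def strip_janusgraph_specifics(config_text):
    """Return the gremlin-server.yaml text with both class names swapped."""
    for old, new in REPLACEMENTS:
        config_text = config_text.replace(old, new)
    return config_text
```

To use it, read gremlin-server.yaml into a string, pass it through strip_janusgraph_specifics() and write the result back; where the file lives in your checkout is not specified here, so adjust the path accordingly.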

Using the Cypher Traversal Source

To use Cypher alongside Gremlin, connect to the Gremlin console and run:

:plugin use opencypher.gremlin
g = EmptyGraph.instance().traversal(CypherTraversalSource.class).withRemote('conf/remote-airlines.properties')

Next, run your Cypher command using g.cypher():

g.cypher('MATCH (p:airport) RETURN p.desc AS name')

You can also mix and match Cypher and Gremlin:

g.cypher('MATCH (p:airport) RETURN p').select('p').by(valueMap().select('desc').project('name')).dedup()

Or only use Gremlin:

g.V().hasLabel('airport').as('p').select('p').by(valueMap().select('desc').project('name')).dedup()

Using Cypher directly (in the Gremlin Shell)

In the Gremlin shell, you can also run Cypher queries directly. To do so, run

:plugin use opencypher.gremlin
:remote connect opencypher.gremlin conf/remote-objects.yaml translate gremlin

Alternatively, server-side translations can be used (note the :remote config alias g airlines command!):

:plugin use opencypher.gremlin
:remote connect opencypher.gremlin conf/remote-objects.yaml
:remote config alias g airlines

You can then run Cypher commands directly on the remote source:

:> MATCH (p:airport) RETURN p.desc AS name

Note that :remote console does not work in this case.

The Cypher EXPLAIN command can be used to inspect the equivalent Gremlin query:

gremlin> :> EXPLAIN MATCH (p:airport) RETURN p.desc AS name
==>[translation:g.V().hasLabel('airport').project('name').by(__.choose(__.values('desc'), __.values('desc'), __.constant('  cypher.null'))),options:[EXPLAIN]]