
linkedconnections / linked-connections-server

License: GPL-3.0
Express-based server that exposes Linked Connections.

Programming Languages

JavaScript

Projects that are alternatives to or similar to linked-connections-server

retro-gtfs
Collect real-time transit data and process it into a retroactive GTFS 'schedule' which can be used for routing/analysis
Stars: ✭ 45 (+275%)
Mutual labels:  gtfs, gtfs-realtime, public-transport
PyLD
JSON-LD processor written in Python
Stars: ✭ 413 (+3341.67%)
Mutual labels:  linked-data, semantic-web
TopicDB
TopicDB is a topic maps-based semantic graph store (using PostgreSQL for persistence)
Stars: ✭ 164 (+1266.67%)
Mutual labels:  linked-data, semantic-web
jsonld.js
A JSON-LD Processor and API implementation in JavaScript
Stars: ✭ 1,212 (+10000%)
Mutual labels:  linked-data, semantic-web
Semantic MediaWiki
🔗 Semantic MediaWiki turns MediaWiki into a knowledge management platform with query and export capabilities
Stars: ✭ 359 (+2891.67%)
Mutual labels:  linked-data, semantic-web
LDWizard
A generic framework for simplifying the creation of linked data.
Stars: ✭ 17 (+41.67%)
Mutual labels:  linked-data, semantic-web
InformationModel
The Information Model of the International Data Spaces implements the IDS reference architecture as an extensible, machine readable and technology independent data model.
Stars: ✭ 27 (+125%)
Mutual labels:  linked-data, semantic-web
LinkedDataHub
The Knowledge Graph notebook. Apache license.
Stars: ✭ 150 (+1150%)
Mutual labels:  linked-data, semantic-web
Web-Client
Generic Linked Data browser and UX component framework. Apache license.
Stars: ✭ 105 (+775%)
Mutual labels:  linked-data, semantic-web
RDFLib
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
Stars: ✭ 1,584 (+13100%)
Mutual labels:  linked-data, semantic-web
HyperGraphQL
GraphQL interface for querying and serving linked data on the Web.
Stars: ✭ 112 (+833.33%)
Mutual labels:  linked-data, semantic-web
RDF.rb
RDF.rb is a pure-Ruby library for working with Resource Description Framework (RDF) data.
Stars: ✭ 353 (+2841.67%)
Mutual labels:  linked-data, semantic-web
schema-dts
JSON-LD TypeScript types for Schema.org vocabulary
Stars: ✭ 338 (+2716.67%)
Mutual labels:  linked-data, semantic-web
Grafter
Linked Data & RDF Manufacturing Tools in Clojure
Stars: ✭ 174 (+1350%)
Mutual labels:  linked-data, semantic-web
JsonLD
JSON-LD processor for PHP
Stars: ✭ 280 (+2233.33%)
Mutual labels:  linked-data, semantic-web
dokieli
💡 dokieli is a clientside editor for decentralised article publishing, annotations and social interactions
Stars: ✭ 582 (+4750%)
Mutual labels:  linked-data, semantic-web
php-json-ld
PHP implementation of a JSON-LD Processor and API
Stars: ✭ 246 (+1950%)
Mutual labels:  linked-data, semantic-web
sparql-micro-service
SPARQL micro-services: A lightweight approach to query Web APIs with SPARQL
Stars: ✭ 22 (+83.33%)
Mutual labels:  linked-data, semantic-web
awesome-ontology
A curated list of ontology things
Stars: ✭ 73 (+508.33%)
Mutual labels:  linked-data, semantic-web
LIMES
Link Discovery Framework for Metric Spaces.
Stars: ✭ 94 (+683.33%)
Mutual labels:  linked-data, semantic-web

Linked Connections Server


Express-based Web server that exposes Linked Connections data fragments using the JSON-LD serialization format. It also provides a built-in tool to parse GTFS and GTFS Realtime transport dataset feeds into a Linked Connections Directed Acyclic Graph using GTFS2LC, and to fragment it according to a configurable predefined size.

Installation

First make sure you have Node.js 11.7.x or later installed. To install the server, proceed as follows:

git clone https://github.com/julianrojas87/linked-connections-server.git
cd linked-connections-server
npm install

Configuration

The configuration is made through two different config files. One defines the Web server parameters (server_config.json) and the other defines the different data sources that will be managed and exposed through the Linked Connections Server (datasets_config.json). Below you can find a description and an example of each config file.

Web Server configuration

As mentioned above, the Web server configuration is made using the server_config.json config file, which uses the JSON format and defines the following properties:

  • hostname: Used to define the Web server host name. This is a mandatory parameter.

  • port: TCP/IP port to be used by the Web server to receive requests. This is a mandatory parameter.

  • protocol: Used to define the protocol accepted by the Web server, which can be either HTTP or HTTPS. If both protocols are supported there is no need to define this parameter, but all requests made to the server MUST then contain the X-Forwarded-Proto header stating the protocol being used. This is useful when the server is used together with cache management servers.

  • logLevel: Used to define the logging level of the server. We use the Winston library to manage logs. If not specified, the default level is info.

This is a configuration example:

{
    "hostname": "localhost:3000",
    "port": 3000,
    "protocol": "http" // or https
    "logLevel": "info" //error, warn, info, verbose, debug, silly
}
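
When no protocol is defined (i.e., both HTTP and HTTPS are supported), every request must carry the X-Forwarded-Proto header mentioned above. A minimal check with curl, using the hypothetical companyX dataset from the examples further below:

curl -H "X-Forwarded-Proto: https" "http://localhost:3000/companyX/connections?departureTime=2017-08-11T16:45:00.000Z"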

Datasets configuration

The Web server does not provide any functionality by itself; it needs at least one dataset (in GTFS format) that can be downloaded to be processed and exposed as Linked Connections. To tell the server where to find and store such datasets, we use the datasets_config.json config file. All the parameters in this config file are mandatory, otherwise the server won't function properly. This file contains the following parameters:

  • storage: This is the path that tells the server where to store and where to look for the data fragments created from the different datasets. It should not include a trailing slash. Make sure you have enough disk space to store and process datasets.

  • sortMemory: Maximum amount of RAM that can be used by the Linked Connections sorting process. Default is 2G.

  • organization: URI and name of the data publisher.

  • keywords: Related keywords for a given dataset. E.g., types of vehicles available.

  • companyName: Name of the transport company that provides the GTFS dataset feed.

  • geographicArea: GeoNames URI that represents the geographic area served by the public transport provider.

  • downloadUrl: URL where the GTFS dataset feed can be downloaded.

  • downloadOnLaunch: Boolean parameter that indicates whether the GTFS feed is to be downloaded and processed upon server launch.

  • updatePeriod: Cron expression that defines how often the server should look for and process a new version of the dataset. We use the node-cron library for this.

  • fragmentSize: Defines the maximum number of connections per data fragment.

  • realTimeData: If available, here we define all the parameters related to a GTFS-RT feed.

    • downloadUrl: Here we define the URL to download the GTFS-RT data feed.
    • headers: Some GTFS-RT feeds require API keys to be accessed.
    • updatePeriod: Cron expression that defines how often the server should look for and process a new version of the GTFS-RT feed. We use the node-cron library for this.
    • fragmentTimeSpan: This defines the fragmentation of real-time data. It represents the time span of every fragment in seconds.
    • compressionPeriod: Cron expression that defines how often the real-time data will be compressed using gzip in order to reduce storage consumption.
    • indexStore: Indicates where the required static indexes (routes, trips, stops and stop_times) will be stored while processing GTFS-RT updates. MemStore for RAM and LevelStore for disk.
    • deduce: If the GTFS-RT feed does not provide an explicit tripId for every update, set this parameter to true, so trips can be identified using additional GTFS indexes.
  • baseURIs: Here we define the URI templates that will be used to create the unique identifiers of each of the entities found in the Linked Connections. It is necessary to define URIs for Connections, Stops, Trips and Routes. This is the only optional parameter; if it is not defined, all base URIs will follow a http://example.org/ pattern, but we recommend always using dereferenceable URIs. Follow the RFC 6570 specification to define your URIs using the column names of the routes and trips GTFS source files. See the example below.

{
    "storage": "/opt/linked-connections-data", //datasets storage path
    "sortMemory": "4G",
    "organization": {
        "id": "https://...",
        "name": "Organization name"
    },
    "datasets":[
        {
            "companyName": "companyX",
            "keywords": ["Keyword1", "Keyword2"],
            "geographicArea": "http://sws.geonames.org/...", // Geo names URI
            "downloadUrl": "https://...",
            "downloadOnLaunch": false,
            "updatePeriod": "0 0 3 * * *", //every day at 3 am
            "fragmentSize": 1000, // 1000 connections/fragment
            "realTimeData": {
                "downloadUrl": "https://...",
                "headers": { "apiKeyHttpHeader": "my_api_key" },
                "updatePeriod": "*/30 * * * * *", //every 30s
                "fragmentTimeSpan": 600, // 600 seconds
                "compressionPeriod": "0 0 3 * * *", // Every day at 3 am
                "indexStore": "MemStore", // MemStore for RAM and LevelStore for disk processing
                "deduce": true // Set true only if the GTFS-RT feed does not provide tripIds
            },
            "baseURIs": {
                "stop": "http://example.org/stops/{stop_id}",
                "route": "http://example.org/routes/{routes.route_id}",
                "trip": "http://example.org/trips/{routes.route_id}/{trips.startTime(yyyyMMdd)}",
                "connection:" 'http://example.org/connections/{routes.route_id}/{trips.startTime(yyyyMMdd)}{connection.departureStop}'
            }
        },
        {
            "companyName": "companyY",
            "keywords": ["Keyword1", "Keyword2"],
            "geographicArea": "http://sws.geonames.org/...", // Geo names URI
            "downloadUrl": "http://...",
            "downloadOnLaunch": false,
            "updatePeriod": "0 0 3 * * *", //every day at 3am
            "baseURIs": {
                "stop": "http://example.org/stops/{stop_id}",
                "route": "http://example.org/routes/{routes.route_id}",
                "trip": "http://example.org/trips/{routes.route_id}/{trips.startTime(yyyyMMdd)}",
                "connection:" 'http://example.org/connections/{routes.route_id}/{trips.startTime(yyyyMMdd)}{connection.departureStop}'
            }
        }
    ]
}

Note that for defining the URI templates you can use the connection entity, which consists of a departureStop, a departureTime, an arrivalStop and an arrivalTime. We have also noticed that using the start time of a trip (trips.startTime) is a good practice to uniquely identify trips or even connections. If using any of the time variables you can define a specific format (see here) as shown in the previous example.
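
For instance, given the templates above and a hypothetical vehicle of route 10 that starts its trip at 16:45 on 2017-08-11 from stop 8812005 (all values made up for the example), the trip and connection identifiers would expand as follows:

http://example.org/trips/10/20170811
http://example.org/connections/10/201708118812005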

Run it

Once you have properly configured the server, you can run the data fetching and the Web server separately:

cd linked-connections-server
node bin/datasets # Data fetching
node bin/web-server # Linked Connections Web server

Use it

To use it, make sure you already have at least one fully processed dataset (the logs will tell you when). If so, you can query the Linked Connections using the departure time as a parameter, for example:

http://localhost:3000/companyX/connections?departureTime=2017-08-11T16:45:00.000Z

If available, the server will redirect you to the Linked Connections fragment that contains connections with departure times as close as possible to the one requested.
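
The same query can be made programmatically. A minimal client sketch in JavaScript, assuming Node.js 18+ (for the built-in fetch API) and the hypothetical companyX dataset; run it as an ES module:

// fetch-connections.mjs
const url = 'http://localhost:3000/companyX/connections?departureTime=2017-08-11T16:45:00.000Z';
// fetch follows the 302 redirect to the closest fragment automatically
const response = await fetch(url, { headers: { accept: 'application/ld+json' } });
console.log(response.url); // URL of the fragment actually served
const fragment = await response.json();
console.log(fragment);     // JSON-LD document containing the connections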

The server also publishes the stops and routes of every defined GTFS data source:

http://localhost:3000/companyX/stops
http://localhost:3000/companyX/routes

A DCAT catalog describing all datasets of a certain company can be obtained like this:

http://localhost:3000/companyX/catalog

Historic Data

The server also allows querying historic data by means of the Memento Framework, which enables time-based content negotiation over HTTP. By using the Accept-Datetime header a client can request the state of a resource at a given moment. If such a version exists, the server will respond with a 302 Found containing the URI of the stored version of that resource. For example:

curl -v -L -H "Accept-Datetime: 2017-10-06T13:00:00.000Z" http://localhost:3000/companyX/connections?departureTime=2017-10-06T15:50:00.000Z

> GET /companyX/connections?departureTime=2017-10-06T15:50:00.000Z HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.52.1
> Accept: */*
> Accept-Datetime: 2017-10-06T13:00:00.000Z

< HTTP/1.1 302 Found
< X-Powered-By: Express
< Access-Control-Allow-Origin: *
< Location: /memento/companyX?version=2017-10-28T03:07:47.000Z&departureTime=2017-10-06T15:50:00.000Z
< Vary: Accept-Encoding, Accept-Datetime
< Link: <http://localhost:3000/companyX/connections?departureTime=2017-10-06T15:50:00.000Z>; rel="original timegate"
< Date: Mon, 13 Nov 2017 15:00:36 GMT
< Connection: keep-alive
< Content-Length: 0

> GET /memento/companyX?version=2017-10-28T03:07:47.000Z&departureTime=2017-10-06T15:50:00.000Z HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.52.1
> Accept: */*
> Accept-Datetime: 2017-10-06T13:00:00.000Z

< HTTP/1.1 200 OK
< X-Powered-By: Express
< Memento-Datetime: Fri, 06 Oct 2017 13:00:00 GMT
< Link: <http://localhost:3000/companyX/connections?departureTime=2017-10-06T15:50:00.000Z>; rel="original timegate"
< Access-Control-Allow-Origin: *
< Content-Type: application/ld+json; charset=utf-8
< Content-Length: 289915
< ETag: W/"46c7b-TOdDIcDjCvUXTC/gzqr5hxVDZjg"
< Date: Mon, 13 Nov 2017 15:00:36 GMT
< Connection: keep-alive

The previous example shows a request made to obtain the Connections fragment identified by the URL http://localhost:3000/companyX/connections?departureTime=2017-10-06T15:50:00.000Z, but specifically the state of this fragment as it was at Accept-Datetime: 2017-10-06T13:00:00.000Z. This means it is possible to know what the state of the delays was at 13:00 for the departures at 15:50 on 2017-10-06.
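
The same historic query from JavaScript (again a sketch assuming Node.js 18+ and the hypothetical companyX endpoint; fetch follows the 302 to the stored version automatically):

// fetch-memento.mjs
const response = await fetch(
  'http://localhost:3000/companyX/connections?departureTime=2017-10-06T15:50:00.000Z',
  { headers: { 'accept-datetime': '2017-10-06T13:00:00.000Z' } }
);
console.log(response.headers.get('memento-datetime')); // timestamp of the stored version
const fragment = await response.json();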

Authors

Julian Rojas - [email protected]
Pieter Colpaert - [email protected]
