sirixdb / Sirix

Licence: bsd-3-clause

SirixDB is a temporal, evolutionary database system that uses an accumulate-only approach. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach called sliding snapshot.

Programming Languages

java

68154 projects - #9 most used programming language

kotlin

9241 projects

Labels

hacktoberfest json xml storage coroutines diff versioning ssd snapshot hashing vertx xpath comparison diffing

Projects that are alternatives of or similar to Sirix

Acl

Server framework and network components written by C/C++ for Linux, Mac, FreeBSD, Solaris(x86), Windows, Android, IOS

Stars: ✭ 2,113 (+231.19%)

Mutual labels: coroutines, json, xml

Horaires Ratp Api

Web service for real-time RATP schedules and traffic

Stars: ✭ 232 (-63.64%)

Mutual labels: hacktoberfest, json, xml

Cfgdiff

diff(1) all your configs

Stars: ✭ 138 (-78.37%)

Mutual labels: json, xml, diff

Internettools

XPath/XQuery 3.1 interpreter for Pascal with compatibility modes for XPath 2.0/XQuery 1.0/3.0, custom and JSONiq extensions, XML/HTML parsers and classes for HTTP/S requests

Stars: ✭ 82 (-87.15%)

Mutual labels: json, xml, xpath

Tbox

🎁 A glib-like multi-platform c library

Stars: ✭ 3,800 (+495.61%)

Mutual labels: coroutines, json, xml

Json Git

A pure JS local Git to versionize any JSON

Stars: ✭ 109 (-82.92%)

Mutual labels: versioning, json, diff

Configurate

A simple configuration library for Java applications providing a node structure, a variety of formats, and tools for transformation

Stars: ✭ 148 (-76.8%)

Mutual labels: hacktoberfest, json, xml

Camaro

camaro is an utility to transform XML to JSON, using Node.js binding to native XML parser pugixml, one of the fastest XML parser around.

Stars: ✭ 438 (-31.35%)

Mutual labels: json, xml, xpath

Xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Stars: ✭ 335 (-47.49%)

Mutual labels: json, xml, xpath

Korio

Korio: Kotlin cORoutines I/O : Virtual File System + Async/Sync Streams + Async TCP Client/Server + WebSockets for Multiplatform Kotlin 1.3

Stars: ✭ 282 (-55.8%)

Mutual labels: coroutines, json, xml

Jsondiffpatch

Diff & patch JavaScript objects

Stars: ✭ 3,951 (+519.28%)

Mutual labels: json, diff, diffing

Parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Stars: ✭ 628 (-1.57%)

Mutual labels: hacktoberfest, xml, xpath

Iguana

universal serialization engine

Stars: ✭ 481 (-24.61%)

Mutual labels: json, xml

Folder Explorer

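Analyzes file directories, collects statistics, and displays the results as tree structures and charts; results can also be exported in multiple formats for archiving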

Stars: ✭ 479 (-24.92%)

Mutual labels: json, xml

Fsharp.data

F# Data: Library for Data Access

Stars: ✭ 631 (-1.1%)

Mutual labels: json, xml

Simd Json

Rust port of simdjson

Stars: ✭ 499 (-21.79%)

Mutual labels: hacktoberfest, json

Vue Ls

💥 Vue plugin for work with local storage, session storage and memory storage from Vue context

Stars: ✭ 468 (-26.65%)

Mutual labels: json, storage

Log4rs

A highly configurable logging framework for Rust

Stars: ✭ 483 (-24.29%)

Mutual labels: json, xml

Basex

BaseX Main Repository.

Stars: ✭ 515 (-19.28%)

Mutual labels: xml, xpath

Name That Hash

🔗 Don't know what type of hash it is? Name That Hash will name that hash type! 🤖 Identify MD5, SHA256 and 3000+ other hashes ☄ Comes with a neat web app 🔥

Stars: ✭ 540 (-15.36%)

Mutual labels: hacktoberfest, hashing


An Evolutionary, Accumulate-Only Database System

Stores small-sized, immutable snapshots of your data and facilitates querying the full history

Download ZIP | Join us on Slack | Community Forum

Working on your first Pull Request? You can learn how from this free series How to Contribute to an Open Source Project on GitHub and another tutorial: How YOU can contribute to OSS, a beginners guide

"Remember that you're lucky, even if you don't think you are, because there's always something that you can be thankful for." - Esther Grace Earl (http://tswgo.org)

SirixDB uses a huge persistent (in the functional sense) tree of tries, wherein the committed snapshots share unchanged pages and even common records in changed pages. The system only stores page-fragments instead of full pages during a commit to reduce write-amplification. During read operations, the system reads the page-fragments in parallel to reconstruct an in-memory page.
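The idea of rebuilding an in-memory page from page-fragments can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not SirixDB's implementation: the class `PageFragmentSketch`, the `Fragment` type, and the slot-to-value map are hypothetical names invented for this example. The sketch also reads fragments sequentially and ignores deletions, whereas the text above describes parallel reads.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: none of these types belong to the SirixDB API.
final class PageFragmentSketch {

    /** A fragment holds only the record slots that changed in one revision. */
    static final class Fragment {
        final int revision;
        final Map<Integer, String> changedSlots;

        Fragment(int revision, Map<Integer, String> changedSlots) {
            this.revision = revision;
            this.changedSlots = changedSlots;
        }
    }

    /**
     * Rebuilds a full page from fragments ordered newest first.
     * A slot is taken from the most recent fragment that contains it;
     * older values for the same slot are ignored.
     */
    static Map<Integer, String> reconstruct(List<Fragment> newestFirst) {
        Map<Integer, String> page = new TreeMap<>();
        for (Fragment fragment : newestFirst) {
            for (Map.Entry<Integer, String> slot : fragment.changedSlots.entrySet()) {
                page.putIfAbsent(slot.getKey(), slot.getValue());
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Fragment> fragments = List.of(
            new Fragment(3, Map.of(1, "b2")),            // revision 3 changed slot 1
            new Fragment(2, Map.of(2, "c1")),            // revision 2 added slot 2
            new Fragment(1, Map.of(0, "a0", 1, "b0")));  // revision 1 wrote slots 0 and 1
        System.out.println(reconstruct(fragments)); // prints {0=a0, 1=b2, 2=c1}
    }
}
```

Each commit in this model writes only the changed slots, which is what keeps write-amplification low; the price is that a read has to merge several fragments.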

SirixDB currently supports the storage and (time travel) querying of both XML and JSON data in our binary encoding, which is tailored to support versioning. The index structures and the whole storage engine have been written from scratch to support versioning natively. We might also implement the storage and querying of other data formats, such as relational data.

Note: Work on a Frontend built with Svelte, D3.js, and Typescript has just begun

Discuss it in the Community Forum

Keeping All Versions of Your Data By Sharing Structure
SirixDB Features
- Design Goals
- Revision Histories
Getting Started
Getting Help
- Community Forum
- Join us on Slack
Contributors
License

Keeping All Versions of Your Data By Sharing Structure

We could write a lot about why it is often valuable to keep all states of your data in a storage system. However, we recently came across an excellent blog post that explains the advantages of keeping historical data very well. In a nutshell, it is all about looking at the evolution of your data, finding trends, doing audits, and implementing efficient undo/redo operations. The Wikipedia page has a bunch of examples. We recently also added use cases over here.

Our firm belief is that a temporal storage system must address the issues that arise from keeping past states far better than traditional approaches do. Storing time-varying, temporal data in database systems that do not support it natively usually results in many unwanted hurdles: storage space is wasted, queries that retrieve past states perform poorly, and temporal operations are usually missing altogether.

The DBS must store data in a way that storage space is used as effectively as possible while supporting the reconstruction of each revision, as the database saw it during the commits. All this should be handled in linear time, whether it's the first revision or the most recent revision. Ideally, query time of old/past revisions and the most recent revision should be in the same runtime complexity (logarithmic when querying for specific records).

SirixDB not only supports snapshot-based versioning on a record granular level through a novel versioning algorithm called sliding snapshot, but also time travel queries, efficient diffing between revisions and the storage of semi-structured data to name a few.

Executing the following time-travel query on our binary JSON representation of Twitter sample data gives an initial impression of the possibilities:

let $statuses := jn:open('mycol.jn','mydoc.jn', xs:dateTime('2019-04-13T16:24:27Z'))=>statuses
let $foundStatus := for $status in $statuses
  let $dateTimeCreated := xs:dateTime($status=>created_at)
  where $dateTimeCreated > xs:dateTime("2018-02-01T00:00:00") and not(exists(jn:previous($status)))
  order by $dateTimeCreated
  return $status
return {"revision": sdb:revision($foundStatus), $foundStatus{text}}

The query opens a database/resource in a specific revision based on a timestamp (2019-04-13T16:24:27Z) and searches for all statuses whose created_at timestamp is later than the 1st of February 2018 and which did not exist in the previous revision. => is a dereferencing operator used to dereference keys in JSON objects. Array values can be accessed with the function bit:array-values or by specifying an index, starting with zero: array[[0]], for instance, specifies the first value of the array.

SirixDB Features

SirixDB is a log-structured, temporal NoSQL document store, which stores evolutionary data. It never overwrites any data on-disk. Thus, we're able to restore and query the full revision history of a resource in the database.

Design Goals

Some of the most important core principles and design goals are:

Embeddable: Similar to SQLite and DuckDB, SirixDB is embeddable at its core. Other APIs, such as the non-blocking REST-API, are built on top.
Minimize Storage Overhead: SirixDB shares unchanged data pages as well as records between revisions, depending on the versioning algorithm chosen during the initial bootstrapping of a resource. SirixDB aims to balance read and write performance in its default configuration.
Concurrent: SirixDB contains very few locks and aims to be as suitable for multithreaded systems as possible.
Asynchronous: Operations can happen independently; each transaction is bound to a specific revision, and only one read/write transaction on a resource is permitted concurrently with N read-only transactions.
Versioning/Revision history: SirixDB stores a revision history of every resource in the database without imposing extra overhead. It uses a huge persistent, durable page-tree for indexing revisions and data.
Data integrity: SirixDB, like ZFS, stores full checksums of the pages in the parent pages. That means that almost all data corruption can be detected upon reading. In the future, we aim to partition and replicate databases.
Copy-on-write semantics: Similarly to the file systems Btrfs and ZFS, SirixDB uses CoW semantics, meaning that SirixDB never overwrites data. Instead, database-page fragments are copied/written to a new location.
Per-revision and per-page versioning: SirixDB versions not only per revision but also per page. Thus, whenever we change a potentially small fraction of records in a data page, SirixDB does not have to copy the whole page and write it to a new location on a disk or flash drive. Instead, we can specify one of several versioning strategies known from backup systems, or a novel sliding snapshot algorithm, during the creation of a database resource. SirixDB uses the versioning type we specify to version data pages.
Guaranteed atomicity and consistency (without a WAL): The system will never enter an inconsistent state (unless there is a hardware failure), meaning that an unexpected power-off won't ever damage the system. This is accomplished without the overhead of a write-ahead log (WAL).
Log-structured and SSD friendly: SirixDB batches writes and syncs everything sequentially to a flash drive during commits. It never overwrites committed data.
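The data-integrity goal above (checksums of pages stored in their parent pages) can be sketched as follows. This is a simplified illustration, not SirixDB code: the class `ParentChecksumSketch` and the `ChildRef` type are hypothetical, and CRC32 is used only because it ships with the JDK; the source does not say which hash function SirixDB actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only: not the SirixDB page layout or API.
final class ParentChecksumSketch {

    /** A parent's reference to a child page, including the child's expected checksum. */
    static final class ChildRef {
        final byte[] storedBytes;    // stands in for the child page as stored on disk
        final long expectedChecksum; // kept in the parent, not next to the child data

        ChildRef(byte[] pageBytes) {
            this.storedBytes = pageBytes.clone();
            this.expectedChecksum = checksum(pageBytes);
        }
    }

    /** CRC32 is an assumption made for this sketch. */
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Recomputes the child's checksum on read and compares it with the parent's copy. */
    static boolean verify(ChildRef ref) {
        return checksum(ref.storedBytes) == ref.expectedChecksum;
    }

    public static void main(String[] args) {
        ChildRef ref = new ChildRef("leaf page".getBytes(StandardCharsets.UTF_8));
        System.out.println(verify(ref)); // prints true
        ref.storedBytes[0] ^= 0x01;      // simulate a flipped bit on the storage device
        System.out.println(verify(ref)); // prints false
    }
}
```

Because the expected value lives in the parent rather than beside the data it protects, a corrupted child cannot silently carry a matching checksum with it.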

Revision Histories

Keeping the revision history is one of the main features in SirixDB. You can revert any revision into an earlier version or back up the system automatically without the overhead of copying. SirixDB only ever copies changed database-pages and, depending on the versioning algorithm you chose during the creation of a database/resource, only page-fragments, and ancestor index-pages to create a new revision.
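Copying only the changed pages and their ancestors is the classic path-copying technique for persistent trees. The sketch below shows it on a tiny fixed-depth binary trie; it is an assumption-laden illustration, not SirixDB's page-tree. `PathCopySketch`, `Node`, and `DEPTH` are hypothetical names, and string values stand in for pages.

```java
// Hypothetical illustration only: a tiny persistent trie, not the SirixDB page-tree.
final class PathCopySketch {

    /** Immutable node: inner nodes use left/right, leaves use value. */
    static final class Node {
        final Node left;
        final Node right;
        final String value;

        Node(Node left, Node right, String value) {
            this.left = left;
            this.right = right;
            this.value = value;
        }
    }

    static final int DEPTH = 3; // 2^3 = 8 addressable leaves

    /** Returns a new root; only the nodes on the path to the changed leaf are copied. */
    static Node set(Node root, int index, String value) {
        return set(root, index, value, DEPTH);
    }

    private static Node set(Node node, int index, String value, int level) {
        if (level == 0) {
            return new Node(null, null, value);
        }
        Node left = node == null ? null : node.left;
        Node right = node == null ? null : node.right;
        boolean goRight = ((index >> (level - 1)) & 1) == 1;
        return goRight
            ? new Node(left, set(right, index, value, level - 1), null)  // left subtree is shared
            : new Node(set(left, index, value, level - 1), right, null); // right subtree is shared
    }

    /** Reads the leaf at index from the given root, or null if it was never written. */
    static String get(Node root, int index) {
        Node node = root;
        for (int level = DEPTH; level > 0 && node != null; level--) {
            node = ((index >> (level - 1)) & 1) == 1 ? node.right : node.left;
        }
        return node == null ? null : node.value;
    }

    public static void main(String[] args) {
        Node revision1 = set(set(null, 0, "a"), 7, "z");
        Node revision2 = set(revision1, 7, "y");         // new revision, old one untouched
        System.out.println(get(revision1, 7));           // prints z
        System.out.println(get(revision2, 7));           // prints y
        System.out.println(revision1.left == revision2.left); // prints true (shared subtree)
    }
}
```

Both roots stay valid after the update, so an old revision can be read, or reverted to, without copying anything.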

You can reconstruct every revision in O(n), where n denotes the number of nodes in the revision. Binary search is used on an in-memory (linked) map to load the revision, thus finding the revision root page has an asymptotic runtime complexity of O(log n), where n, in this case, is the number of stored revisions.
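Looking up the revision that was current at a given point in time in O(log n) can be illustrated with a sorted map. This is a sketch under assumptions, not the data structure SirixDB uses: `RevisionLookupSketch` and its methods are hypothetical, and `java.util.TreeMap` (a red-black tree) replaces the binary search over an in-memory map described above while giving the same logarithmic bound.

```java
import java.time.Instant;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not the SirixDB revision index.
final class RevisionLookupSketch {

    // Commit timestamp -> revision number, kept sorted by timestamp.
    private final TreeMap<Instant, Integer> revisionsByCommitTime = new TreeMap<>();

    void recordCommit(Instant commitTime, int revision) {
        revisionsByCommitTime.put(commitTime, revision);
    }

    /**
     * Returns the latest revision committed at or before pointInTime,
     * or -1 if nothing had been committed yet. Runs in O(log n) for n revisions.
     */
    int revisionAt(Instant pointInTime) {
        Map.Entry<Instant, Integer> entry = revisionsByCommitTime.floorEntry(pointInTime);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        RevisionLookupSketch history = new RevisionLookupSketch();
        history.recordCommit(Instant.parse("2019-04-01T00:00:00Z"), 1);
        history.recordCommit(Instant.parse("2019-04-10T00:00:00Z"), 2);
        history.recordCommit(Instant.parse("2019-04-20T00:00:00Z"), 3);

        System.out.println(history.revisionAt(Instant.parse("2019-04-13T16:24:27Z"))); // prints 2
        System.out.println(history.revisionAt(Instant.parse("2019-01-01T00:00:00Z"))); // prints -1
    }
}
```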

Currently, SirixDB offers two built-in native data models, namely a binary XML store and a JSON store.

Articles published on Medium:

Getting started

Download ZIP or Git Clone

git clone https://github.com/sirixdb/sirix.git

or use the following dependencies in your Maven or Gradle project.

SirixDB uses Java 15, thus you need an up-to-date Gradle (if you want to work on SirixDB) and IntelliJ or Eclipse.

Maven artifacts

At this stage of development, you should use the latest SNAPSHOT artifacts from the OSS snapshot repository to get the most recent changes.

Just add the following repository section to your POM or build.gradle file:

<repository>
  <id>sonatype-nexus-snapshots</id>
  <name>Sonatype Nexus Snapshots</name>
  <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  <releases>
    <enabled>false</enabled>
  </releases>
  <snapshots>
    <enabled>true</enabled>
  </snapshots>
</repository>

repositories {
    maven {
        url "https://oss.sonatype.org/content/repositories/snapshots/"
        mavenContent {
            snapshotsOnly()
        }
    }
}

Note that we changed the groupId from com.github.sirixdb.sirix to io.sirix. Most recent version is 0.9.6-SNAPSHOT.

Maven artifacts are deployed to the Maven Central repository (however, please use the SNAPSHOT variants as of now). Currently, the following artifacts are available:

Core project:

<dependency>
  <groupId>io.sirix</groupId>
  <artifactId>sirix-core</artifactId>
  <version>0.9.6-SNAPSHOT</version>
</dependency>

compile group:'io.sirix', name:'sirix-core', version:'0.9.6-SNAPSHOT'

Brackit binding:

<dependency>
  <groupId>io.sirix</groupId>
  <artifactId>sirix-xquery</artifactId>
  <version>0.9.6-SNAPSHOT</version>
</dependency>

compile group:'io.sirix', name:'sirix-xquery', version:'0.9.6-SNAPSHOT'

Asynchronous, RESTful API with Vert.x, Kotlin and Keycloak (the latter for authentication via OAuth2/OpenID-Connect):

<dependency>
  <groupId>io.sirix</groupId>
  <artifactId>sirix-rest-api</artifactId>
  <version>0.9.6-SNAPSHOT</version>
</dependency>

compile group: 'io.sirix', name: 'sirix-rest-api', version: '0.9.6-SNAPSHOT'

Other modules are currently not available (namely the GUI, the distributed package as well as an outdated Saxon binding).

Setup of the SirixDB HTTP-Server and Keycloak to use the REST-API

The REST-API is asynchronous at its very core. We use Vert.x, which is a toolkit built on top of Netty. It is heavily inspired by Node.js but runs on the JVM. As such, it uses one or more event loops, that is, threads that should never be blocked by long-running CPU tasks or disk-bound I/O. We are using Kotlin with coroutines to keep the code simple. SirixDB uses OAuth2 (Password Credentials/Resource Owner Flow) with a Keycloak authorization server instance.

Start Docker Keycloak-Container using docker-compose

For setting up the SirixDB HTTP-Server and a basic Keycloak-instance with a test realm:

git clone https://github.com/sirixdb/sirix.git
sudo docker-compose up keycloak

Keycloak setup

You can set up Keycloak as described in this excellent tutorial. Our docker-compose file imports a sirix realm with a default admin user with all available roles assigned. You can skip steps 3 - 7 and 10, 11, and simply recreate a client-secret and change oAuthFlowType to "PASSWORD". If you want to run or modify the integration tests, the client secret must not be changed. Make sure to delete the line "build: ." in the docker-compose.yml file for the server image if you want to use the Docker Hub image.

Open your browser. URL: http://localhost:8080
Login with username "admin", password "admin"
Create a new realm with the name "sirixdb"
Go to Clients => account
Change client-id to "sirix"
Make sure access-type is set to confidential
Go to Credentials tab
Put the client secret into the SirixDB HTTP-Server configuration file. Change the value of "client.secret" to whatever Keycloak set up.
If "oAuthFlowType" is specified in the same configuration file, change the value to "PASSWORD" (if it is not specified, the default is "PASSWORD").
Regarding Keycloak the direct access grant on the settings tab must be enabled.
Our (user-/group-)roles are "create" to allow creating databases/resources, "view" to allow to query database resources, "modify" to modify a database resource and "delete" to allow deletion thereof. You can also assign ${databaseName}- prefixed roles.
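With the client and roles configured as above, a client obtains a token through the OAuth2 password grant. The sketch below only builds the HTTP request and does not send it. Several things are assumptions: the class `TokenRequestSketch` is hypothetical, the token endpoint path follows Keycloak's usual `/realms/<realm>/protocol/openid-connect/token` layout (with an `/auth` prefix on older Keycloak versions) and must be checked against your installation, and the secret shown is a placeholder. The form field names are the standard ones for the OAuth2 resource owner password credentials grant.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only: builds, but does not send, a password-grant token request.
final class TokenRequestSketch {

    /** Encodes key/value pairs as application/x-www-form-urlencoded. */
    static String formBody(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    /** Builds a POST request for the OAuth2 resource owner password credentials grant. */
    static HttpRequest passwordGrant(String tokenEndpoint, String clientId, String clientSecret,
                                     String username, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("grant_type", "password");
        fields.put("client_id", clientId);
        fields.put("client_secret", clientSecret);
        fields.put("username", username);
        fields.put("password", password);

        return HttpRequest.newBuilder(URI.create(tokenEndpoint))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(fields)))
            .build();
    }

    public static void main(String[] args) {
        // Assumed endpoint; the realm name and path prefix depend on your Keycloak setup.
        String endpoint = "http://localhost:8080/auth/realms/sirixdb/protocol/openid-connect/token";
        HttpRequest request = passwordGrant(endpoint, "sirix", "<client-secret>", "admin", "admin");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `java.net.http.HttpClient` should return a JSON document containing an access token, provided the direct access grant is enabled for the client.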

Start the SirixDB HTTP-Server and the Keycloak-Container using docker-compose

The following command starts the Docker containers:

sudo docker-compose up

SirixDB HTTP-Server Setup Without Docker/docker-compose

To create a fat-JAR, download our ZIP-file for instance, then run:

cd bundles/sirix-rest-api
gradle build -x test

And a fat-JAR with all required dependencies should have been created in your target folder.

Furthermore, a key.pem and a cert.pem file are needed. These two files have to be in your user home directory in a directory called "sirix-data", where Sirix stores the databases. For demo purposes they can be copied from our resources directory.

Once Keycloak is also set up, we can start the server via:

java -jar -Duser.home=/opt/sirix sirix-rest-api-*-SNAPSHOT-fat.jar -conf sirix-conf.json -cp /opt/sirix/*

The -Duser.home option in this command changes your user home directory, here to /opt/sirix as an example.

The fat-JAR in the future will be downloadable from the maven repository.

Run the Integration Tests

In order to run the integration tests under bundles/sirix-rest-api/src/test/kotlin make sure that you assign your admin user all the user-roles you have created in the Keycloak setup (last step). Make sure that Keycloak is running first and execute the tests in your favorite IDE for instance.

Note that the following VM-parameters currently are needed: -ea --enable-preview --add-modules=jdk.incubator.foreign

Command-line tool

We ship a (very) simple command-line tool for the sirix-xquery bundle:

Get the latest sirix-xquery JAR with dependencies.

Documentation

We are currently working on the documentation. You may find first drafts and snippets in the documentation and in this README. Furthermore, you are kindly invited to ask any question you might have (and you likely have many questions) in the community forum (preferred) or in the Slack channel. Please also have a look at and play with our sirix-example bundle which is available via maven or our new asynchronous RESTful API (shown next).

Getting Help

Community Forum

If you have any questions or are considering contributing to or using Sirix, please use the Community Forum. Any kind of question is welcome, be it an API question, an enhancement proposal, or a question regarding use cases. Don't hesitate to ask questions or make suggestions for improvements. At the moment, API-related suggestions and criticism are also of utmost importance.

Join us on Slack

You may find us on Slack for quick questions.

Contributors ✨

SirixDB is maintained by

Johannes Lichtenberger

And the Open Source Community.

As the project was forked from a university project called Treetank, my deepest gratitude goes to Marc Kramis, who came up with the idea of building a versioned, secure and energy-efficient data store that retains the history of resources, for his Ph.D. Furthermore, Sebastian Graf came up with a lot of ideas and greatly improved the implementation for his Ph.D. Besides, a lot of students worked on and improved the project considerably.

Thanks go to these wonderful people, who greatly improved SirixDB lately. SirixDB couldn't exist without the help of the Open Source community:

Ilias YAHIA 💻	BirokratskaZila 📖	Andrei Buiza 💻	Bondar Dmytro 💻	santoshkumarkannur 📖	Lars Eckart 💻	Jayadeep K M 📆
Keith Kim 🎨	Theofanis Despoudis 📖	Mario Iglesias Alarcón 🎨	Antonio Nuno Monteiro 📆	Fulton Browne 📖	Felix Rabe 📖	Ethan Willis 📖
Erik Axelsson 💻	Sérgio Batista 📖	chaensel 📖	Balaji Vijayakumar 💻	Fernanda Campos 💻	Joel Lau 💻	add09 💻
Emil Gedda 💻	Andreas Rohlén 💻	Marcin Bielecki 💻	Manfred Nentwig 💻	Raj 💻	Moshe Uminer 💻

Contributions of any kind are highly welcome!

License

This work is released under the BSD 3-clause license.


Stars: ✭ 638
