Inshop CRM / ERP API. Its powerful framework makes it possible to build business systems with different workflows. It includes multi-language support, client management, projects and tasks, documents, simple accounting, inventory management, order and invoice management, third-party software integration, a REST API, and many other features.

Stars: ✭ 178 (-13.17%)

Mutual labels: postgresql, elasticsearch

Steampipe

Steampipe command line interface (CLI)

Stars: ✭ 200 (-2.44%)

Mutual labels: sql, postgresql

Amazonriver

amazonriver is a service that syncs real-time data from PostgreSQL to Elasticsearch or Kafka

Stars: ✭ 198 (-3.41%)

Mutual labels: postgresql, elasticsearch

Sql Battleships

Play Battleships on PostgreSQL

Stars: ✭ 174 (-15.12%)

Mutual labels: sql, postgresql

Supra Api Nodejs

❤️ Node.js REST API boilerplate

Stars: ✭ 182 (-11.22%)

Mutual labels: sql, postgresql

Usaspending Api

Server application to serve U.S. federal spending data via a RESTful API

Stars: ✭ 166 (-19.02%)

Mutual labels: postgresql, elasticsearch

Sqlingvo

A Clojure & ClojureScript DSL for SQL

Stars: ✭ 200 (-2.44%)

Mutual labels: sql, postgresql

Firecamp

Serverless Platform for the stateful services

Stars: ✭ 194 (-5.37%)

Mutual labels: postgresql, elasticsearch

Sql exporter

Flexible SQL Exporter for Prometheus

Stars: ✭ 194 (-5.37%)

Mutual labels: sql, postgresql

Linq2db

Linq to database provider.

Stars: ✭ 2,211 (+978.54%)

Mutual labels: sql, postgresql

Rom Sql

SQL support for rom-rb

Stars: ✭ 169 (-17.56%)

Mutual labels: sql, postgresql

Xsql

Unified SQL Analytics Engine Based on SparkSQL

Stars: ✭ 176 (-14.15%)

Mutual labels: sql, elasticsearch

Sqlcheck

Automatically identify anti-patterns in SQL queries

Stars: ✭ 2,062 (+905.85%)

Mutual labels: sql, postgresql

Nut

Advanced, Powerful and easy to use ORM for Qt

Stars: ✭ 181 (-11.71%)

Mutual labels: sql, postgresql

Npgsql

Npgsql is the .NET data provider for PostgreSQL.

Stars: ✭ 2,415 (+1078.05%)

Mutual labels: sql, postgresql

Pifpaf

Python fixtures and daemon managing tools for functional testing

Stars: ✭ 161 (-21.46%)

Mutual labels: postgresql, elasticsearch

Neo4j Etl

Data import from relational databases to Neo4j.

Stars: ✭ 165 (-19.51%)

Mutual labels: sql, postgresql

PGSync

PostgreSQL to Elasticsearch sync

PGSync is a middleware for syncing data from Postgres to Elasticsearch effortlessly. It allows you to keep Postgres as your source of truth and expose structured denormalized documents in Elasticsearch.

Changes to nested entities are propagated to Elasticsearch. PGSync's advanced query builder then generates optimized SQL queries on the fly based on your schema. PGSync's advisory model allows you to move and transform large volumes of data quickly whilst maintaining relational integrity.

Simply describe your document structure or schema in JSON and PGSync will continuously capture changes in your data and load it into Elasticsearch without writing any code. PGSync transforms your relational data into a structured document format.

It allows you to take advantage of the expressive power and scalability of Elasticsearch directly from Postgres. You don't have to write complex queries and transformation pipelines. PGSync is lightweight, flexible and fast.

Elasticsearch is better suited as a secondary denormalized search engine to accompany a more traditional normalized datastore. Moreover, you shouldn't store your primary data in Elasticsearch.

So how do you then get your data into Elasticsearch in the first place? Tools like Logstash and Kafka can aid this task but they still require a bit of engineering and development.

Extract, Transform, Load (ETL) and Change Data Capture (CDC) tools can be complex and require expensive engineering effort.

Other benefits of PGSync include:

Real-time analytics
Reliable primary datastore/source of truth
Scale on-demand
Easily join multiple nested tables

PGSync Architecture:

Why?

At a high level, you have data in a Postgres database and you want to mirror it in Elasticsearch.
This means every change to your data (Insert, Update, Delete and Truncate statements) needs to be replicated to Elasticsearch. At first this seems easy, and then it's not. The naive approach is to add some code that copies the data to Elasticsearch after updating the database (so-called dual writes), but the two stores can drift apart if either write fails. Writing SQL queries that span multiple tables and involve multiple relationships is hard. Detecting changes within a nested document can also be quite hard. Of course, if your data never changed, you could just take a snapshot in time and load it into Elasticsearch as a one-off operation.
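
One way to see why dual writes are fragile is a small in-memory simulation. This is only a sketch: the dictionary standing in for Postgres and the `FlakyIndex` class standing in for Elasticsearch are hypothetical and are not part of PGSync.

```python
class FlakyIndex:
    """Hypothetical stand-in for a search index whose writes can fail."""

    def __init__(self):
        self.docs = {}
        self.fail_next = False

    def index(self, key, doc):
        if self.fail_next:
            self.fail_next = False
            raise ConnectionError("search index unavailable")
        self.docs[key] = doc


def dual_write(database, search_index, key, row):
    """Naive dual write: store the row, then copy it to the index."""
    database[key] = row            # first write succeeds
    search_index.index(key, row)   # second write can fail independently


database = {}  # stand-in for the primary datastore
search_index = FlakyIndex()

dual_write(database, search_index, "9781471331435", {"title": "1984"})

# Simulate an outage during the second write of an update.
search_index.fail_next = True
try:
    dual_write(database, search_index, "9781471331435",
               {"title": "Nineteen Eighty-Four"})
except ConnectionError:
    pass  # the database write has already happened

in_sync = database == search_index.docs
print(in_sync)  # False: the index still holds the old title
```

Capturing changes from the database itself, rather than from application code, avoids this class of inconsistency.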

PGSync is appropriate for you if:

Postgres is your read/write source of truth whilst Elasticsearch is your read-only search layer.
You need to denormalize relational data into a NoSQL data source.
Your data is constantly changing.
You have existing data in a relational database such as Postgres and you need a secondary NoSQL database like Elasticsearch for text-based queries or autocomplete queries to mirror the existing data without having your application perform dual writes.
You want to keep your existing data untouched whilst taking advantage of the search capabilities of Elasticsearch by exposing a view of your data without compromising the security of your relational data.
Or you simply want to expose a view of your relational data for search purposes.

How it works

PGSync is written in Python (supporting version 3.6 onwards) and the stack is composed of: Redis, Elasticsearch, Postgres, and SQLAlchemy.

PGSync leverages the logical decoding feature of Postgres (introduced in PostgreSQL 9.4) to capture a continuous stream of change events. This feature needs to be enabled by setting the following in your postgresql.conf file:

> wal_level = logical

You can select any pivot table to be the root of your document.

PGSync's query builder builds advanced queries dynamically against your schema.

PGSync operates in an event-driven model by creating triggers for tables in your database to handle notification events.

This is the only time PGSync will ever make any changes to your database.

NOTE: If you change the structure of your PGSync schema config, you will need to rebuild your Elasticsearch indices. There are plans to support zero-downtime migrations to streamline this process.

Quickstart

There are several ways of installing and trying PGSync:

Running in Docker (the easiest way to get up and running)
Manual configuration

Running in Docker

To start up all services with Docker, run:

$ docker-compose up

Show the content in Elasticsearch

$ curl -X GET http://[elasticsearch host]:9201/reservations/_search?pretty=true

Manual configuration

Setup
- Ensure the database user is a superuser.
- Enable logical decoding. You also need to set at least these two parameters in postgresql.conf:
  
  wal_level = logical
  
  max_replication_slots = 1

Installation
- Install PGSync: $ pip install pgsync
- Create a schema.json for your document representation.
- Bootstrap the database (one time only): bootstrap --config schema.json
- Run the program with pgsync --config schema.json, or as a daemon with pgsync --config schema.json -d
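
Before bootstrapping, it can help to confirm that the two settings from the Setup step are present. The following is a minimal sketch, not a PGSync feature: `parse_conf` and `check_logical_decoding` are hypothetical helpers that only inspect the text passed to them. They handle simple `name = value` lines, and they treat a setting that is absent from the text as not configured, even though the server may apply its own default.

```python
def parse_conf(text):
    """Parse simple `name = value` lines from postgresql.conf-style text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip().lower()] = value.strip().strip("'")
    return settings


def check_logical_decoding(text):
    """Return a list of problems found in the given configuration text."""
    settings = parse_conf(text)
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    slots = settings.get("max_replication_slots", "0")
    if not slots.isdigit() or int(slots) < 1:
        problems.append("max_replication_slots must be at least 1")
    return problems


good_conf = "wal_level = logical\nmax_replication_slots = 1\n"
bad_conf = "wal_level = replica  # not enough for logical decoding\n"

print(check_logical_decoding(good_conf))  # []
print(check_logical_decoding(bad_conf))
```

An empty list means both settings were found with acceptable values in the text that was checked.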

Features

Key features of PGSync are:

Easily denormalize relational data.
Works with any PostgreSQL database (version 9.4 or later).
Negligible impact on database performance.
Transactionally consistent output in Elasticsearch. This means that writes appear only when they are committed to the database, and insert, update and delete operations appear in the same order as they were committed (as opposed to eventual consistency).
Fault-tolerant: does not lose data, even if processes crash or a network interruption occurs, etc. The process can be recovered from the last checkpoint.
Returns the data directly as Postgres JSON from the database for speed.
Supports composite primary and foreign keys.
Supports an arbitrary depth of nested entities, i.e. tables with long chains of relationship dependencies.
Supports Postgres JSON data fields. This means: we can extract JSON fields in a database table as a separate field in the resulting document.
Customizable document structure.

Requirements

Python 3.6+
Postgres 9.4+
Redis 3.1.0
Elasticsearch 6.3.1+
SQLAlchemy 1.3.4+

Example

Consider this example of a Book library database.

Book

isbn (PK)	title	description
9785811243570	Charlie and the chocolate factory	Willy Wonka’s famous chocolate factory is opening at last!
9788374950978	Kafka on the Shore	Kafka on the Shore is a 2002 novel by Japanese author Haruki Murakami.
9781471331435	1984	1984 was George Orwell’s chilling prophecy about the dystopian future.

Author

id (PK)	name
1	Roald Dahl
2	Haruki Murakami
3	Philip Gabriel
4	George Orwell

BookAuthor

id (PK)	book_isbn	author_id
1	9785811243570	1
2	9788374950978	2
3	9788374950978	3
4	9781471331435	4

With PGSync, we can simply define this JSON schema where the book table is the pivot. A pivot table indicates the root of your document.

{
    "table": "book",
    "columns": [
        "isbn",
        "title",
        "description"
    ],
    "children": [
        {
            "table": "author",
            "columns": [
                "name"
            ]
        }
    ]
}

To get this document structure in Elasticsearch

[
  {
      "isbn": "9785811243570",
      "title": "Charlie and the chocolate factory",
      "description": "Willy Wonka’s famous chocolate factory is opening at last!",
      "authors": ["Roald Dahl"]
  },
  {
      "isbn": "9788374950978",
      "title": "Kafka on the Shore",
      "description": "Kafka on the Shore is a 2002 novel by Japanese author Haruki Murakami",
      "authors": ["Haruki Murakami", "Philip Gabriel"]
  },
  {
      "isbn": "9781471331435",
      "title": "1984",
      "description": "1984 was George Orwell’s chilling prophecy about the dystopian future",
      "authors": ["George Orwell"]
  }
]
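
To make the mapping from the three tables to these documents concrete, here is a pure-Python sketch of the same denormalization. It is an illustration only: PGSync does this work in SQL inside Postgres, and the `build_documents` function and the in-memory rows are assumptions made for the example.

```python
from collections import defaultdict

books = [
    {"isbn": "9785811243570", "title": "Charlie and the chocolate factory",
     "description": "Willy Wonka’s famous chocolate factory is opening at last!"},
    {"isbn": "9788374950978", "title": "Kafka on the Shore",
     "description": "Kafka on the Shore is a 2002 novel by Japanese author Haruki Murakami."},
    {"isbn": "9781471331435", "title": "1984",
     "description": "1984 was George Orwell’s chilling prophecy about the dystopian future."},
]
authors = [
    {"id": 1, "name": "Roald Dahl"},
    {"id": 2, "name": "Haruki Murakami"},
    {"id": 3, "name": "Philip Gabriel"},
    {"id": 4, "name": "George Orwell"},
]
book_authors = [
    {"id": 1, "book_isbn": "9785811243570", "author_id": 1},
    {"id": 2, "book_isbn": "9788374950978", "author_id": 2},
    {"id": 3, "book_isbn": "9788374950978", "author_id": 3},
    {"id": 4, "book_isbn": "9781471331435", "author_id": 4},
]


def build_documents(books, authors, book_authors):
    """Nest author names under each book, with book as the pivot table."""
    name_by_id = {author["id"]: author["name"] for author in authors}
    names_by_isbn = defaultdict(list)
    for link in sorted(book_authors, key=lambda row: row["id"]):
        names_by_isbn[link["book_isbn"]].append(name_by_id[link["author_id"]])
    return [
        {**book, "authors": names_by_isbn.get(book["isbn"], [])}
        for book in books
    ]


docs = build_documents(books, authors, book_authors)
print(docs[1]["authors"])  # ['Haruki Murakami', 'Philip Gabriel']
```

Each book row becomes one document, and the join through BookAuthor collapses into a list of author names.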

Behind the scenes, PGSync generates advanced queries for you, such as:

SELECT 
       JSON_BUILD_OBJECT(
          'isbn', book_1.isbn, 
          'title', book_1.title, 
          'description', book_1.description,
          'authors', anon_1.authors
       ) AS "JSON_BUILD_OBJECT_1",
       book_1.isbn
FROM book AS book_1
LEFT OUTER JOIN
  (SELECT 
          JSON_AGG(anon_2.anon) AS authors,
          book_author_1.book_isbn AS book_isbn
   FROM book_author AS book_author_1
   LEFT OUTER JOIN
     (SELECT 
             author_1.name AS anon,
             author_1.id AS id
      FROM author AS author_1) AS anon_2 ON anon_2.id = book_author_1.author_id
   GROUP BY book_author_1.book_isbn) AS anon_1 ON anon_1.book_isbn = book_1.isbn

You can also configure PGSync to rename attributes via the schema config, for example:

  {
      "isbn": "9781471331435",
      "this_is_a_custom_title": "1984",
      "desc": "1984 was George Orwell’s chilling prophecy about the dystopian future",
      "contributors": ["George Orwell"]
  }
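
The effect of such a rename can be sketched in plain Python. The `rename_fields` function and the `renames` mapping below are hypothetical and only illustrate the result shown above; they do not reproduce PGSync's schema syntax for configuring renames.

```python
def rename_fields(document, renames):
    """Return a copy of `document` with keys renamed according to `renames`."""
    return {renames.get(key, key): value for key, value in document.items()}


document = {
    "isbn": "9781471331435",
    "title": "1984",
    "description": "1984 was George Orwell’s chilling prophecy about the dystopian future",
    "authors": ["George Orwell"],
}

# Hypothetical mapping from original field names to custom names.
renames = {
    "title": "this_is_a_custom_title",
    "description": "desc",
    "authors": "contributors",
}

renamed = rename_fields(document, renames)
print(list(renamed))
```

Fields without an entry in the mapping, such as `isbn` here, keep their original names.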

PGSync addresses the following challenges:

What if we update the author's name in the database?
What if we wanted to add another author for an existing book?
What if we already have lots of documents with the same author and we want to change that author's name?
What if we delete or update an author?
What if we truncate an entire table?
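
These questions share one difficulty: a change to a single child row can affect many documents. As a rough sketch of that fan-out, using the BookAuthor rows from the example, the hypothetical `affected_isbns` helper below finds every book document that embeds a given author. It illustrates the problem only and is not how PGSync resolves changes internally.

```python
book_authors = [
    {"id": 1, "book_isbn": "9785811243570", "author_id": 1},
    {"id": 2, "book_isbn": "9788374950978", "author_id": 2},
    {"id": 3, "book_isbn": "9788374950978", "author_id": 3},
    {"id": 4, "book_isbn": "9781471331435", "author_id": 4},
]


def affected_isbns(book_authors, author_id):
    """Return the ISBNs of all book documents that embed the given author."""
    return sorted(
        {row["book_isbn"] for row in book_authors if row["author_id"] == author_id}
    )


# Renaming author 3 (Philip Gabriel) means this document must be re-indexed.
print(affected_isbns(book_authors, 3))  # ['9788374950978']
```

With deeper nesting, the same lookup has to walk every relationship between the changed table and the pivot.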

Benefits

PGSync is a simple-to-use, out-of-the-box solution for Change Data Capture.
PGSync handles data deletions.
PGSync requires little development effort. You simply define a schema config describing your data.
PGSync generates advanced queries matching your schema directly.
PGSync allows you to easily rebuild your indexes in case of a schema change.
You can expose only the data you require in Elasticsearch.
Supports multiple Postgres schemas for multi-tenant applications.

Contributing

Contributions are very welcome! Check out the Contribution Guidelines for instructions.

Credits

This package was created with Cookiecutter
Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S. and in other countries.

License

This code is released under the GNU Lesser General Public License, version 3.0 (LGPL-3.0).
Please see LICENSE for more details.

You should have received a copy of the GNU Lesser General Public License along with PGSync.
If not, see https://www.gnu.org/licenses/.

Stars: ✭ 205
