Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → Powerspace → Pg2bq

Powerspace / Pg2bq

Licence: mit

Export PostgreSQL tables to Google BigQuery

Programming Languages

5932 projects

Labels

postgresql bigquery

Projects that are alternatives of or similar to Pg2bq

DDL parase and Convert to BigQuery JSON schema and DDL statements

Stars: ✭ 52 (+73.33%)

Mutual labels: bigquery, postgresql

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Stars: ✭ 20,147 (+67056.67%)

Mutual labels: bigquery, postgresql

Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake

Stars: ✭ 68 (+126.67%)

Mutual labels: bigquery, postgresql

🍟 a notebook sql client. what you get when have a lot of sequels.

Stars: ✭ 3,823 (+12643.33%)

Mutual labels: bigquery, postgresql

📊 Cube — Open-Source Analytics API for Building Data Apps

Stars: ✭ 11,983 (+39843.33%)

Mutual labels: bigquery, postgresql

tbls is a CI-Friendly tool for document a database, written in Go.

Stars: ✭ 940 (+3033.33%)

Mutual labels: bigquery, postgresql

Copostgresquery

Query builder for PostgreSQL in Node.js, built for async

Stars: ✭ 12 (-60%)

Mutual labels: postgresql

Pgsql Cluster Manager

Daemon and migration tool that manages Postgres cluster using etcd/corosync/pacemaker

Stars: ✭ 15 (-50%)

Mutual labels: postgresql

Netkiller.github.io

Netkiller Free ebook - 免费电子书

Stars: ✭ 861 (+2770%)

Mutual labels: postgresql

An opinionated Koa starter kit

Stars: ✭ 10 (-66.67%)

Mutual labels: postgresql

The ZeeQL (EOF/CoreData/AR like) Database Toolkit for Swift

Stars: ✭ 29 (-3.33%)

Mutual labels: postgresql

not just an rss reader

Stars: ✭ 21 (-30%)

Mutual labels: postgresql

Express Starter

Express Starter

Stars: ✭ 14 (-53.33%)

Mutual labels: postgresql

Add table-level database locking to ActiveRecord

Stars: ✭ 12 (-60%)

Mutual labels: postgresql

Treefrog Framework

TreeFrog Framework : High-speed C++ MVC Framework for Web Application

Stars: ✭ 885 (+2850%)

Mutual labels: postgresql

Crunchy Containers

Containers for Managing PostgreSQL on Kubernetes by Crunchy Data

Stars: ✭ 870 (+2800%)

Mutual labels: postgresql

SQL generator and result set mapper for Postgres and Kotlin

Stars: ✭ 21 (-30%)

Mutual labels: postgresql

Open-source subscription management & billing automation system

Stars: ✭ 857 (+2756.67%)

Mutual labels: postgresql

Node.js application - simple notes management using Express, Postgres, Objection.js, Docker, Socket.io, Bluebird Promises

Stars: ✭ 14 (-53.33%)

Mutual labels: postgresql

Postgresql Postgis Timescaledb

PostgreSQL + PostGIS + TimescaleDB docker image 🐘🌎📈

Stars: ✭ 19 (-36.67%)

Mutual labels: postgresql

View All Similar Projects ➔

pg2bq

pg2bq is made for one thing: export tables from PostgreSQL to Google BigQuery.

Why ?

It's useful to keep the data at both places "in-sync" (using cron, airflow, or whatever to schedule the export on a regular basis). If your metadata are on PostgreSQL, but your realtime data are in BigQuery, it's probable you want to join them.

Note: internally, it is using the Spark framework for the sake of simplicity, but no Hadoop cluster is needed. It is configured as a "local" cluster by default, meaning the application is running standalone.

How to run it

Download the release made on GitHub: pg2bq-1.0.3.zip
Create a Service Account to Google Cloud which has access to GCS and BigQuery, and create a json key
Create a configuration file configuration.conf for pg2bq to know where to grab and put the data (HOCON):

jdbc {
  url = "jdbc:postgresql://mypg:5432/mydb"
  user = "myuser"
  password = "mypwd"
  tables = [ "user", "campaign", "website" ]
}

gcloud {
  project = "gcloud-project-id"
  service-account-key-path = "/path/to/service-account-key.json"
  bq.dataset = "mypg"
  gcs.tmp-bucket = "pg-export-tmp"
}

Run the application specifying the config file:

GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json ./bin/pg2db -Dconfig.file=configuration.conf

Done!

Add this to a scheduler every 10min and enjoy your JOINs in BigQuery.

What does it do exactly ?

It exports the data from the tables into DataFrames
It saves them into GCS as .avro to keep the schema along the data: this will avoid to specify/create the BigQuery table schema beforehands.
It starts BigQuery jobs to import those .avro into the respective BigQuery tables.

Development

To run the application in dev mode, create a proper application.conf and run the app:

via the IDE, just run Main
via sbt, by pre-packaging the whole thing:

$ sbt stage && ./target/universal/stage/bin/pg2bq

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 30

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (7) 🔗