snowplow-incubator / snowplow-bigquery-loader

Licence: other
Loads Snowplow enriched events into Google BigQuery

Projects that are alternatives of or similar to snowplow-bigquery-loader

Bitcoin Etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (+1060%)
Mutual labels:  bigquery, gcp
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+253.33%)
Mutual labels:  bigquery, gcp
flight2bq
RTLSDR ADS-B dump1090 to Google BigQuery
Stars: ✭ 33 (+120%)
Mutual labels:  bigquery, google-bigquery
Ethereum Etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 956 (+6273.33%)
Mutual labels:  bigquery, gcp
etlflow
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.
Stars: ✭ 38 (+153.33%)
Mutual labels:  bigquery, gcp
Ethereum Etl Airflow
Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. What datasets do you want to be added to Ethereum ETL? Vote here: https://blockchain-etl.convas.io.
Stars: ✭ 89 (+493.33%)
Mutual labels:  bigquery, gcp
bigflow
A Python framework for data processing on GCP.
Stars: ✭ 96 (+540%)
Mutual labels:  bigquery, gcp
Datashare Toolkit
DIY commercial datasets on Google Cloud Platform
Stars: ✭ 41 (+173.33%)
Mutual labels:  bigquery, gcp
iris3
An upgraded and improved version of the Iris automatic GCP-labeling project
Stars: ✭ 38 (+153.33%)
Mutual labels:  bigquery, gcp
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+280%)
Mutual labels:  gcp, google-bigquery
hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
Stars: ✭ 16 (+6.67%)
Mutual labels:  bigquery, gcp
gcp-ml
Google Cloud Platform Machine Learning Samples
Stars: ✭ 31 (+106.67%)
Mutual labels:  bigquery, gcp
objectiv-analytics
Powerful product analytics for data teams, with full control over data & models.
Stars: ✭ 399 (+2560%)
Mutual labels:  bigquery, snowplow
argon
Campaign Manager 360 and Display & Video 360 Reports to BigQuery connector
Stars: ✭ 31 (+106.67%)
Mutual labels:  bigquery, gcp
laravel-big
Google BigQuery for Laravel
Stars: ✭ 14 (-6.67%)
Mutual labels:  bigquery, google-bigquery
dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (+100%)
Mutual labels:  bigquery
gke-vault-demo
This demo builds two GKE Clusters and guides you through using secrets in Vault, using Kubernetes authentication from within a pod to login to Vault, and fetching short-lived Google Service Account credentials on-demand from Vault within a pod.
Stars: ✭ 63 (+320%)
Mutual labels:  gcp
gcp-class-1
Google Cloud class 1
Stars: ✭ 14 (-6.67%)
Mutual labels:  gcp
ob google-bigquery
This service is meant to simplify running Google Cloud operations, especially BigQuery tasks. This means you do not have to worry about installation, configuration or ongoing maintenance related to an SDK environment. This can be helpful to those who would prefer not to be responsible for those activities.
Stars: ✭ 43 (+186.67%)
Mutual labels:  bigquery
typed-css-modules-loader
💠 Webpack loader for typed-css-modules auto-creation
Stars: ✭ 62 (+313.33%)
Mutual labels:  loader

Snowplow BigQuery Loader

This project contains applications used to load Snowplow enriched data into Google BigQuery.

Quickstart

Assuming git and sbt are installed:

$ git clone https://github.com/snowplow-incubator/snowplow-bigquery-loader
$ cd snowplow-bigquery-loader
$ sbt "project loader" test
$ sbt "project streamloader" test
$ sbt "project mutator" test
$ sbt "project repeater" test
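
The four test commands above can also be run in a single loop. The following is a minimal sketch, not part of the project: the function name run_all_tests and the SBT override variable are hypothetical, introduced only so the loop can be dry-run with echo before invoking sbt for real.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the project.
# SBT can be overridden (for example SBT=echo) to print the arguments
# instead of launching sbt.
SBT="${SBT:-sbt}"

run_all_tests() {
  # Subproject names are taken from the Quickstart commands.
  for module in loader streamloader mutator repeater; do
    # Stop at the first subproject whose tests fail.
    "$SBT" "project $module" test || return 1
  done
}
```

After sourcing the snippet from the repository root, calling run_all_tests runs each subproject's tests in turn and returns a non-zero status as soon as one of them fails.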

Benchmarks

This project comes with sbt-jmh.

To run a specific benchmark test:

$ sbt 'project benchmark' '+jmh:run -i 20 -wi 10 -f2 -t3 .*TransformAtomic.*'

Or, to run all benchmark tests (once more are added):

$ sbt 'project benchmark' '+jmh:run -i 20 -wi 10 -f2 -t3'

The warm-up and iteration counts used above are those recommended by the sbt-jmh project; they can be lowered for faster runs.
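
As an illustration of lowering those counts, here is a small sketch that assembles the jmh:run argument string from the same flags used above. The helper name jmh_args and its defaults of 5 iterations and 3 warm-ups are assumptions chosen for illustration, not values recommended by sbt-jmh or by this project; fewer iterations will generally give noisier measurements.

```shell
#!/bin/sh
# Hypothetical helper: prints a jmh:run argument string.
#   $1 = measurement iterations (-i), default 5 (arbitrary)
#   $2 = warm-up iterations (-wi),    default 3 (arbitrary)
#   $3 = optional benchmark name pattern
jmh_args() {
  iterations="${1:-5}"
  warmups="${2:-3}"
  pattern="${3:-}"
  # -f2 and -t3 are kept exactly as in the commands above.
  args="+jmh:run -i $iterations -wi $warmups -f2 -t3"
  if [ -n "$pattern" ]; then
    args="$args $pattern"
  fi
  printf '%s\n' "$args"
}
```

A quicker run of the single benchmark could then look like: sbt 'project benchmark' "$(jmh_args 5 3 '.*TransformAtomic.*')".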

To see all sbt-jmh options, run jmh:run -h.

Add new benchmarks to this module.

Building fatjars

You can build the jar files for Mutator, Repeater and Streamloader with sbt like so:

$ sbt clean 'project mutator' assembly
$ sbt clean 'project repeater' assembly
$ sbt clean 'project streamloader' assembly

Find out more

Technical Docs Setup Guide Contributing

Copyright and License

Snowplow BigQuery Loader is copyright 2018-2022 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
