
allegro / bigflow

Licence: other
A Python framework for data processing on GCP.



BigFlow

Documentation

  1. What is BigFlow?
  2. Getting started
  3. Installing BigFlow
  4. Help me
  5. BigFlow tutorial
  6. CLI
  7. Configuration
  8. Project structure and build
  9. Deployment
  10. Workflow & Job
  11. Starter
  12. Technologies
  13. Development

Cookbook

What is BigFlow?

BigFlow is a Python framework for data processing pipelines on GCP.

The main features are:

Getting started

Start by installing BigFlow on your local machine. Next, go through the BigFlow tutorial.

Installing BigFlow

Prerequisites. Before you start, make sure you have the following software installed:

  1. Python >= 3.7
  2. Google Cloud SDK
  3. Docker Engine
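
The prerequisites above can be checked with a short script. This is a minimal sketch, not part of BigFlow: the function name `missing_prerequisites` is hypothetical, and the executable names `gcloud` and `docker` are assumptions about how the Google Cloud SDK and Docker Engine appear on your PATH.

```python
import shutil
import sys

# Minimum Python version, taken from the prerequisites list.
MIN_PYTHON = (3, 7)

# Assumed executable names for the two external tools.
REQUIRED_TOOLS = {"Google Cloud SDK": "gcloud", "Docker Engine": "docker"}


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; empty if nothing is missing."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python >= 3.7 is required")
    for label, executable in REQUIRED_TOOLS.items():
        # shutil.which returns None when the executable is not on PATH.
        if which(executable) is None:
            problems.append(f"{label} not found on PATH (looked for '{executable}')")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("MISSING:", problem)
```

Running the script prints one line per missing prerequisite and nothing when everything is found. It only checks that the tools are present, not that they are configured.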

You can install the bigflow package globally, but we recommend installing it locally with venv, in your project's folder:

python -m venv .bigflow_env
source .bigflow_env/bin/activate

Install the bigflow package with pip:

pip install "bigflow[bigquery,dataflow]"

Test it:

bigflow -h
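
If `bigflow -h` fails, it can help to confirm that the active interpreter is the one from your virtual environment and that the package is visible to it. The sketch below uses only the standard library. The helper names are hypothetical, and the import name `bigflow` is assumed to match the pip package name.

```python
import importlib.util
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter runs inside a venv (prefix differs from base prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return prefix != base_prefix


def is_installed(module_name="bigflow"):
    """True when the current interpreter can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("bigflow importable:", is_installed())
```

If the first check reports False, activate the environment again with `source .bigflow_env/bin/activate` before reinstalling.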

Read more about BigFlow CLI.

To interact with GCP you need to set a default project and log in:

gcloud config set project <your-gcp-project-id>
gcloud auth application-default login
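
To confirm from Python that a default project is set, you could read it back with `gcloud config get-value project`. This is a rough sketch under assumptions: the helper name `current_gcp_project` is hypothetical, and treating empty output or the literal `(unset)` as "no project" is an assumption about gcloud's output rather than documented BigFlow behaviour.

```python
import subprocess


def current_gcp_project(run=subprocess.run):
    """Return the default gcloud project id, or None if it is unset or gcloud is missing."""
    try:
        result = run(
            ["gcloud", "config", "get-value", "project"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The gcloud executable is not on PATH.
        return None
    project = result.stdout.strip()
    # Assumption: an unset project yields empty stdout or "(unset)".
    if result.returncode != 0 or project in ("", "(unset)"):
        return None
    return project


if __name__ == "__main__":
    print("default GCP project:", current_gcp_project())
```

The `run` parameter exists only so the function can be exercised without a real gcloud installation.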

Finally, check if your Docker is running:

docker info
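
The same check can be scripted, for example as a guard before a build step. This minimal sketch assumes that `docker info` exits with a non-zero status when the daemon is unreachable, which may vary between Docker versions. The function name `docker_is_running` is hypothetical.

```python
import subprocess


def docker_is_running(run=subprocess.run):
    """Return True if `docker info` succeeds, False otherwise."""
    try:
        result = run(["docker", "info"], capture_output=True, text=True)
    except FileNotFoundError:
        # The docker executable is not on PATH at all.
        return False
    # Assumption: a non-zero exit status means the daemon could not be reached.
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker daemon reachable:", docker_is_running())
```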

Help me

You can ask questions on our Gitter channel or on Stack Overflow.
