GoogleCloudPlatform / Datashare Toolkit

Licence: apache-2.0
DIY commercial datasets on Google Cloud Platform

Programming Languages

javascript

Projects that are alternatives of or similar to Datashare Toolkit

Ethereum Etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 956 (+2231.71%)
Mutual labels:  gcp, bigquery
hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
Stars: ✭ 16 (-60.98%)
Mutual labels:  bigquery, gcp
Ethereum Etl Airflow
Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. What datasets do you want to be added to Ethereum ETL? Vote here: https://blockchain-etl.convas.io.
Stars: ✭ 89 (+117.07%)
Mutual labels:  gcp, bigquery
bigflow
A Python framework for data processing on GCP.
Stars: ✭ 96 (+134.15%)
Mutual labels:  bigquery, gcp
etlflow
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.
Stars: ✭ 38 (-7.32%)
Mutual labels:  bigquery, gcp
gcp-ml
Google Cloud Platform Machine Learning Samples
Stars: ✭ 31 (-24.39%)
Mutual labels:  bigquery, gcp
Bitcoin Etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (+324.39%)
Mutual labels:  gcp, bigquery
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+29.27%)
Mutual labels:  bigquery, gcp
iris3
An upgraded and improved version of the Iris automatic GCP-labeling project
Stars: ✭ 38 (-7.32%)
Mutual labels:  bigquery, gcp
argon
Campaign Manager 360 and Display & Video 360 Reports to BigQuery connector
Stars: ✭ 31 (-24.39%)
Mutual labels:  bigquery, gcp
snowplow-bigquery-loader
Loads Snowplow enriched events into Google BigQuery
Stars: ✭ 15 (-63.41%)
Mutual labels:  bigquery, gcp
Ultimate Metatags
A large snippet for your page's <head> that includes all the meta tags you'll need for OPTIMAL sharing and SEO. Extensive work has been put into ensuring you have the optimal images for the most important social media platforms.
Stars: ✭ 24 (-41.46%)
Mutual labels:  sharing
Capella Tray
Upload screenshots instantly to Capella and get link directly to clipboard
Stars: ✭ 23 (-43.9%)
Mutual labels:  sharing
Ever
Ever® - Open-Source Commerce Platform for On-Demand Economy and Digital Marketplaces
Stars: ✭ 980 (+2290.24%)
Mutual labels:  marketplace
Spring Petclinic Gcp
Spring PetClinic Microservices on GCP
Stars: ✭ 22 (-46.34%)
Mutual labels:  gcp
Openfaas Gke
Running OpenFaaS on Google Kubernetes Engine
Stars: ✭ 30 (-26.83%)
Mutual labels:  gcp
Fsfirestore
Functional F# library to access Firestore database hosted on Google Cloud Platform (GCP) or Firebase.
Stars: ✭ 22 (-46.34%)
Mutual labels:  gcp
Opshell
DevOps Toolkit for Every Cloud on Every Cloud
Stars: ✭ 19 (-53.66%)
Mutual labels:  gcp
Dataflow Tutorial
Cloud Dataflow Tutorial for Beginners
Stars: ✭ 17 (-58.54%)
Mutual labels:  bigquery
Secrets Store Csi Driver Provider Gcp
Google Secret Manager provider for the Secret Store CSI Driver.
Stars: ✭ 40 (-2.44%)
Mutual labels:  gcp

Datashare Toolkit

DIY commercial datasets on Google Cloud Platform

This is not an officially supported Google product.

The Datashare Toolkit is a solution for data publishers to easily manage datasets residing within BigQuery. The toolkit includes functionality to ingest and entitle data, relieving consumers from much of the toil involved in onboarding datasets from a variety of providers. Publishers upload data files to a storage bucket and allocate permissioned datasets for their consumers to use with BigQuery authorized views.

While these tools are used for data management and entitlement, they follow a bring-your-own-license (BYOL) model for entitling publisher data. Publishers should therefore already have licensing arrangements with the consumers wishing to access their data within GCP, and those consumers must furnish the GCP account IDs corresponding to their entitled user principals. These account IDs are required for the creation of the authorized views.
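To illustrate how consumer account IDs feed into authorized views, here is a minimal sketch that builds BigQuery dataset `access` entries as plain objects. It is not taken from the Datashare source: the helper names and the example project, dataset, and email values are hypothetical. The `role`, `userByEmail`, `groupByEmail`, and `view` fields follow the BigQuery dataset resource format.

```javascript
// Hypothetical helpers, not part of Datashare. They only build plain objects
// shaped like the `access` entries of a BigQuery dataset resource.

// Read access for each entitled consumer on the dataset that holds the view.
function consumerAccessEntries(principals) {
  return principals.map(({ type, email }) => {
    if (type === 'user') return { role: 'READER', userByEmail: email };
    if (type === 'group') return { role: 'READER', groupByEmail: email };
    throw new Error(`Unsupported principal type: ${type}`);
  });
}

// Entry added to the SOURCE dataset so the view may read its tables
// on behalf of consumers who have no direct access to that dataset.
function authorizedViewEntry(projectId, datasetId, tableId) {
  return { view: { projectId, datasetId, tableId } };
}

// Example values below are made up for illustration.
const exampleConsumers = [
  { type: 'user', email: 'analyst@example.com' },
  { type: 'group', email: 'licensed-team@example.com' },
];
console.log(JSON.stringify(consumerAccessEntries(exampleConsumers), null, 2));
console.log(JSON.stringify(
  authorizedViewEntry('examplecorp-public', 'consumer_views', 'prices_view'),
  null,
  2
));
```

In this sketch the consumer entries would be appended to the access list of the dataset containing the view, and the view entry to the access list of the source dataset. The consumer can then query the view without being granted access to the underlying tables.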

The toolkit is open source. As a prerequisite, publishers must maintain some limited supporting infrastructure within GCP, such as storage buckets, serverless functions, and BigQuery datasets. Once a consumer's GCP accounts are added to the publisher's entitlements, the published data can be queried directly within BigQuery, ready to integrate into the consumer's analytics workflow, machine learning model, or runtime application. While consumers are billed for BigQuery compute and networking, publishers incur costs only on the storage of their data in BigQuery and Cloud Storage.

Key Features

Getting started with Datashare

If you plan to use GCP Marketplace integration, the production project that you install and manage Datashare from must follow the required naming convention (punctuation and spaces not allowed): [yourcompanyname]-public.
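The naming convention can be checked before creating the project. The following is a minimal sketch, not part of Datashare: the function name is hypothetical, and restricting the company-name part to lowercase letters and digits starting with a letter is an assumption layered on top of the stated rule (no punctuation or spaces).

```javascript
// Hypothetical check for the [yourcompanyname]-public convention.
// Assumption: the company-name part is lowercase letters and digits only,
// beginning with a letter; the text itself only forbids punctuation and spaces.
const MARKETPLACE_PROJECT_PATTERN = /^[a-z][a-z0-9]*-public$/;

function followsMarketplaceNaming(projectName) {
  return MARKETPLACE_PROJECT_PATTERN.test(projectName);
}

// Example names are made up for illustration.
for (const name of ['examplecorp-public', 'example corp-public', 'examplecorp']) {
  console.log(name, followsMarketplaceNaming(name));
}
```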

  1. Set up the Datashare API Manager Service Account
  2. Set up your domain
  3. Set up the OAuth credential
  4. Deploy Datashare
  5. Initialize Schema

To get started after deployment, see the User Guide for usage information.

Updating Datashare

Requirements

Publishers

  • A GCP account with billing enabled
  • A Google Cloud Storage bucket to store staged data

Consumers

  • A valid Google Account or Google Group email address (this includes G Suite and Gmail email addresses).
    Note: Consumers can create a Google Account using an existing email address.
  • Entitlements granted by the publisher to your specific licensed datasets

Architecture

[Figure: Architecture]

Disclaimers

This is not an officially supported Google product.

Datashare is under active development. Interfaces and functionality may change at any time.

License

This repository is licensed under the Apache 2 license (see LICENSE).

Contributions are welcome. See CONTRIBUTING for more information.
