MartijnSch / amplitude-bigquery

Licence: MIT license
Export your events from Amplitude to Google BigQuery/Google Cloud Storage

Amplitude > Google Cloud Storage > Google BigQuery

Export your Amplitude data to Google BigQuery for big data analysis. The script downloads all events and properties from the Amplitude Export API, parses the data, stores it in Google Cloud Storage as a backup, and prepares a load job for Google BigQuery.

Read more about this integration on the blog of Martijn Scheijbeler.


Features / Support

  • Download data for a full day from Amplitude using the Export API
  • Parse the data to match data types in Google BigQuery
  • Export new parsed files for a load data job in Google BigQuery
  • Store backup data in Google Cloud Storage
  • Clean up after use: all temporary files are deleted
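
As an illustration of the first two features, here is a minimal sketch of fetching and decoding one day of data. It is not the code from amplitude-bigquery.py. It assumes the Export API endpoint `https://amplitude.com/api/2/export` with `start` and `end` parameters in `YYYYMMDDTHH` form, HTTP basic auth with the API key and secret key, and a response that is a zip archive of gzipped files holding one JSON event per line. The function names `export_window`, `download_day` and `read_export_archive` are hypothetical.

```python
import base64
import gzip
import io
import json
import urllib.parse
import urllib.request
import zipfile
from datetime import date


def export_window(day: date) -> dict:
    """Return query parameters covering one full day (hours 00 through 23)."""
    stamp = day.strftime("%Y%m%d")
    return {"start": f"{stamp}T00", "end": f"{stamp}T23"}


def download_day(day: date, api_key: str, secret_key: str) -> bytes:
    """Fetch the raw export archive for one day (assumed endpoint and auth scheme)."""
    query = urllib.parse.urlencode(export_window(day))
    url = "https://amplitude.com/api/2/export?" + query
    token = base64.b64encode(f"{api_key}:{secret_key}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return response.read()


def read_export_archive(archive_bytes: bytes) -> list:
    """Decode a zip of gzipped files that contain one JSON event per line."""
    events = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".gz"):
                continue  # skip directory entries and anything unexpected
            raw = gzip.decompress(archive.read(name))
            for line in raw.decode("utf-8").splitlines():
                if line.strip():
                    events.append(json.loads(line))
    return events
```

Keeping the download separate from the decoding makes it possible to test the parsing step against a small archive built in memory, without calling Amplitude.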

Quick start:

  1. Clone this repository: git clone git@github.com:MartijnSch/amplitude-bigquery.git
  2. Fill in your Amplitude Account ID (ACCOUNT_ID); you can find it in your Amplitude account settings under Settings > Projects
  3. Fill in your Amplitude API Key and Secret Key; you can find both in your Amplitude account settings under Settings > Projects
  4. Create a new project on Google Cloud Platform
  5. Add the Project ID to the script
  6. Activate Google Cloud Storage and Google BigQuery, and fill in your billing details
  7. Create a bucket in Google Cloud storage with two folders: export & import
  8. Load both schemas (bigquery-schema-events.json & bigquery-schema-events-properties.json) into Google BigQuery to create the tables
  9. Adjust the constants in amplitude-bigquery.py
  10. Run the script via: python amplitude-bigquery.py
  11. Look at the backup files in Google Cloud Storage and see the data in Google BigQuery
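
The Google Cloud side of these steps can be sketched roughly as follows. This is an illustration only: it assumes the google-cloud-storage and google-cloud-bigquery client libraries are installed, that the parsed files are newline-delimited JSON, and that a table ID has the form `project.dataset.table`. The helper names, and the choice of which bucket folder receives which file, are hypothetical; the actual script may work differently.

```python
import os


def gcs_uri(bucket_name: str, folder: str, filename: str) -> str:
    """Build the gs:// URI for a file inside one of the bucket folders."""
    return f"gs://{bucket_name}/{folder}/{filename}"


def upload_file(project_id: str, bucket_name: str, folder: str, local_path: str) -> str:
    """Upload a local file to the bucket folder and return its gs:// URI."""
    from google.cloud import storage  # third-party: google-cloud-storage

    filename = os.path.basename(local_path)
    bucket = storage.Client(project=project_id).bucket(bucket_name)
    bucket.blob(f"{folder}/{filename}").upload_from_filename(local_path)
    return gcs_uri(bucket_name, folder, filename)


def load_into_bigquery(project_id: str, table_id: str, uri: str) -> None:
    """Append a newline-delimited JSON file from Cloud Storage to an existing table."""
    from google.cloud import bigquery  # third-party: google-cloud-bigquery

    client = bigquery.Client(project=project_id)
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    # result() blocks until the load job finishes and raises if it failed
    client.load_table_from_uri(uri, table_id, job_config=job_config).result()
```

A call such as `load_into_bigquery(project_id, table_id, upload_file(project_id, bucket_name, "import", path))` would then chain the two steps, with all four arguments supplied from your own configuration.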

BigQuery Schemas

To add the events and event properties to Google BigQuery, you'll need to create two tables. The JSON schemas are included as files in this repository; the Schema Text versions below can be used instead.

Create a new table in BigQuery

Events:

client_event_time:TIMESTAMP,ip_address:STRING,library:STRING,dma:STRING,user_creation_time:TIMESTAMP,insert_id:STRING,schema:INTEGER,processed_time:TIMESTAMP,client_upload_time:TIMESTAMP,app:INTEGER,user_id:STRING,city:STRING,event_type:STRING,device_carrier:STRING,location_lat:STRING,event_time:TIMESTAMP,platform:STRING,is_attribution_event:BOOLEAN,os_version:INTEGER,paying:BOOLEAN,amplitude_id:INTEGER,device_type:STRING,sample_rate:STRING,device_manufacturer:STRING,start_version:STRING,uuid:STRING,version_name:STRING,location_lng:STRING,server_upload_time:TIMESTAMP,event_id:INTEGER,device_id:STRING,device_family:STRING,os_name:STRING,adid:STRING,amplitude_event_type:STRING,device_brand:STRING,country:STRING,device_model:STRING,language:STRING,region:STRING,session_id:INTEGER,idfa:STRING
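
If you would rather generate a JSON schema from the Schema Text form above, a small converter could look like the sketch below. The function name `schema_text_to_fields` is hypothetical, and setting every field to `NULLABLE` is an assumption; the schema files shipped in the repository may differ.

```python
import json


def schema_text_to_fields(schema_text: str) -> list:
    """Convert 'name:TYPE,name:TYPE' text into a list of BigQuery JSON schema fields."""
    fields = []
    for item in schema_text.split(","):
        name, _, field_type = item.strip().partition(":")
        if not name or not field_type:
            raise ValueError(f"Malformed schema entry: {item!r}")
        # mode is assumed; NULLABLE is BigQuery's default when none is given
        fields.append({"name": name, "type": field_type.upper(), "mode": "NULLABLE"})
    return fields


if __name__ == "__main__":
    sample = "insert_id:STRING,event_time:TIMESTAMP,paying:BOOLEAN"
    print(json.dumps(schema_text_to_fields(sample), indent=2))
```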

Events Properties:

property_type:STRING,insert_id:STRING,key:STRING,value:STRING
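
To show how a nested event might map onto this four-column table, here is a hedged sketch that turns each property into one row. It assumes exported events carry their properties in `event_properties` and `user_properties` objects and their insert ID under `$insert_id` (falling back to `insert_id`). The values written to `property_type` are illustrative and may not match what the script stores.

```python
import json


def flatten_properties(event: dict,
                       property_fields=("event_properties", "user_properties")) -> list:
    """Return one row per property, shaped like the Events Properties schema."""
    insert_id = event.get("$insert_id") or event.get("insert_id")
    rows = []
    for field in property_fields:
        for key, value in (event.get(field) or {}).items():
            rows.append({
                "property_type": field,
                "insert_id": insert_id,
                "key": key,
                # the value column is STRING, so non-string values are JSON-encoded
                "value": value if isinstance(value, str) else json.dumps(value),
            })
    return rows
```

Rows built this way can be joined back to the events table on `insert_id`.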

History

March 20, 2018

  • Initial Commit: Add the support for exporting Amplitude data to Google BigQuery.

Want to contribute?

Contributions are welcome! There are just a few requested guidelines:

  • Please create a feature branch for your changes and squash commits.
  • Don't worry about updating the version, changelog, or minified version.
  • Please respect the existing syntax and formatting.
  • If proposing a new feature, it may be a good idea to create an issue first to discuss.

Maintainer history
