An interface to Google's BigQuery from R.

bigrquery

The bigrquery package makes it easy to work with data stored in Google BigQuery by allowing you to query BigQuery tables and retrieve metadata about your projects, datasets, tables, and jobs. The bigrquery package provides three levels of abstraction on top of BigQuery:

  • The low-level API provides thin wrappers over the underlying REST API. All the low-level functions start with bq_, and mostly have the form bq_noun_verb(). This level of abstraction is most appropriate if you’re familiar with the REST API and you want to do something not supported in the higher-level APIs.

  • The DBI interface wraps the low-level API and makes working with BigQuery like working with any other database system. This is the most convenient layer if you want to execute SQL queries in BigQuery or upload smaller amounts (i.e. <100 MB) of data.

  • The dplyr interface lets you treat BigQuery tables as if they are in-memory data frames. This is the most convenient layer if you don’t want to write SQL, but instead want dbplyr to write it for you.
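
As a rough illustration of the bq_noun_verb() naming pattern in the low-level API, the sketch below builds a table reference and asks BigQuery a few questions about it. The helper name describe_table() is made up for this example and is not part of bigrquery; the sketch assumes you have already authenticated.

```r
# Hypothetical helper (not part of bigrquery): collect a few facts about a
# table using the low-level bq_table_*() functions.
describe_table <- function(project, dataset, table) {
  # bq_table() only builds a reference; it does not contact BigQuery.
  tb <- bigrquery::bq_table(project, dataset, table)

  # Each call below sends a request, so credentials are required.
  list(
    exists = bigrquery::bq_table_exists(tb),
    nrow   = bigrquery::bq_table_nrow(tb),
    fields = bigrquery::bq_table_fields(tb)
  )
}

# Usage (requires credentials):
# describe_table("publicdata", "samples", "natality")
```

Each function name pairs a noun (table) with a verb (exists, nrow, fields), which is the convention most other low-level functions follow.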

Installation

The current bigrquery release can be installed from CRAN:

install.packages("bigrquery")

The newest development release can be installed from GitHub:

# install.packages('devtools')
devtools::install_github("r-dbi/bigrquery")

Usage

Low-level API

library(bigrquery)
billing <- bq_test_project() # replace this with your project ID 
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"

tb <- bq_project_query(billing, sql)
bq_table_download(tb, max_results = 10)
#> # A tibble: 10 x 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969     2     4          6.12
#>  2  1969     4    15          6.44
#>  3  1969     4     8          8.88
#>  4  1969     8    15          6.44
#>  5  1969     1    21          7.50
#>  6  1969     4    14          7.06
#>  7  1969    11     3          6.56
#>  8  1969     2     3          8.13
#>  9  1969    11    20          8.19
#> 10  1969     9     1          6.25

DBI

library(DBI)

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
con 
#> <BigQueryConnection>
#>   Dataset: publicdata.samples
#>   Billing: gargle-169921

dbListTables(con)
#> [1] "github_nested"   "github_timeline" "gsod"            "natality"       
#> [5] "shakespeare"     "trigrams"        "wikipedia"

dbGetQuery(con, sql, n = 10)
#> # A tibble: 10 x 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969     2     4          6.12
#>  2  1969     4    15          6.44
#>  3  1969     4     8          8.88
#>  4  1969     8    15          6.44
#>  5  1969     1    21          7.50
#>  6  1969     4    14          7.06
#>  7  1969    11     3          6.56
#>  8  1969     2     3          8.13
#>  9  1969    11    20          8.19
#> 10  1969     9     1          6.25

dplyr

library(dplyr)

natality <- tbl(con, "natality")

natality %>%
  select(year, month, day, weight_pounds) %>% 
  head(10) %>%
  collect()
#> # A tibble: 10 x 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969     3    12          5.81
#>  2  1969     2    18          7.23
#>  3  1969     8    22          7.06
#>  4  1970     4     1          8.56
#>  5  1970     2    20          7.87
#>  6  1970     6    22          6.69
#>  7  1970     4    27          7.50
#>  8  1970     6    21          4.81
#>  9  1969     7     9          6.62
#> 10  1969     8    16          8.44
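
To see the SQL that dbplyr writes on your behalf, you can end a pipeline with show_query() instead of collect(). The sketch below is one possible way to do that; the helper name mean_weight_by_year() is hypothetical, it assumes `con` is a BigQuery connection like the one created in the DBI example, and it uses the native pipe (R 4.1 or later) so that it does not depend on attached packages.

```r
# Hypothetical helper: build a lazy query that averages birth weight per year.
# Nothing is sent to BigQuery until the result is collected.
mean_weight_by_year <- function(con) {
  dplyr::tbl(con, "natality") |>
    dplyr::group_by(year) |>
    dplyr::summarise(mean_weight = mean(weight_pounds, na.rm = TRUE)) |>
    dplyr::arrange(year)
}

# Usage (requires a connection and credentials):
# query <- mean_weight_by_year(con)
# dplyr::show_query(query)  # print the generated SQL without running it
# dplyr::collect(query)     # run the query and download the result
```

Inspecting the SQL before calling collect() is a cheap way to check what will be billed, since the query only runs when results are requested.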

Important details

Authentication and authorization

When using bigrquery interactively, you’ll be prompted to authorize bigrquery in the browser. By default, your token is cached across sessions in the folder ~/.R/gargle/gargle-oauth/. For non-interactive use, prefer a service account token, which you put into effect with bq_auth(path = "/path/to/your/service-account.json"). More places to learn about auth:

  • Help for bigrquery::bq_auth().
  • How gargle gets tokens.
    • bigrquery obtains a token with gargle::token_fetch(), which supports a variety of token flows. This article provides full details, such as how to take advantage of Application Default Credentials or service accounts on GCE VMs.
  • Non-interactive auth. Explains how to set up a project when code must run without any user interaction.
  • How to get your own API credentials. Instructions for getting your own OAuth client (or “app”) or service account token.

Note that bigrquery requests permission to modify your data, but it will never do so unless you explicitly request it (e.g. by calling bq_table_delete() or bq_table_upload()). Our Privacy policy provides more info.
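
For scripts that run without a user present, one possible setup is to keep the path to the service account key in an environment variable and pass it to bq_auth(). This is only a sketch: the helper auth_from_env() and the variable name BIGQUERY_SERVICE_ACCOUNT_JSON are assumptions made for this example, not something bigrquery defines or looks for.

```r
# Hypothetical helper: authenticate with a service account key whose path is
# stored in an environment variable. The variable name is an assumption made
# for this example; bigrquery does not read it on its own.
auth_from_env <- function(var = "BIGQUERY_SERVICE_ACCOUNT_JSON") {
  key_path <- Sys.getenv(var)
  if (!nzchar(key_path) || !file.exists(key_path)) {
    stop("Set ", var, " to the path of a service account JSON key.",
         call. = FALSE)
  }
  # Load the service account token; no browser prompt is involved.
  bigrquery::bq_auth(path = key_path)
  # Report whether a token is now available.
  invisible(bigrquery::bq_has_token())
}

# Usage at the top of a script:
# auth_from_env()
```

Failing early when the key is missing keeps a scheduled job from stalling on an interactive browser prompt.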

Billing project

If you just want to play around with the BigQuery API, it’s easiest to start with Google’s free sample data. You’ll still need to create a project, but if you’re just playing around, it’s unlikely that you’ll go over the free limit (1 TB of queries / 10 GB of storage).

To create a project:

  1. Open https://console.cloud.google.com/ and create a project. Make a note of the “Project ID” in the “Project info” box.

  2. Click on “APIs & Services”, then “Dashboard” in the left menu.

  3. Click on “Enable APIs and Services” at the top of the page, then search for and enable “BigQuery API” and “Cloud Storage”.

Use your project ID as the billing project whenever you work with free sample data, and as the project when you work with your own data.
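
The difference between the project and the billing project can be sketched with the DBI interface. The helper connect_samples() is hypothetical, and "my-project-id" and "my_dataset" in the comments are placeholders for your own values.

```r
# Hypothetical helper: connect to Google's public sample data while billing
# queries to your own project.
connect_samples <- function(billing_project) {
  stopifnot(
    is.character(billing_project),
    length(billing_project) == 1,
    nzchar(billing_project)
  )
  DBI::dbConnect(
    bigrquery::bigquery(),
    project = "publicdata",    # where the data lives
    dataset = "samples",
    billing = billing_project  # which project is charged for queries
  )
}

# Usage (requires credentials; "my-project-id" is a placeholder):
# con <- connect_samples("my-project-id")
#
# For your own data, the same ID fills both roles:
# con <- DBI::dbConnect(
#   bigrquery::bigquery(),
#   project = "my-project-id",
#   dataset = "my_dataset",
#   billing = "my-project-id"
# )
```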

Policies

Please note that the ‘bigrquery’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Privacy policy
