All Projects → MonetDB → MonetDBLite-R

MonetDB / MonetDBLite-R

Licence: other
MonetDB reconfigured as an R package - See below for an introduction. Edit

Programming Languages

c
50402 projects - #5 most used programming language
Yacc
648 projects
r
7636 projects
CMake
9771 projects
Makefile
30231 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to MonetDBLite-R

MonetDB
This is the official mirror of the MonetDB Mercurial repository. Please note that we do not accept pull requests on github. The regression test results can be found on the MonetDB Testweb http://monetdb.cwi.nl/testweb/web/status.php .For contributions please see: https://www.monetdb.org/Developers
Stars: ✭ 163 (+158.73%)
Mutual labels:  column-store, sql-database, monetdb
DBD-MariaDB
Perl MariaDB driver
Stars: ✭ 28 (-55.56%)
Mutual labels:  dbi
Php Crud Api
Single file PHP script that adds a REST API to a SQL database
Stars: ✭ 2,904 (+4509.52%)
Mutual labels:  sql-database
taint-with-frida
just an experiment
Stars: ✭ 17 (-73.02%)
Mutual labels:  dbi
talaria
TalariaDB is a distributed, highly available, and low latency time-series database for Presto
Stars: ✭ 148 (+134.92%)
Mutual labels:  column-store
AndroidEasySQL-Library
An Easier & Lazier approach to SQL database for Android
Stars: ✭ 28 (-55.56%)
Mutual labels:  sql-database
Desktop-Applications-JavaFX
JavaFX Open Source Projects
Stars: ✭ 69 (+9.52%)
Mutual labels:  sql-database
bank-statement-analysis
Flask application generating interactive visualisations from bank statements PDF documents
Stars: ✭ 31 (-50.79%)
Mutual labels:  sql-database
boxball
Prebuilt Docker images with Retrosheet's complete baseball history data for many analytical frameworks. Includes Postgres, cstore_fdw, MySQL, SQLite, Clickhouse, Drill, Parquet, and CSV.
Stars: ✭ 79 (+25.4%)
Mutual labels:  column-store
dm
Working with relational data models in R
Stars: ✭ 358 (+468.25%)
Mutual labels:  dbi
casewhen
Create reusable dplyr::case_when() functions
Stars: ✭ 64 (+1.59%)
Mutual labels:  dbi
perl-dbd-firebird
Perl DBI driver for Firebird
Stars: ✭ 14 (-77.78%)
Mutual labels:  dbi
MonetDBLite-Python
MonetDBLite as a Python Package
Stars: ✭ 32 (-49.21%)
Mutual labels:  monetdb
Stampede
The ToroDB solution to provide better analytics on top of MongoDB and make it easier to migrate from MongoDB to SQL
Stars: ✭ 1,754 (+2684.13%)
Mutual labels:  sql-database
DBD-mysql
MySQL driver for the Perl5 Database Interface (DBI)
Stars: ✭ 50 (-20.63%)
Mutual labels:  dbi
migrate-Java-EE-app-to-azure
Migrate an existing Java EE workload to Azure
Stars: ✭ 12 (-80.95%)
Mutual labels:  sql-database
crate ruby
A Ruby client library for CrateDB.
Stars: ✭ 31 (-50.79%)
Mutual labels:  sql-database
activerecord-crate-adapter
Ruby on Rails ActiveRecord adapter for CrateDB
Stars: ✭ 27 (-57.14%)
Mutual labels:  sql-database
databases-w-r
Databases with R, the latest - rstudio::conf2019
Stars: ✭ 33 (-47.62%)
Mutual labels:  dbi
intelli-swift-core
Distributed, Column-oriented storage, Realtime analysis, High performance Database
Stars: ✭ 17 (-73.02%)
Mutual labels:  column-store

This repository has been archived. Please see MonetDB/e (code examples here) for the replacement project.

MonetDBLite for R

Build Status Build Status CRAN_Status_Badge

MonetDBLite for R is a SQL database that runs inside the R environment for statistical computing and does not require the installation of any external software. MonetDBLite is based on free and open-source MonetDB, a product of the Centrum Wiskunde & Informatica.

MonetDBLite is similar in functionality to RSQLite, but typically completes queries blazingly fast due to its columnar storage architecture and bulk query processing model. Since both of these embedded SQL options rely on the the R DBI interface, the conversion of legacy RSQLite project syntax over to MonetDBLite code should be a cinch.

MonetDBLite works seamlessly with the dplyr grammar of data manipulation. For a detailed tutorial of how to work with database-backed dplyr commands, see the dplyr databases vignette. To reproduce this vignette using MonetDBLite rather than RSQLite, simply replace the functions ending with *_sqlite with the suffix *_monetdblite instead.

Installation

  • the latest released version from CRAN with

    install.packages("MonetDBLite")
  • the latest development version from github using devtools

    devtools::install_github("hannesmuehleisen/MonetDBLite-R")
    

If you encounter a bug, please file a minimal reproducible example on github. For questions and other discussion, please use stack overflow with the tag monetdblite. The development version of MonetDBLite endures sisyphean perpetual testing on both unix and windows machines.

Speed Comparisons

MonetDBLite outperforms all other SQL databases currently accessible by the R language and ranks competitively among other High Performace Computing options available to R users. For more detail, see Szilard Pafka's benchmarks.

Painless Startup

If you want to store a database permanently (or to reconnect to a previously-initiated one), set the dbdir to some folder path on your local machine. A new database that you would like to store permanently should be directed to an empty folder:

library(DBI)
dbdir <- "C:/path/to/database_directory"
con <- dbConnect(MonetDBLite::MonetDBLite(), dbdir)

To create a temporary server, create a DBI connection as follows:

library(DBI)
con <- dbConnect(MonetDBLite::MonetDBLite())

Note that the above temporary server command is equivalent to initiating the server in the tempdir() of your R session:

library(DBI)
dbdir <- tempdir()
con <- dbConnect(MonetDBLite::MonetDBLite(), dbdir)

Note that MonetDB may hiccup when using network drives, use servers stored on the same machine as the R session.

Versatile Data Importation

To efficiently copy a data.frame object into a table within the MonetDBLite database, use dbWriteTable:

# directly copy a data.frame object to a table within the database
dbWriteTable(con, "mtcars", mtcars)

To load a CSV file into a table within the database, provide the local file path of a .csv file to dbWriteTable:

# construct an example CSV file on the local disk
csvfile <- tempfile()
write.csv(mtcars, csvfile, row.names = FALSE)

# directly copy a csv file to a table within the database
dbWriteTable(con, "mtcars2", csvfile)

# append the same table to the bottom of the previous table
dbWriteTable(con, "mtcars2", csvfile, append=TRUE)

# overwrite the table with a new table
dbWriteTable(con, "mtcars2", csvfile, overwrite=TRUE)

The SQL interface of MonetDBLite can also be used to manually create a table and import data:

# construct an example CSV file on the local disk
csvfile <- tempfile()
write.csv(mtcars, csvfile, row.names = FALSE)

# start a SQL transaction
dbBegin(con)

# construct an empty table within the database, using a manually-defined structure
dbSendQuery(con, "CREATE TABLE mtcars3 (mpg DOUBLE PRECISION, cyl INTEGER, disp DOUBLE PRECISION, hp INTEGER, drat DOUBLE PRECISION, wt DOUBLE PRECISION, qsec DOUBLE PRECISION, vs INTEGER, am INTEGER, gear INTEGER, carb INTEGER)")

# copy the contents of a CSV file into the database, using the MonetDB COPY INTO command
dbSendQuery(con, paste0("COPY OFFSET 2 INTO mtcars3 FROM '", csvfile, "' USING DELIMITERS ',','\n','\"' NULL as ''"))

# finalize the SQL transaction
dbCommit(con)

Note how we wrap the two commands in a transaction using dbBegin and dbCommit. This creates all-or-nothing semantics. See the MonetDB documentation for details on how to create a table and how to perform bulk input.

Reading and Writing (Queries and Updates)

This section reviews how to pass SQL queries to an embedded server session and then pull those results into R. If you are interested in learning SQL syntax, perhaps review the w3schools SQL tutorial or the MonetDB SQL Reference Manual.

The dbGetQuery function sends a SELECT statement to the server, then returns the result as a data.frame:

# calculate the average miles per gallon, grouped by number of cylinders
dbGetQuery(con, "SELECT cyl, AVG(mpg) FROM mtcars GROUP BY cyl" )

# calculate the number of records in the _mtcars_ table
dbGetQuery(con, "SELECT COUNT(*) FROM mtcars" )

The dbSendQuery function can open a connection to some read-only query. Once initiated, the res object below can then be accessed repeatedly with a fetch command:

res <- dbSendQuery(con, "SELECT wt, gear FROM mtcars")
first_sixteen_records <- fetch(res, n=16)
dbHasCompleted(res)
second_sixteen_records <- fetch(res, n=16)
dbHasCompleted(res)
dbClearResult(res)

The dbSendQuery function should also be used to make edits to tables within the database:

# add a new column of kilometers per liter
dbSendQuery(con, "ALTER TABLE mtcars ADD COLUMN kpl DOUBLE PRECISION" )

# populate that new column with kilometers per liter
dbSendQuery(con, "UPDATE mtcars SET kpl = mpg * 0.425144" )

Glamorous Data Export

The contents of an entire table within the database can be transferred to an R data.frame object with dbReadTable. Since MonetDBLite is most useful for the storage and analysis of large datasets, there might be limited utility to copying an entire table into working RAM in R. The dbReadTable function and a SQL SELECT * FROM tablename command are equivalent:

# directly copy a table within the database to an R data.frame object
x <- dbReadTable(con, "mtcars")

# directly copy a table within the database to an R data.frame object
y <- dbGetQuery(con, "SELECT * FROM mtcars" )

Special database informational functions

Certain administrative commands can be sent using either dbSendQuery or with a custom DBI function:

# remove the table `mtcars2` from the database
dbSendQuery(con, "DROP TABLE mtcars2" )

# remove the table `mtcars3` from the database
dbRemoveTable(con, "mtcars3" )

Other administrative commands can be sent using dbGetQuery or with a custom DBI function:

# list the column names of the mtcars table within the database
names(dbGetQuery(con, "SELECT * FROM mtcars LIMIT 1" ))

# list the column names of the mtcars table within the database
dbListFields(con, "mtcars" )

Still other administrative commands are much easier to simply use the custom DBI function:

# print the names of all tables within the current database
dbListTables(con)

Shutdown

MonetDBLite allows multiple concurrent connections to a single database, but does not allow more than one concurrent embedded server session (actively-running database). This is not an issue for most users since a single database can store thousands of individual tables. To switch between databases, however, the first server must be shut down before the second can be opened. To shutdown a server, include the shutdown=TRUE parameter:

dbDisconnect(con, shutdown=TRUE)

To globally shut down the embedded server session without the con connection object, use:

MonetDBLite::monetdblite_shutdown()

MonetDBLite does not allow multiple R sessions to connect to a single database concurrently. As soon as a single R session loads an embedded server, that server is locked down and inaccessible to other R consoles.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].