
rsdmx


rsdmx: Tools for reading SDMX data and metadata documents in R

Overview

rsdmx is a package to parse/read SDMX data and metadata in R. It provides:

  • a set of classes and methods to read data and metadata documents exchanged through the Statistical Data and Metadata Exchange (SDMX) framework. The package currently focuses on the SDMX XML standard format (SDMX-ML).
  • an interface to SDMX web-services for a list of well-known data providers, such as EUROSTAT, OECD, and others.

Citation

Thank you in advance for citing rsdmx if you use it in your work or publications. Please use the citation provided via the package DOI.

Collating scattered SDMX data sources

Although some R packages built on rsdmx aim to provide a wrapper for a single data source (e.g. OECD, EUROSTAT), it is strongly recommended to rely directly on rsdmx. One main objective of rsdmx is to promote and facilitate collating scattered data from a growing number of SDMX data providers, whatever the organization.

It is already possible to query well-known datasources, using the embedded helpers. Pull requests are welcome to support additional data providers by default in rsdmx.

SDMX standards compliance

SDMX_Compliance_Badge_1.0 SDMX_Compliance_Badge_2.0 SDMX_Compliance_Badge_2.1

Status

Currently, the package can read:

  • Datasets (GenericData, CompactData, StructureSpecificData, StructureSpecificTimeSeriesData, CrossSectionalData, UtilityData and MessageGroup SDMX-ML types)
  • Concepts (Concept, ConceptScheme and Concepts SDMX-ML types)
  • Codelists (Code, Codelist and Codelists SDMX-ML types)
  • DataStructures / KeyFamilies - with all subtypes
  • Data Structure Definitions (DSDs) - with all subtypes

Funding

rsdmx is looking for sponsors. If you have been using rsdmx and wish to support its development, please help us keep the package growing!

Author

Copyright (C) 2014 Emmanuel Blondel

Contributors

  • Matthieu Stigler
  • Eric Persson

Distribution

on CRAN

rsdmx is available on the Comprehensive R Archive Network (CRAN). See the R CRAN check results at: https://cran.r-project.org/web/checks/check_results_rsdmx.html

Please note that following a new submission to CRAN, or a change in CRAN policies, the package might be temporarily archived and removed from CRAN. If you notice that the package does not come back within a short time, please contact me.

on R-Universe

rsdmx is available on the R-Universe public cloud server. The package version corresponds to the ongoing revision (master branch in Github). See https://opensdmx.r-universe.dev/ui#package:rsdmx

Quickstart

rsdmx offers a low-level set of tools to read data and metadata in SDMX format. Its strategy is to make this very easy for the user: a single function, readSDMX, is used whether the document contains data or metadata, and whether the datasource is local or remote.

It is important to highlight that one of the major benefits of rsdmx is that it focuses first on the SDMX format specifications (acting as a format abstraction library). This allows rsdmx to read SDMX data from remote datasources or from local SDMX files. For remote datasources, it also means that rsdmx is not bound to SDMX service specifications and can read a wider range of datasources.

Install rsdmx

rsdmx can be installed from CRAN:

install.packages("rsdmx")

or from its development repository hosted on GitHub (using the devtools package):

devtools::install_github("opensdmx/rsdmx")

Load rsdmx

To load rsdmx in R, do the following:

library(rsdmx)

readSDMX & helper functions

readSDMX as low-level function

readSDMX is first designed as a low-level function that takes either a URL (isURL = TRUE by default) or a file as its parameter. Wherever the SDMX document is located, readSDMX lets you read it, as follows:


  #read a remote file
  sdmx <- readSDMX(file = "someUrl")
  
  #read a local file
  sdmx <- readSDMX(file = "somelocalfile", isURL = FALSE)

In addition, to facilitate querying datasources, readSDMX also provides helpers to query well-known remote datasources. Instead of specifying the entire URL, you specify a simple provider ID and the parameters needed to build an SDMX query (e.g. for a dataset query: operation, key, filter, startPeriod and endPeriod).

This is possible because a list of SDMX service providers is embedded within rsdmx. This list provides all the information readSDMX requires to build the SDMX request (URL) before accessing the datasource.
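
To make the idea concrete, here is a minimal base-R sketch of how a provider ID and a few query parameters could be turned into a request URL. It is not rsdmx's internal code: the `providers` list, the endpoint, and the `build_data_url` function are hypothetical names invented for illustration, and the `data/{flowRef}/{key}` layout with `startPeriod`/`endPeriod` is assumed from the general SDMX REST 2.1 pattern.

```r
# Hypothetical provider registry: maps a provider ID to a repository endpoint.
# The ID and URL are placeholders, not a real service.
providers <- list(
  MYORG = "http://www.myorg.org/sdmx/repository"
)

# Build a data request URL from a provider ID and query parameters.
build_data_url <- function(providerId, flowRef, key = "all",
                           start = NULL, end = NULL) {
  base <- providers[[providerId]]
  if (is.null(base)) {
    stop("Unknown provider: ", providerId)
  }
  url <- sprintf("%s/data/%s/%s", base, flowRef, key)
  # Keep only the time filters that were actually supplied
  params <- c(
    if (!is.null(start)) paste0("startPeriod=", start),
    if (!is.null(end)) paste0("endPeriod=", end)
  )
  if (length(params) > 0) {
    url <- paste0(url, "?", paste(params, collapse = "&"))
  }
  url
}

example_url <- build_data_url("MYORG", "MYSERIE", start = 2000, end = 2015)
print(example_url)
```

The point of the sketch is only that the caller supplies short identifiers while the endpoint details live in the provider list; real providers may need quite different URL layouts.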

get list of SDMX service providers

The list of known SDMX service providers can be queried as follows:


providers <- getSDMXServiceProviders()
as.data.frame(providers)

create/add a SDMX service provider

It is also possible to create a new SDMX service provider and add it to this list (so that readSDMX is aware of it). A provider is created with SDMXServiceProvider and is defined by the following parameters:

  • agencyId (provider identifier)
  • name
  • scale (international or national)
  • country ISO 3-alpha code (if national)
  • builder

The request builder can be created with SDMXRequestBuilder which takes various arguments:

  • regUrl: URL of the service registry endpoint
  • repoUrl: URL of the service repository endpoint (Note that we use 2 different arguments for registry and repository endpoints, since some providers use different URLs, but in most cases those are identical)
  • formatter: list of functions to format the request params (one function per type of resource, e.g. "dataflow", "datastructure", "data")
  • handler: list of functions used to build the web request
  • compliant: logical parameter (whether the request builder is compliant with some web-service specifications)

rsdmx already provides common builders, which can be customized if needed by overriding either the formatter or the handler functions:

  • SDMXREST20RequestBuilder: connector for SDMX REST 2.0 web-services
  • SDMXREST21RequestBuilder: connector for SDMX REST 2.1 web-services
  • SDMXDotStatRequestBuilder: connector for SDMX .Stat ("DotStat") web-services implementations

Let's see it with an example:

First create a request builder for our provider:


myBuilder <- SDMXRequestBuilder(
  regUrl = "http://www.myorg.org/sdmx/registry",
  repoUrl = "http://www.myorg.org/sdmx/repository",
  formatter = list(
    dataflow = function(obj){
      #format each dataflow id with some prefix
      obj@resourceId <- paste0("df_",obj@resourceId)
      return(obj)
    },
    datastructure = function(obj){
      #do nothing
      return(obj)
    },
    data = function(obj){
      #format each dataset id with some prefix
      obj@flowRef <- paste0("data_",obj@flowRef)
      return(obj)
    }
  ),
  handler = list(
    dataflow = function(obj){
      req <- sprintf("%s/dataflow",obj@regUrl)
      return(req)
    },
    datastructure = function(obj){
      req <- sprintf("%s/datastructure",obj@regUrl)
      return(req)
    },
    data = function(obj){
      req <- sprintf("%s/data",obj@regUrl)
      return(req)
    }
  ),
  compliant = FALSE
)

As you can see, we built a custom SDMXRequestBuilder that will be able to create SDMX web-requests for the different resources of a SDMX web-service.

We can create a provider with the above request builder, and add it to the list of known SDMX service providers:


#create the provider
provider <- SDMXServiceProvider(
  agencyId = "MYORG",
  name = "My Organization",
  builder = myBuilder
)

#add it to the list
addSDMXServiceProvider(provider)

#check provider has been added
as.data.frame(getSDMXServiceProviders())


find a SDMX service provider

Another helper lets you check whether a specific provider is known to rsdmx, given its ID:

oecd <- findSDMXServiceProvider("OECD")

readSDMX as helper function

Now that you know how to add an SDMX provider, you can use readSDMX without specifying an entire URL: just specify the agencyId of the provider and the query parameters needed to reach your SDMX document:

sdmx <- readSDMX(providerId = "MYORG", providerKey = NULL, resource = "data", flowRef = "MYSERIE",
                 key = "all", key.mode = "SDMX", start = 2000, end = 2015)

For embedded service providers that require a user authentication/subscription key or token, you can specify it in readSDMX with the providerKey argument. If it is provided and the embedded provider requires a specific key parameter, the key is appended to the SDMX web-request. This is the case, for example, for the new UNESCO SDMX API.

The following sections will show you how to query SDMX documents, by using readSDMX in different ways: either for local or remote files, using readSDMX as low-level or with the helpers (embedded service providers).

Read dataset documents

This section shows how to read SDMX dataset documents.

Read remote datasets

The following code snippet shows you how to read a dataset from a remote data source, taking as example the OECD StatExtracts portal: https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MIG/TOT../OECD?startTime=2000&endTime=2011

myUrl <- "https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MIG/TOT../OECD?startTime=2000&endTime=2011"
dataset <- readSDMX(myUrl)
stats <- as.data.frame(dataset) 

You can try it out with other datasources. The online rsdmx documentation also provides a list of data providers, from both international and national institutions.

The embedded service providers are known by rsdmx, which lets users call readSDMX with the helper parameters. This also applies to any provider that you register in rsdmx.

Let's see what this looks like when querying an OECD datasource:

sdmx <- readSDMX(providerId = "OECD", resource = "data", flowRef = "MIG",
                key = list("TOT", NULL, NULL), start = 2010, end = 2011)
df <- as.data.frame(sdmx)
head(df)

It is also possible to query a dataset together with its "definition", handled in a separate SDMX-ML document named DataStructureDefinition (DSD). It is particularly useful when you want to enrich your dataset with all labels. For this, you need the DSD which contains all reference data.

To do so, you only need to append dsd = TRUE (the default value is FALSE) to the previous request, and specify labels = TRUE when calling as.data.frame, as follows:

sdmx <- readSDMX(providerId = "OECD", resource = "data", flowRef = "MIG",
                key = list("TOT", NULL, NULL), start = 2010, end = 2011,
                dsd = TRUE)
df <- as.data.frame(sdmx, labels = TRUE)
head(df)

Note that if you are reading SDMX-ML documents with the native approach (with URLs) instead of the embedded providers, it is also possible to associate a DSD with a dataset by using the function setDSD. Let's see how it works:

#data without DSD
sdmx.data <- readSDMX(providerId = "OECD", resource = "data", flowRef = "MIG",
                key = list("TOT", NULL, NULL), start = 2010, end = 2011)

#DSD
sdmx.dsd <- readSDMX(providerId = "OECD", resource = "datastructure", resourceId = "MIG")

#associate data and dsd
sdmx.data <- setDSD(sdmx.data, sdmx.dsd)

Read local datasets

This example shows you how to use rsdmx with local SDMX files, previously downloaded from EUROSTAT.

#bulk download from Eurostat
tf <- tempfile(tmpdir = tdir <- tempdir()) #temp file and folder
download.file("https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Frd_e_gerdsc.sdmx.zip", tf)
sdmx_files <- unzip(tf, exdir = tdir)

sdmx <- readSDMX(sdmx_files[2], isURL = FALSE)
stats <- as.data.frame(sdmx)
head(stats)

By default, readSDMX considers the data source is remote. To read a local file, add isURL = FALSE.
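
When a script handles a mix of URLs and local paths, the isURL flag can be derived from the source string itself. The sketch below uses only base R; `is_remote_source` is a hypothetical helper written for this example (it is not part of rsdmx), and the local file name is a placeholder.

```r
# Hypothetical helper: treat anything starting with http://, https:// or
# ftp:// as a remote source, and everything else as a local file path.
is_remote_source <- function(path) {
  grepl("^(https?|ftp)://", path, ignore.case = TRUE)
}

sources <- c(
  "https://stats.oecd.org/restsdmx/sdmx.ashx/GetDataStructure/ALL",
  "data/rd_e_gerdsc.sdmx.xml"  # placeholder local path
)
flags <- is_remote_source(sources)
print(flags)

# With rsdmx loaded, the flag could then be passed on, for example:
# sdmx <- readSDMX(sources[2], isURL = is_remote_source(sources[2]))
```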

Read metadata documents

This section shows how to read SDMX metadata documents, including complete data structure definitions (DSDs).


Data Structures

This example illustrates how to read the complete list of data structures (or key families) from the OECD StatExtracts portal (https://stats.oecd.org):

dsUrl <- "https://stats.oecd.org/restsdmx/sdmx.ashx/GetDataStructure/ALL"
ds <- readSDMX(dsUrl)
dsdf <- as.data.frame(ds)
head(dsdf)

Data Structure Definition (DSD)

This example illustrates how to read a complete DSD using an OECD StatExtracts portal data source.

dsdUrl <- "https://stats.oecd.org/restsdmx/sdmx.ashx/GetDataStructure/TABLE1"
dsd <- readSDMX(dsdUrl)

rsdmx is implemented in an object-oriented way with S4 classes and methods. The properties of S4 objects are called slots and can be accessed with the slot function. The following code snippet extracts the list of codelists contained in the DSD document and reads one codelist as a data.frame.

#get codelists from DSD
cls <- slot(dsd, "codelists")
codelists <- sapply(slot(cls, "codelists"), function(x) slot(x, "id")) #get list of codelists
codelist <- as.data.frame(slot(dsd, "codelists"), codelistId = "CL_TABLE1_FLOWS") #get a codelist

In a similar way, the concepts of the dataset can be extracted from the DSD and read as a data.frame.

#get concepts from DSD
concepts <- as.data.frame(slot(dsd, "concepts"))
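
For readers new to S4, the slot access pattern used in these snippets can be tried offline on a toy object. The classes below (`ToyCodelist`, `ToyCodelists`) are made up for this sketch and only loosely mirror the nesting seen above; they are not the classes rsdmx defines.

```r
library(methods)

# Toy S4 classes: a container holding a list of codelist objects
setClass("ToyCodelist", representation(id = "character", codes = "character"))
setClass("ToyCodelists", representation(codelists = "list"))

cls <- new("ToyCodelists", codelists = list(
  new("ToyCodelist", id = "CL_A", codes = c("A1", "A2")),
  new("ToyCodelist", id = "CL_B", codes = c("B1"))
))

# Same pattern as above: read the list slot, then the id slot of each element
ids <- sapply(slot(cls, "codelists"), function(x) slot(x, "id"))
print(ids)

# The @ operator is an equivalent shorthand for slot()
same_ids <- sapply(cls@codelists, function(x) x@id)
```

Calling slot(object, "name") and writing object@name are interchangeable; slot() is convenient when the slot name is held in a variable.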

Save & Reload SDMX R objects

It is possible to save SDMX R objects as R data files (.RData, .rda, .rds) and reload them later into the R session. This is useful for users who want to keep their SDMX objects in R data files, and also for fast loading of large SDMX objects (e.g. DSD objects) in statistical analyses and R-based web-applications.

To save an SDMX R object to an RData file:

saveSDMX(sdmx, "tmp.RData")

To reload an SDMX R object from an RData file:

sdmx <- readSDMX("tmp.RData", isRData = TRUE)