ropensci / oai

OAI-PMH R client

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

oai is an R client for working with OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) services. The protocol was developed by the Open Archives Initiative (https://en.wikipedia.org/wiki/Open_Archives_Initiative) and uses the XML data format transported over HTTP.
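
To make the "XML over HTTP" part concrete: an OAI-PMH request is a base URL plus a verb and a few query arguments. The sketch below assembles such a URL by hand in base R. build_oai_url() is a hypothetical helper written for this illustration and is not part of oai; the verb and argument names (ListIdentifiers, metadataPrefix, from) come from the OAI-PMH protocol, and the base URL is the GBIF endpoint used later in this README.

```r
# Sketch only: build an OAI-PMH request URL by hand (base R, no network).
# build_oai_url() is a hypothetical helper, not an oai function.
build_oai_url <- function(base_url, verb, ...) {
  args <- c(list(verb = verb), list(...))
  # Percent-encode each value so reserved characters such as ':' are escaped
  values <- vapply(
    args,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(args), "=", values, collapse = "&"))
}

url <- build_oai_url(
  "http://api.gbif.org/v1/oai-pmh/registry",
  verb = "ListIdentifiers",
  metadataPrefix = "oai_dc",
  from = "2018-05-01"
)
print(url)
```

In normal use you would not build these URLs yourself; the functions shown in the rest of this README do it for you.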

oai is built on xml2 and httr. In addition, we return data.frames whenever possible to make data comprehension, manipulation, and visualization easier. We also have functions to fetch a large directory of OAI-PMH services; it isn't exhaustive, but it does contain a lot.

Instead of paging with, e.g., page and per_page parameters, OAI-PMH optionally uses resumptionTokens, which may carry an expiration date. These tokens can be used to continue on to the next chunk of data if the first request did not get to the end. Often, OAI-PMH services limit each request to 50 records, but this may vary by provider. The API of this package is such that we run a while loop for you internally until we get all records. We may in the future expose, e.g., a limit parameter so you can say how many records you want, but we haven't done this yet.
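
The token loop described above can be sketched in base R against a mock provider, so it runs without a network connection. This is an illustration of the idea and not the package's internal code: fetch_page() and harvest_all() are hypothetical names, and the chunk size of 50 and total of 120 records are made-up values.

```r
# Sketch of resumptionToken paging with a mock provider (no network).
# fetch_page() and harvest_all() are hypothetical, not oai functions.

# Mock provider: serves `total` identifiers in chunks of `chunk` and returns
# a token while more data remains; a NULL token means the list is complete.
fetch_page <- function(token = NULL, chunk = 50L, total = 120L) {
  start <- if (is.null(token)) 1L else as.integer(token)
  end <- min(start + chunk - 1L, total)
  list(
    records = sprintf("id-%03d", start:end),
    token = if (end < total) as.character(end + 1L) else NULL
  )
}

# Keep requesting chunks until the provider stops handing back a token
harvest_all <- function(fetch) {
  page <- fetch(NULL)
  out <- list(page$records)
  while (!is.null(page$token)) {
    page <- fetch(page$token)
    out[[length(out) + 1L]] <- page$records
  }
  unlist(out)
}

ids <- harvest_all(fetch_page)
length(ids)
```

A real provider returns the token inside the XML response, and it may expire between requests, so a production loop would also need error handling that this sketch leaves out.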

Install

Install from CRAN

install.packages("oai")

Development version

devtools::install_github("ropensci/oai")
library("oai")

Identify

id("http://oai.datacite.org/oai")
#>   repositoryName                      baseURL protocolVersion
#> 1       DataCite https://oai.datacite.org/oai             2.0
#>             adminEmail    earliestDatestamp deletedRecord          granularity
#> 1 [email protected] 2011-01-01T00:00:00Z    persistent YYYY-MM-DDThh:mm:ssZ
#>   compression compression.1                                    description
#> 1        gzip       deflate oaioai.datacite.org:oai:oai.datacite.org:12425

ListIdentifiers

list_identifiers(from = '2018-05-01T', until = '2018-06-01T')
#> # A tibble: 85 x 5
#>    identifier         datestamp     setSpec             setSpec.1      setSpec.2
#>    <chr>              <chr>         <chr>               <chr>          <chr>    
#>  1 cf7fbc99-de82-41a… 2018-05-31T1… installation:791e3… dataset_type:… country:…
#>  2 09d5405e-ca86-45f… 2018-05-30T1… installation:804b8… dataset_type:… country:…
#>  3 4b64d1f2-31c2-40c… 2018-05-30T1… installation:804b8… dataset_type:… country:…
#>  4 884378d6-d591-476… 2018-05-29T1… installation:a1650… dataset_type:… country:…
#>  5 18799ce9-1a66-40f… 2018-05-14T1… installation:d1b0a… dataset_type:… country:…
#>  6 7e91aacb-c994-41e… 2018-05-21T1… installation:d5b61… dataset_type:… country:…
#>  7 f83746ee-4cf2-4e6… 2018-05-08T0… installation:c4195… dataset_type:… country:…
#>  8 a3533a61-6f88-443… 2018-05-08T1… installation:06d75… dataset_type:… country:…
#>  9 ba9b66a3-2d11-419… 2018-05-05T2… installation:d1b0a… dataset_type:… country:…
#> 10 78b696d9-8f0d-41a… 2018-05-05T2… installation:d1b0a… dataset_type:… country:…
#> # … with 75 more rows

Count Identifiers

count_identifiers()
#>                            url   count
#> 1 http://export.arxiv.org/oai2 1586724

ListRecords

list_records(from = '2018-05-01T', until = '2018-05-15T')
#> # A tibble: 42 x 26
#>    identifier datestamp setSpec setSpec.1 setSpec.2 title publisher identifier.1
#>    <chr>      <chr>     <chr>   <chr>     <chr>     <chr> <chr>     <chr>       
#>  1 18799ce9-… 2018-05-… instal… dataset_… country:… Bird… Sokoine … https://www…
#>  2 f83746ee-… 2018-05-… instal… dataset_… country:… NDFF… Dutch Na… https://www…
#>  3 a3533a61-… 2018-05-… instal… dataset_… country:… EDP … EDP - En… https://www…
#>  4 ba9b66a3-… 2018-05-… instal… dataset_… country:… Ende… Sokoine … https://www…
#>  5 78b696d9-… 2018-05-… instal… dataset_… country:… Ende… Sokoine … https://www…
#>  6 c791b255-… 2018-05-… instal… dataset_… country:… Ende… Sokoine … https://www…
#>  7 b929ccda-… 2018-05-… instal… dataset_… country:… List… Sokoine … https://www…
#>  8 da285c2a-… 2018-05-… instal… dataset_… country:… Moni… Corporac… https://www…
#>  9 87372877-… 2018-05-… instal… dataset_… country:… Moni… Corporac… https://www…
#> 10 ed7d4c25-… 2018-05-… instal… dataset_… country:… Samo… Ministry… https://www…
#> # … with 32 more rows, and 18 more variables: subject <chr>, source <chr>,
#> #   description <chr>, description.1 <chr>, type <chr>, creator <chr>,
#> #   date <chr>, language <chr>, coverage <chr>, coverage.1 <chr>, format <chr>,
#> #   source.1 <chr>, subject.1 <chr>, creator.1 <chr>, coverage.2 <chr>,
#> #   description.2 <chr>, creator.2 <chr>, subject.2 <chr>

GetRecords

ids <- c("87832186-00ea-44dd-a6bf-c2896c4d09b4", "d981c07d-bc43-40a2-be1f-e786e25106ac")
get_records(ids)
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`$header
#> # A tibble: 1 x 3
#>   identifier              datestamp      setSpec                                
#>   <chr>                   <chr>          <chr>                                  
#> 1 87832186-00ea-44dd-a6b… 2018-06-29T12… installation:729a7375-b120-4e4f-bb81-a…
#> 
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`$metadata
#> # A tibble: 0 x 0
#> 
#> 
#> $`d981c07d-bc43-40a2-be1f-e786e25106ac`
#> $`d981c07d-bc43-40a2-be1f-e786e25106ac`$header
#> # A tibble: 1 x 3
#>   identifier              datestamp      setSpec                                
#>   <chr>                   <chr>          <chr>                                  
#> 1 d981c07d-bc43-40a2-be1… 2018-01-21T21… installation:804b8dd0-07ac-4a30-bf92-3…
#> 
#> $`d981c07d-bc43-40a2-be1f-e786e25106ac`$metadata
#> # A tibble: 1 x 12
#>   title  publisher  identifier  subject  source  description type  creator date 
#>   <chr>  <chr>      <chr>       <chr>    <chr>   <chr>       <chr> <chr>   <chr>
#> 1 Peces… Instituto… https://ww… peces, … http:/… Caracteriz… Data… Fernan… 2018…
#> # … with 3 more variables: language <chr>, coverage <chr>, format <chr>
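
The result above is a nested list: one entry per identifier, each with a header table and a metadata table. A minimal sketch of flattening such a result is given here, assuming that shape. The res object is a hand-made mock with made-up values (plain data.frames instead of tibbles) so that the example runs offline; with real data you would use the return value of get_records() instead.

```r
# Mock of the nested shape shown above; identifiers and values are made up.
res <- list(
  "rec-a" = list(
    header = data.frame(identifier = "rec-a", datestamp = "2018-06-29",
                        stringsAsFactors = FALSE),
    metadata = data.frame()  # empty, like the first record above
  ),
  "rec-b" = list(
    header = data.frame(identifier = "rec-b", datestamp = "2018-01-21",
                        stringsAsFactors = FALSE),
    metadata = data.frame(title = "Example title", stringsAsFactors = FALSE)
  )
)

# Stack all header tables into one data.frame, one row per record
headers <- do.call(rbind, lapply(res, `[[`, "header"))
rownames(headers) <- NULL

# Flag records whose metadata table came back empty
headers$has_metadata <- unname(
  vapply(res, function(x) nrow(x$metadata) > 0, logical(1))
)
print(headers)
```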

List MetadataFormats

list_metadataformats(id = "87832186-00ea-44dd-a6bf-c2896c4d09b4")
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`
#>   metadataPrefix                                                   schema
#> 1         oai_dc           http://www.openarchives.org/OAI/2.0/oai_dc.xsd
#> 2            eml http://rs.gbif.org/schema/eml-gbif-profile/1.0.2/eml.xsd
#>                             metadataNamespace
#> 1 http://www.openarchives.org/OAI/2.0/oai_dc/
#> 2          eml://ecoinformatics.org/eml-2.1.1

List Sets

list_sets("http://api.gbif.org/v1/oai-pmh/registry")
#> # A tibble: 597 x 2
#>    setSpec                     setName         
#>    <chr>                       <chr>           
#>  1 dataset_type                per dataset type
#>  2 dataset_type:OCCURRENCE     occurrence      
#>  3 dataset_type:CHECKLIST      checklist       
#>  4 dataset_type:METADATA       metadata        
#>  5 dataset_type:SAMPLING_EVENT sampling_event  
#>  6 country                     per country     
#>  7 country:AD                  Andorra         
#>  8 country:AM                  Armenia         
#>  9 country:AO                  Angola          
#> 10 country:AR                  Argentina       
#> # … with 587 more rows
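
In OAI-PMH, a setSpec is a colon-separated path through the set hierarchy, which is why values such as country:AR appear under the top-level country set. The sketch below splits a few setSpec values copied from the output above into a group and a member using base R. The column names group and member are chosen for this illustration only.

```r
# A few setSpec values copied from the output above
set_spec <- c("dataset_type", "dataset_type:OCCURRENCE", "dataset_type:CHECKLIST",
              "country", "country:AD", "country:AM")

# Split on ':'; the first piece is the top-level group, the second the member
parts <- strsplit(set_spec, ":", fixed = TRUE)
sets <- data.frame(
  setSpec = set_spec,
  group = vapply(parts, `[`, character(1), 1),
  member = vapply(parts,
                  function(p) if (length(p) > 1) p[2] else NA_character_,
                  character(1)),
  stringsAsFactors = FALSE
)

# Keep only the per-country sets (drop the top-level "country" entry itself)
countries <- sets[sets$group == "country" & !is.na(sets$member), ]
print(countries$member)
```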

Examples of other OAI providers

Biodiversity Heritage Library

Identify

id("http://www.biodiversitylibrary.org/oai")
#>                                 repositoryName
#> 1 Biodiversity Heritage Library OAI Repository
#>                                   baseURL protocolVersion
#> 1 https://www.biodiversitylibrary.org/oai             2.0
#>                    adminEmail earliestDatestamp deletedRecord granularity
#> 1 [email protected]        2006-01-01            no  YYYY-MM-DD
#>                                                        description
#> 1 oaibiodiversitylibrary.org:oai:biodiversitylibrary.org:item/1000

Get records

get_records(c("oai:biodiversitylibrary.org:item/7", "oai:biodiversitylibrary.org:item/9"),
            url = "http://www.biodiversitylibrary.org/oai")
#> $`oai:biodiversitylibrary.org:item/7`
#> $`oai:biodiversitylibrary.org:item/7`$header
#> # A tibble: 1 x 3
#>   identifier                         datestamp            setSpec
#>   <chr>                              <chr>                <chr>  
#> 1 oai:biodiversitylibrary.org:item/7 2016-01-26T06:05:19Z item   
#> 
#> $`oai:biodiversitylibrary.org:item/7`$metadata
#> # A tibble: 1 x 10
#>   title   creator  subject  description  publisher contributor type  identifier 
#>   <chr>   <chr>    <chr>    <chr>        <chr>     <chr>       <chr> <chr>      
#> 1 Die Mu… Fleisch… Bogor;I… pt.5:v.1 (1… Leiden :… Missouri B… text… https://ww…
#> # … with 2 more variables: language <chr>, rights <chr>
#> 
#> 
#> $`oai:biodiversitylibrary.org:item/9`
#> $`oai:biodiversitylibrary.org:item/9`$header
#> # A tibble: 1 x 3
#>   identifier                         datestamp            setSpec
#>   <chr>                              <chr>                <chr>  
#> 1 oai:biodiversitylibrary.org:item/9 2016-01-26T06:05:19Z item   
#> 
#> $`oai:biodiversitylibrary.org:item/9`$metadata
#> # A tibble: 1 x 10
#>   title   creator  subject  description  publisher contributor type  identifier 
#>   <chr>   <chr>    <chr>    <chr>        <chr>     <chr>       <chr> <chr>      
#> 1 Die Mu… Fleisch… Bogor;I… pt.5:v.3 (1… Leiden :… Missouri B… text… https://ww…
#> # … with 2 more variables: language <chr>, rights <chr>

Acknowledgements

Michal Bojanowski thanks the National Science Centre for support through grant 2012/07/D/HS6/01971.

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for oai in R by running citation(package = 'oai')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.